Be very careful using vmtouch, especially if you call it with -dl (daemon mode with the pages locked in memory), as you could very easily and quickly kill a system. I've used this tool on cloud VMs to mitigate cycle times (think DBAN, due to the public nature of the hardware). It's a fast way to an irked OS thrashing around.
Dick

On Sun, 16 Dec 2018 19:57 Siddhesh Rane <kingsid...@gmail.com> wrote:

> I'll be happy to document this. I think the FAQ would be a good place.
>
> I actually looked further into this and found that the vmtouch
> functionality is provided in the JDK itself.
> The java.nio.MappedByteBuffer#load method will bring file pages into memory [1].
> The way it works is similar to vmtouch, i.e. reading a byte from each page
> to cause a page fault and load that page into memory [2].
>
> [1]
> https://docs.oracle.com/javase/8/docs/api/java/nio/MappedByteBuffer.html#load--
> [2]
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/nio/MappedByteBuffer.java#l156
>
>
> On Sun, 16 Dec 2018, 6:59 pm ajs6f <aj...@apache.org> wrote:
>
> > This seems to be a Linux-only technique that relies on installing and
> > maintaining vmtouch, correct?
> >
> > It doesn't seem that we could support that as a general solution, but
> > would you be interested in writing something up that gives the essentials
> > for someplace in the Jena docs? I'll admit I'm not sure where it would best
> > go, but it might be very helpful to users who can take advantage of it.
> >
> > ajs6f
> >
> > > On Dec 16, 2018, at 6:11 AM, Siddhesh Rane <kingsid...@gmail.com> wrote:
> > >
> > > The in-memory database has the following limitations:
> > >
> > > 1) Time to create the database. Not a problem if you have a dedicated
> > > machine which runs 24/7 where you load data once and the process never
> > > exits. But a huge waste of time if you get hardware during certain time
> > > slots and you have to load data from the start.
> > >
> > > 2) The in-memory database is all or nothing. If your dataset can't fit in
> > > RAM, you are out of luck. I had tried using this but many times it would
> > > go OOM. With vmtouch, you can load an index partially, up to as much free
> > > RAM as is available. Something is better than nothing.
> > >
> > > Vmtouch is not doing anything magical. TDB already uses mmap. When run on
> > > its own, Linux will bring most of the index into RAM. But think about the
> > > time it will take for that to happen. If one query takes 50 seconds (I've
> > > seen it go to 500-1000s as well), then in 1 hour you would have run just 72
> > > queries. If instead your speed was 1s/query you would have executed 3600
> > > queries and that would bring more of the index into RAM for future queries
> > > to run fast as well. So it's also the rate of speedup that matters.
> > > With vmtouch, you vmtouch at the beginning and it gives you a fast head
> > > start, and then it's your program maintaining the cache.
> > >
> > > Regards,
> > > Siddhesh
> > >
> > >
> > > On Sat, 15 Dec 2018, 9:15 pm ajs6f <aj...@apache.org> wrote:
> > >
> > >> What is the advantage to doing that as opposed to using Jena's built-in
> > >> in-memory dataset?
> > >>
> > >> ajs6f
> > >>
> > >>> On Dec 15, 2018, at 3:04 AM, Siddhesh Rane <kingsid...@gmail.com> wrote:
> > >>>
> > >>> Bring the entire database into RAM.
> > >>> Use "vmtouch -t <database location>" (without -t, vmtouch only reports
> > >>> how much of the files is already cached)
> > >>> Get vmtouch from https://hoytech.com/vmtouch/
> > >>>
> > >>> I had used Jena for 150M triples and my performance findings are
> > >>> documented at
> > >>> https://lists.apache.org/thread.html/254968eee3cd04370eafa2f9cc586e238f8a7034cf9ab4cbde3dc8e9@%3Cusers.jena.apache.org%3E
> > >>>
> > >>> Regards,
> > >>> Siddhesh
> > >>>
> > >>> On Fri, 7 Dec 2018, 8:23 pm y...@zju.edu.cn <y...@zju.edu.cn> wrote:
> > >>>
> > >>>> Dear jena,
> > >>>> I have built a graph with 1.4 billion triples and stored it as a dataset
> > >>>> in TDB through the Fuseki upload system.
> > >>>> Now, when I try to run some SPARQL searches, the speed is very slow.
> > >>>>
> > >>>> For example, when I make the sparql in Fuseki in the following, it takes
> > >>>> 50 seconds.
> > >>>> How can I improve the speed?
> > >>>> ------------------------------
> > >>>> Best wishes!
> > >>>>
> > >>>>
> > >>>> Hu Yunping (胡云苹)
> > >>>> College of Control Science and Engineering, Zhejiang University
> > >>>> CSC Institute, Yuquan Campus, Zhejiang University, 38 Zheda Road,
> > >>>> Hangzhou, Zhejiang Province
> > >>>> Institute of Cyber-Systems and Control, College of Control Science and
> > >>>> Engineering, Zhejiang University, Hangzhou 310027, P.R. China
> > >>>> Email : y...@zju.edu.cn <y...@iipc.zju.edu.cn>;hyphy...@163.com
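
For anyone who wants to try the java.nio.MappedByteBuffer#load route mentioned in the quoted thread instead of installing vmtouch, here is a minimal Java sketch. It is only an illustration under assumptions: the class name PageCacheWarmer, its method names and the byte budget parameter are made up for this example and are not part of Jena or TDB. The budget stands in for the "load partially, up to the free RAM" idea, and load() is best effort, so the OS may still evict the pages later.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not a Jena/TDB class: pre-faults database files into the page cache.
final class PageCacheWarmer {
    // A single mapping cannot exceed Integer.MAX_VALUE bytes, so map in 1 GiB windows.
    private static final long WINDOW = 1L << 30;

    /** Maps up to maxBytes of the file and asks the JDK to load it; returns the bytes requested. */
    static long warm(Path file, long maxBytes) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long limit = Math.min(ch.size(), maxBytes);
            long done = 0;
            while (done < limit) {
                long len = Math.min(WINDOW, limit - done);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, done, len);
                buf.load(); // best-effort request to bring this window into physical memory
                done += len;
            }
            return done;
        }
    }

    /** Warms the regular files directly inside dir until the byte budget is used up. */
    static long warmDirectory(Path dir, long budget) throws IOException {
        long total = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path f : files) {
                if (total >= budget) {
                    break;
                }
                if (Files.isRegularFile(f)) {
                    total += warm(f, budget - total);
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: PageCacheWarmer <database dir> [max bytes]");
            return;
        }
        // With no explicit budget, every file is requested in full.
        long budget = args.length > 1 ? Long.parseLong(args[1]) : Long.MAX_VALUE;
        long requested = warmDirectory(Paths.get(args[0]), budget);
        System.out.println("Requested " + requested + " bytes into memory");
    }
}
```

Passing a budget below the amount of free RAM is one way to stay clear of the thrashing described at the top of this mail. The sketch never locks pages, so it is closer to a plain "vmtouch -t" than to "vmtouch -dl".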