Be very careful using vmtouch, especially if you call it with -dl (daemonize
and lock the pages in memory), as you could very easily and quickly kill a
system. I've used this tool on cloud VMs to mitigate cycle times (think DBAN,
due to the public nature of the hardware). It's a fast way to an irked OS
thrashing around.

Dick

On Sun, 16 Dec 2018 19:57 Siddhesh Rane <kingsid...@gmail.com wrote:

> I'll be happy to document this. I think the FAQ would be a good place.
>
> I actually looked further into this and found that the vmtouch
> functionality is provided in the JDK itself. The
> java.nio.MappedByteBuffer#load method brings a mapped file's pages into
> memory [1]. It works similarly to vmtouch, i.e. it reads a byte from each
> page to cause a page fault and load that page into memory [2].
>
> [1]
>
> https://docs.oracle.com/javase/8/docs/api/java/nio/MappedByteBuffer.html#load--
>
> [2]
>
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/nio/MappedByteBuffer.java#l156
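
The MappedByteBuffer#load approach described above could look roughly like
the sketch below. It is not Jena code: the class name PreloadFile, the
method preload and the 1 GiB chunk size are made up for illustration, and
the file to warm is whatever path you pass in. A single mapping cannot
exceed Integer.MAX_VALUE bytes, so larger files are mapped in chunks. Note
that load() is documented as best effort, so the pages are not guaranteed
to stay resident afterwards.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Jena or the JDK.
final class PreloadFile {

    // Illustrative chunk size; must stay below Integer.MAX_VALUE.
    static final long DEFAULT_CHUNK = 1L << 30;

    /**
     * Maps the file read-only, chunk by chunk, and asks the JDK to load
     * each chunk into physical memory. Returns the number of bytes mapped.
     */
    static long preload(Path file, long chunkBytes) throws IOException {
        if (chunkBytes <= 0 || chunkBytes > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("chunkBytes out of range");
        }
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long pos = 0;
            while (pos < size) {
                long len = Math.min(chunkBytes, size - pos);
                MappedByteBuffer buf =
                        ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                buf.load(); // best effort: touches the pages of this chunk
                pos += len;
            }
            return size;
        }
    }

    public static void main(String[] args) throws IOException {
        // The path is supplied by the caller; no file name is assumed here.
        Path file = Paths.get(args[0]);
        long bytes = preload(file, DEFAULT_CHUNK);
        System.out.println("Requested load of " + bytes + " bytes from " + file);
    }
}
```

The pages end up in the OS page cache, so the assumption here is that a
later mmap of the same file (by the same or another process) finds them
already in memory, as long as the OS has not evicted them in the meantime.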
>
>
> On Sun, 16 Dec 2018, 6:59 pm ajs6f <aj...@apache.org wrote:
>
> > This seems to be a Linux-only technique that relies on installing and
> > maintaining vmtouch, correct?
> >
> > It doesn't seem that we could support that as a general solution, but
> > would you be interested in writing up the essentials for someplace in
> > the Jena docs? I'll admit I'm not sure where it would best go, but it
> > might be very helpful to users who can take advantage of it.
> >
> > ajs6f
> >
> > > On Dec 16, 2018, at 6:11 AM, Siddhesh Rane <kingsid...@gmail.com> wrote:
> > >
> > > > An in-memory database has the following limitations:
> > > >
> > > > 1) Time to create the database. Not a problem if you have a dedicated
> > > > machine which runs 24/7, where you load the data once and the process
> > > > never exits. But it is a huge waste of time if you only get hardware
> > > > during certain time slots and have to load the data from scratch each
> > > > time.
> > > >
> > > > 2) An in-memory database is all or nothing. If your dataset can't fit
> > > > in RAM, you are out of luck. I had tried using this, but many times it
> > > > would go OOM. With vmtouch, you can load an index partially, up to
> > > > however much free RAM is available. Something is better than nothing.
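
The partial loading mentioned in point 2 can be approximated in plain Java
by touching one byte per page until a byte budget is used up. This is only
a sketch under stated assumptions: the class PartialPreload, the method
touch and the budget parameter are hypothetical, the page size is assumed
to be 4096 bytes (Java has no standard API to query it), and choosing a
budget that matches the free RAM is left to the caller.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Jena, the JDK or vmtouch.
final class PartialPreload {

    static final int PAGE_SIZE = 4096;     // assumption: 4 KiB pages
    static final long CHUNK = 1L << 30;    // map at most 1 GiB at a time

    // Written once per call so the reads below are not optimised away.
    static volatile byte sink;

    /**
     * Reads one byte from each page of the first budgetBytes bytes of the
     * file, which page-faults those pages into memory. Returns the number
     * of bytes covered (the smaller of the budget and the file size).
     */
    static long touch(Path file, long budgetBytes) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long limit = Math.min(ch.size(), Math.max(0L, budgetBytes));
            long pos = 0;
            byte acc = 0;
            while (pos < limit) {
                long len = Math.min(CHUNK, limit - pos);
                MappedByteBuffer buf =
                        ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                for (int i = 0; i < len; i += PAGE_SIZE) {
                    acc ^= buf.get(i); // one read per page triggers the fault
                }
                pos += len;
            }
            sink = acc;
            return limit;
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java PartialPreload <file> <budget in bytes>
        long covered = touch(Paths.get(args[0]), Long.parseLong(args[1]));
        System.out.println("Touched the first " + covered + " bytes");
    }
}
```

Touching only a prefix of the file is a simplification. Which part of an
index is worth warming first depends on the queries, and the sketch makes
no attempt to decide that.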
> > >
> > > > Vmtouch is not doing anything magical. TDB already uses mmap. When TDB
> > > > runs on its own, Linux will bring most of the index into RAM over
> > > > time. But think about the time it will take for that to happen. If one
> > > > query takes 50 seconds (I've seen it go to 500-1000s as well), then in
> > > > 1 hour you would have run just 72 queries. If instead your speed was
> > > > 1s/query, you would have executed 3600 queries, and that would bring
> > > > more of the index into RAM for future queries to run fast as well. So
> > > > it's also the rate of speedup that matters.
> > > > With vmtouch, you vmtouch at the beginning and it gives you a fast head
> > > > start, and then it's your program maintaining the cache.
> > >
> > > Regards,
> > > Siddhesh
> > >
> > >
> > > On Sat, 15 Dec 2018, 9:15 pm ajs6f <aj...@apache.org wrote:
> > >
> > >> What is the advantage to doing that as opposed to using Jena's
> > >> built-in in-memory dataset?
> > >>
> > >> ajs6f
> > >>
> > >>> On Dec 15, 2018, at 3:04 AM, Siddhesh Rane <kingsid...@gmail.com> wrote:
> > >>>
> > >>> Bring the entire database into RAM.
> > >>> Use "vmtouch -t <database location>" (-t touches the pages into
> > >>> memory; without it, vmtouch only reports what is already resident).
> > >>> Get vmtouch from https://hoytech.com/vmtouch/
> > >>>
> > >>> I used Jena for 150M triples, and my performance findings are
> > >>> documented at
> > >>> https://lists.apache.org/thread.html/254968eee3cd04370eafa2f9cc586e238f8a7034cf9ab4cbde3dc8e9@%3Cusers.jena.apache.org%3E
> > >>>
> > >>> Regards,
> > >>> Siddhesh
> > >>>
> > >>> On Fri, 7 Dec 2018, 8:23 pm y...@zju.edu.cn <y...@zju.edu.cn wrote:
> > >>>
> > >>>> Dear Jena,
> > >>>> I have built a graph with 1.4 billion triples and stored it as a
> > >>>> dataset in TDB through the Fuseki upload system.
> > >>>> Now, when I try to run SPARQL queries, they are very slow.
> > >>>>
> > >>>> For example, when I run the following SPARQL query in Fuseki, it
> > >>>> takes 50 seconds.
> > >>>> How can I improve the speed?
> > >>>> ------------------------------
> > >>>> Best wishes!
> > >>>>
> > >>>>
> > >>>> 胡云苹
> > >>>> 浙江大学控制科学与工程学院
> > >>>> 浙江省杭州市浙大路38号浙大玉泉校区CSC研究所
> > >>>> Institute of Cyber-Systems and Control, College of Control Science
> > >>>> and Engineering, Zhejiang University, Hangzhou 310027, P.R. China
> > >>>> Email : y...@zju.edu.cn <y...@iipc.zju.edu.cn>;hyphy...@163.com
> > >>>>
> > >>>>
> > >>
> > >>
> >
> >
>
