Thanks, Uwe.

Yes, recommended: tmpfs/ramfs worked like a charm in our use case with a read-only index, giving us very high throughput and consistent query response times.
We had to build some redundancy around that service to make it highly available, so that we can do rolling updates of the read-only index and reduce the risk of downtime.

On Mon, Dec 14, 2020 at 1:51 PM Uwe Schindler <u...@thetaphi.de> wrote:

> Hi,
>
> as writer of the original blog post, here are my comments:
>
> Yes, MMapDirectory.setPreload() is the feature mentioned in my blog post
> to load everything into memory - but that does not guarantee anything!
> Still, I would not recommend using that function, because all it does is
> touch every page of the file, so the Linux kernel puts it into the OS cache
> - nothing more; IMHO very ineffective, as it slows down opening the index
> for a stupid for-each-page-touch loop. It will do this with EVERY page,
> whether it is later used or not! So this may take some time until it is
> done. Later on, Lucene still needs to open index files, initialize its own
> data structures,...
>
> In general it is much better to open the index with MMapDirectory and
> execute some "sample" queries. This will do exactly the same as the preload
> function, but it is more "selective". Parts of the index which are not used
> won't be touched, and on top, it will also load ALL the required index
> structures to heap.
>
> As always and as mentioned in my blog post: there's nothing that can ensure
> your index will stay in memory. Please trust the kernel to do the right
> thing. Why do you care at all?
>
> If you are curious and want to have everything in memory all the time:
> - use tmpfs as your filesystem (of course you will lose data when the OS
>   shuts down)
> - disable swap and/or disable swappiness
> - use only as much heap as needed, keep all the free memory for your
>   index outside heap.
>
> Fake feelings of "everything in RAM" are misconceptions like:
> - use RAMDirectory (deprecated): this may be a disaster, as described in
>   the blog post
> - use ByteBuffersDirectory: a little bit better, but this brings nothing, as
>   the operating system kernel may still page out your index pages. They still
>   live in/off heap and are part of usual paging. They are just no longer
>   backed by a file.
>
> Lucene does most of the stuff outside heap, live with it!
>
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: baris.ka...@oracle.com <baris.ka...@oracle.com>
> > Sent: Sunday, December 13, 2020 10:18 PM
> > To: java-user@lucene.apache.org
> > Cc: BARIS KAZAR <baris.ka...@oracle.com>
> > Subject: MMapDirectory vs In Memory Lucene Index (i.e., ByteBuffersDirectory)
> >
> > Hi,-
> >
> > it would be nice to create a Lucene index in files and then effectively
> > load it into memory once (since I use it in read-only mode). I am looking
> > into whether this is doable in Lucene.
> >
> > I wish there were an option to load the whole Lucene index into memory.
> >
> > Both of the URLs below link to the blog post from which I quote a very
> > nice section:
> >
> > https://lucene.apache.org/core/8_5_0/core/org/apache/lucene/store/MMapDirectory.html
> > https://lucene.apache.org/core/8_5_2/core/org/apache/lucene/store/MMapDirectory.html
> >
> > The following blog post mentions such an option to run in memory (see the
> > underlined sentence below):
> >
> > https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?m=1
> >
> > MMapDirectory will not load the whole index into physical memory. Why
> > should it do this? We just ask the operating system to map the file into
> > address space for easy access, by no means we are requesting more.
> > Java and the O/S optionally provide the option to try loading the whole
> > file into RAM (if enough is available), but Lucene does not use that
> > option (we may add this possibility in a later version).
> >
> > My question is: is there such an option? Is the method setPreload meant
> > for this purpose, i.e. to load the whole Lucene index into memory?
> >
> > I would like to use MMapDirectory and set my JVM heap to 16G or a bit
> > less (since my index is around this size).
> >
> > The Lucene 8.5.2 (and 8.5.0) javadocs say:
> >
> > public void setPreload(boolean preload)
> > Set to true to ask mapped pages to be loaded into physical memory on
> > init. The behavior is best-effort and operating system dependent.
> >
> > For example, Lucene 4.0.0 does not have the setPreload method:
> >
> > https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/store/MMapDirectory.html
> >
> > Happy Holidays
> > Best regards
> >
> > P.S. I know there is also the ByteBuffersDirectory class for in-memory
> > Lucene indexes, but this requires creating the Lucene index on the fly.
> > That is great only for Lucene indexes that can be created quickly on
> > the fly.
> >
> > Ekaterina has a nice article on the ByteBuffersDirectory class:
> >
> > https://medium.com/@ekaterinamihailova/in-memory-search-and-autocomplete-with-lucene-8-5-f2df1bc71c36
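
For reference, the page-touching preload discussed in this thread can be approximated with plain java.nio. The sketch below is not Lucene code: the class name PreloadSketch and the helper map() are made up for illustration, and it assumes the preload behaves like MappedByteBuffer.load(), which is only a best-effort request to the operating system. It maps a temporary file read-only, optionally asks the OS to page it in, and prints the residency hint.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Lucene.
public class PreloadSketch {

    /**
     * Maps a file read-only and, if preload is true, asks the OS to bring
     * its pages into physical memory. A single mapping is limited to
     * Integer.MAX_VALUE bytes, so this sketch only handles files below 2 GiB.
     */
    static MappedByteBuffer map(Path file, boolean preload) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            if (preload) {
                // Best effort: touches the pages, whether they are needed later or not.
                buf.load();
            }
            // The mapping stays valid after the channel is closed.
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("preload-sketch", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[1 << 20]); // 1 MiB of zeros as stand-in data

        MappedByteBuffer buf = map(tmp, true);
        System.out.println("mapped bytes: " + buf.capacity());
        // isLoaded() is only a hint; it may be false even right after load().
        System.out.println("isLoaded (hint only): " + buf.isLoaded());
    }
}
```

Even after load() returns, the kernel remains free to evict those pages again, which is why the advice above is to warm up with sample queries and to use tmpfs and swap settings if the index really must stay resident.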