Hi,

In general, the chunk size should be as large as possible. It exists mainly for 32-bit environments, to work around the limited address space, where fragmentation causes problems sooner. On 64-bit operating systems, fragmentation of the address space can also become an issue, but only once the total size of all your indexes reaches several terabytes. Keep in mind that smaller chunk sizes cause more work for Lucene: a random read is more likely to land in a mapped region other than the current one, so Lucene has to switch buffers (which is done through ByteBuffer's exception handling). The 1 GiB maximum follows from the maximum size of a ByteBuffer in the JVM: ByteBuffers have only signed 32-bit offsets, so their capacity is at most 2 GiB - 1 byte; rounded down to the next power of 2, that is 1 GiB.
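To make the arithmetic above concrete, here is a minimal sketch in plain Java (JDK only). It is not Lucene's actual code: the class and method names (ChunkMath, chunkCount, chunkIndex, chunkOffset) are hypothetical, and treating a 65 GB index as one single file is an assumption made purely for illustration, since the chunking applies to each index file separately.

```java
// Hypothetical helper, not part of Lucene: illustrates the chunk arithmetic only.
final class ChunkMath {

    // Largest power of two that fits into a signed 32-bit int: 2^30 bytes = 1 GiB.
    static final int MAX_CHUNK_BYTES = Integer.highestOneBit(Integer.MAX_VALUE);

    // Number of mapped regions needed to cover fileBytes with chunks of 2^chunkPower bytes.
    static long chunkCount(long fileBytes, int chunkPower) {
        long chunkBytes = 1L << chunkPower;
        return (fileBytes + chunkBytes - 1) >>> chunkPower;
    }

    // Index of the chunk that contains the absolute file position pos.
    static int chunkIndex(long pos, int chunkPower) {
        return (int) (pos >>> chunkPower);
    }

    // Offset of the absolute file position pos inside its chunk.
    static int chunkOffset(long pos, int chunkPower) {
        return (int) (pos & ((1L << chunkPower) - 1));
    }

    public static void main(String[] args) {
        long fileBytes = 65L << 30; // assumed: one 65 GiB file, for illustration only

        System.out.println("max chunk bytes:   " + MAX_CHUNK_BYTES);
        System.out.println("chunks at 1 GiB:   " + chunkCount(fileBytes, 30));
        System.out.println("chunks at 256 MiB: " + chunkCount(fileBytes, 28));

        // A position just past the first 1 GiB boundary falls into the second chunk.
        long pos = (1L << 30) + 5;
        System.out.println("chunk index:  " + chunkIndex(pos, 30));
        System.out.println("chunk offset: " + chunkOffset(pos, 30));
    }
}
```

With a smaller chunk power the same file is split into more regions, so a random position is more likely to fall outside the region that is currently in use, which is the extra buffer switching described above.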
We don't know anything about how the kernel of your OS assigns address space, but why do you think a "higher" address space is better for a larger index?

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Gaurav gupta [mailto:gupta.gaurav0...@gmail.com]
> Sent: Saturday, September 27, 2014 5:20 AM
> To: java-user@lucene.apache.org
> Subject: Re: Optimum Lucene’s MMapDirectory size on 64bit OS
>
> Thanks Uwe for the insight !
>
> Also, is it advisable to set the lower chunk size for smaller indexes, like
> below, or let Lucene/OS manage by itself. I am just guessing that assigning lower
> value to smaller index will make sure that bigger index are getting higher
> mmap address space.
>
> Index Name      Total Records     Size (in GB)   What should be the max. or optimal chunk size ?
> Address index   106,192,963.00    65             1 GiB
> Name index       97,924,594.00    44             1 GiB
> GovtId index     81,178,958.00    11             512 MB
> Phone index     169,691,376.00    14             512 MB
> Email index      46,602,090.00    5              256 MB
> Date index       77,243,714.00    6.5            256 MB
>
> Thanks
>
> On Sat, Sep 27, 2014 at 3:40 AM, Uwe Schindler <u...@thetaphi.de> wrote:
>
> > Hi,
> >
> > 1 GiB is the maximum possible. The chunk size is only applicable for
> > 32 bit JDKs because of limited address space.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> > > -----Original Message-----
> > > From: Gaurav gupta [mailto:gupta.gaurav0...@gmail.com]
> > > Sent: Friday, September 26, 2014 9:12 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Optimum Lucene’s MMapDirectory size on 64bit OS
> > >
> > > Hi,
> > >
> > > As per the post "The Generics Policeman Blog
> > > <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html>",
> > > I am using the MMapDirectory for faster access (search and update
> > > operations, mainly search) of Lucene 4.8.1 index files. I am contemplating
> > > what is the optimal maximum MMap value for my indexes. Is default
> > > i.e. 1 GB (1 << 30) or higher?
> > >
> > > I have 6 indexes of size varying from 65GB to 6 GB. Currently, I am using 1 GB
> > > as maxChunkSize : - *MMapDirectory(file, null, 1<<30)* for all indexes.
> > > But thinking of specifying the higher value for mmap (1 GB or
> > > higher) for bigger index having 65GB size and lower value (0.5 GB or
> > > less) for smaller index having size of 6 GB. Any suggestion/guidance on it ?
> > >
> > > Also, per blog mmap is not a size of physical memory allocation but
> > > just a address space to map the index files. How to allocate more RAM to
> > > index files for better performance? We have enough RAM free out of
> > > 64 GB. Per blog, one should use the mmap file, like -
> > > *MMapDirectory(file, null, 1<<30)* and let OS manage the physical memory
> > > allocation for the index files. Is my understanding correct ?
> > >
> > > The Generics Policeman Blog
> > > <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html> :-
> > >
> > > - *MMapDirectory does not consume additional memory and the size of
> > > mapped index files is not limited by the physical memory
> > > available on your server.* By mmap() files, we only reserve address
> > > space not memory! Remember, address space on 64bit platforms is for free!
> > >
> > > Thanks
> > > Gaurav
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org