Hi,

In general, the chunk size should be as large as possible. It exists only for
32-bit environments, to work around the limited address space, where
fragmentation causes problems much earlier. On 64-bit operating systems,
address-space fragmentation can also become an issue, but only if the total
size of all your indexes is on the order of several terabytes. Keep in mind
that smaller chunk sizes cause more work for Lucene: a random read is more
likely to land in a different mapped region than the current one, so Lucene
has to switch buffers (which is done through ByteBuffer's exception handling).
The maximum of 1 GiB follows from the maximum size of a ByteBuffer in the JVM:
ByteBuffers have only signed 32-bit offsets, so their maximum capacity is
2 GiB - 1 byte. Rounded down to the next power of 2, that gives 1 GiB.
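
To make the arithmetic above concrete, here is a minimal sketch in plain Java.
It is not Lucene's actual code: the class ChunkMath and its methods are
hypothetical names made up for illustration, and the 65 GB index from the
question is assumed to be exactly 65 GiB. It shows where the 1 GiB limit comes
from, how a file position splits into a chunk number plus an offset when the
chunk size is a power of 2, and how many mapped regions a given chunk size
produces.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class ChunkMath {

    // A ByteBuffer is addressed with a signed int, so its capacity is at most
    // Integer.MAX_VALUE bytes (2 GiB - 1 byte). The largest power of 2 that
    // still fits is the highest set bit of that value, i.e. 1 << 30.
    static final int MAX_CHUNK_SIZE = Integer.highestOneBit(Integer.MAX_VALUE);

    // log2 of the maximum chunk size.
    static final int MAX_CHUNK_POWER = Integer.numberOfTrailingZeros(MAX_CHUNK_SIZE);

    // Which chunk (mapped region) contains the given file position.
    static int chunkIndex(long pos, int chunkPower) {
        return (int) (pos >>> chunkPower);
    }

    // Offset of the given file position inside its chunk.
    static int chunkOffset(long pos, int chunkPower) {
        return (int) (pos & ((1L << chunkPower) - 1));
    }

    // Number of chunks needed to cover a file of the given length (rounded up).
    static long chunkCount(long fileLength, int chunkPower) {
        long chunkSize = 1L << chunkPower;
        return (fileLength + chunkSize - 1) >>> chunkPower;
    }

    public static void main(String[] args) {
        long indexSize = 65L << 30; // assumption: 65 GB taken as 65 GiB

        System.out.println("max chunk size (bytes): " + MAX_CHUNK_SIZE);
        System.out.println("regions at 1 GiB chunks:   " + chunkCount(indexSize, 30));
        System.out.println("regions at 256 MiB chunks: " + chunkCount(indexSize, 28));

        long pos = (3L << 30) + 5; // 5 bytes past the 3 GiB mark
        System.out.println("chunk " + chunkIndex(pos, 30) + ", offset " + chunkOffset(pos, 30));
    }
}
```

With a smaller chunk size the same file is split into more regions, so a random
read is more likely to cross into a different region than the previous one.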

We don't know anything about how the kernel of your OS assigns address space, 
but why do you think a "higher" address space is better for a larger index?

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Gaurav gupta [mailto:gupta.gaurav0...@gmail.com]
> Sent: Saturday, September 27, 2014 5:20 AM
> To: java-user@lucene.apache.org
> Subject: Re: Optimum Lucene’s MMapDirectory size on 64bit OS
> 
> Thanks Uwe for the insight !
> 
> Also, is it advisable to set a lower chunk size for smaller indexes, as
> shown below, or to let Lucene/the OS manage it? I am just guessing that
> assigning a lower value to the smaller indexes will make sure that the
> bigger indexes get higher mmap address space.
> 
> Index name      Total records    Size (GB)    Max. or optimal chunk size?
> Address index   106,192,963      65           1 GiB
> Name index      97,924,594       44           1 GiB
> GovtId index    81,178,958       11           512 MB
> Phone index     169,691,376      14           512 MB
> Email index     46,602,090       5            256 MB
> Date index      77,243,714       6.5          256 MB
> 
> Thanks
> 
> On Sat, Sep 27, 2014 at 3:40 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> 
> > Hi,
> >
> > 1 GiB is the maximum possible. The chunk size is only applicable for
> > 32 bit JDKs because of limited address space.
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: u...@thetaphi.de
> >
> >
> > > -----Original Message-----
> > > From: Gaurav gupta [mailto:gupta.gaurav0...@gmail.com]
> > > Sent: Friday, September 26, 2014 9:12 PM
> > > To: java-user@lucene.apache.org
> > > Subject: Optimum Lucene’s MMapDirectory size on 64bit OS
> > >
> > > Hi,
> > >
> > > As per the post "The Generics Policeman Blog
> > > <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html>",
> > > I am using MMapDirectory for faster access (search and update
> > > operations, mainly search) to Lucene 4.8.1 index files. I am wondering
> > > what the optimal maximum mmap chunk size is for my indexes. Is it the
> > > default, i.e. 1 GB (1 << 30), or higher?
> > >
> > > I have 6 indexes, varying in size from 6 GB to 65 GB. Currently, I am
> > > using 1 GB as maxChunkSize, i.e. MMapDirectory(file, null, 1<<30), for
> > > all indexes. But I am thinking of specifying a higher value (1 GB or
> > > higher) for the bigger index of 65 GB and a lower value (0.5 GB or
> > > less) for the smaller index of 6 GB. Any suggestions or guidance?
> > >
> > > Also, per the blog, mmap does not allocate physical memory; it is
> > > just address space used to map the index files. How can I allocate
> > > more RAM to the index files for better performance? We have enough
> > > free RAM out of 64 GB. Per the blog, one should use the mmap'ed files,
> > > e.g. MMapDirectory(file, null, 1<<30), and let the OS manage the
> > > physical memory allocation for the index files. Is my understanding
> > > correct?
> > >
> > > The Generics Policeman Blog
> > > <http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html>:
> > >
> > >    - *MMapDirectory does not consume additional memory and the size of
> > >    mapped index files is not limited by the physical memory available on
> > >    your server.* By mmap() files, we only reserve address space not memory!
> > >    Remember, address space on 64bit platforms is for free!
> > >
> > > Thanks
> > > Gaurav
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > For additional commands, e-mail: java-user-h...@lucene.apache.org
> >
> >

