RE: MappedByteBuffer duplicates

Uwe Schindler Fri, 24 Feb 2017 15:22:19 -0800

Hi,

You did not give us all the information, so I can only give some hints, because 
there could be multiple causes for your problems. There is for sure no bug in 
Apache Lucene as there are thousands of Solr and Elasticsearch instances 
running without such problems.


> Actually, at a certain point, they have crashed the machine. The native
> file mappings are deallocated (unmapped) by the JVM when the
> MappedByteBuffers are eligible for garbage collection. The problem we're
> seeing is that there are thousands of MappedByteBuffers which are not
> eligible for garbage collection. The native memory is retained because the
> Lucene code is still referencing the MappedByteBuffer objects on the Java
> heap. This isn't the fault of Windows or the JVM. It appears to be a fault
> in Lucene, but we can't diagnose it - we can't see why the MappedByteBuffer
> objects are being retained.

For Apache Lucene this is not true:

Apache Lucene forcefully unmaps all mapped byte buffers when it closes the 
IndexInputs. Without that, we would have to wait for garbage collection to do 
it, which causes problems not only for virtual address space (your problem) 
but also for disk usage (files that have mapped contents cannot be deleted). 
So your statement is not true. Lucene does not need to wait for the garbage 
collector; it forces unmapping!
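
To illustrate why waiting for garbage collection is a problem, here is a 
minimal JDK-only sketch (no Lucene involved). The class and method names are 
made up for this example. It shows that a plain MappedByteBuffer stays valid 
after its FileChannel is closed, so without explicit unmapping the mapping 
lives until the buffer object is collected:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Lucene.
final class MappingLifetimeDemo {

    /** Maps a temp file, closes the channel, then reads through the mapping. */
    static String readAfterChannelClose(String content) {
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            file.toFile().deleteOnExit();
            byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
            Files.write(file, bytes);

            MappedByteBuffer buffer;
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, bytes.length);
            } // the channel is closed here, but the mapping is still alive

            // Reading still works: the mapping does not depend on the channel.
            byte[] out = new byte[bytes.length];
            buffer.get(out);
            return new String(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterChannelClose("still mapped"));
    }
}
```

The JDK offers no public method to release such a mapping early, which is why 
Lucene has to rely on JVM-internal APIs to unmap when an IndexInput is closed.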

Forceful unmapping requires Oracle JDK, OpenJDK, or IBM J9 (Java 8 or Java 9 
b150+; Java 7 for Lucene 5). On JVMs that do not expose the internal APIs 
needed for it, forceful unmapping does not work and MMapDirectory is not used 
by default. To check this, print the values of these two static fields:

http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_SUPPORTED
http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_NOT_SUPPORTED_REASON
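
A small sketch of such a check could look like the following. The class name 
UnmapSupportCheck is made up for this example. It reads the two fields via 
reflection only so that it also compiles and runs when Lucene is missing from 
the classpath; with lucene-core available at compile time you can simply print 
MMapDirectory.UNMAP_SUPPORTED and MMapDirectory.UNMAP_NOT_SUPPORTED_REASON 
directly.

```java
// Hypothetical helper class, not part of Lucene.
final class UnmapSupportCheck {

    static final String MMAP_DIRECTORY = "org.apache.lucene.store.MMapDirectory";

    /** Returns a one-line description of the two static fields linked above. */
    static String describe() {
        try {
            Class<?> mmap = Class.forName(MMAP_DIRECTORY);
            // Both are public static fields, so the instance argument is null.
            Object supported = mmap.getField("UNMAP_SUPPORTED").get(null);
            Object reason = mmap.getField("UNMAP_NOT_SUPPORTED_REASON").get(null);
            return "UNMAP_SUPPORTED=" + supported
                    + ", UNMAP_NOT_SUPPORTED_REASON=" + reason;
        } catch (ClassNotFoundException e) {
            return "Lucene core is not on the classpath";
        } catch (ReflectiveOperationException e) {
            // For example, an older Lucene version that lacks one of the fields.
            return "Could not read the fields: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the same JVM and the same classpath as your application, because 
the result depends on the JVM that loads MMapDirectory.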

If you use FSDirectory.open() to get a directory instance (factory method), it 
will not choose MMapDirectory if unmapping is not supported. So could it be 
that you explicitly instantiate MMapDirectory although unmapping does not work 
on your JVM?

Nevertheless, you say that you see many MappedByteBuffers that are not eligible 
for garbage collection. Of course Lucene will not unmap those, because they are 
still in use. The reason for this could be incorrect code on your side. If you 
do not close IndexWriters and DirectoryReaders when required, the index files 
stay open. If indexing goes on and you reopen the DirectoryReader (e.g. with 
the near-realtime functions of IndexWriter to see the current state), be sure 
to close the "old" reader. Otherwise it will open more and more files. 
Depending on the maximum open files limit, you can run out of file handles or 
(if you have many file handles) it may crash the machine, because you use up 
all virtual address space.
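
The "close the old reader" rule can be sketched without Lucene as follows. 
Everything here (ReaderSwapDemo, FakeReader, Holder) is a made-up stand-in. In 
real code the reopen step would be DirectoryReader.openIfChanged(oldReader), 
which returns null when nothing has changed, and that is the case the null 
check below models:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Hypothetical demo class, not part of Lucene.
final class ReaderSwapDemo {

    /** Stand-in for a reader that keeps files open until close() is called. */
    static final class FakeReader implements Closeable {
        static final AtomicInteger OPEN = new AtomicInteger();
        FakeReader() { OPEN.incrementAndGet(); }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    /** Holds the current reader and closes the previous one on every swap. */
    static final class Holder<R extends Closeable> implements Closeable {
        private R current;
        Holder(R initial) { current = initial; }

        synchronized R get() { return current; }

        /** reopen returns a newer reader, or null if nothing changed. */
        synchronized void refresh(UnaryOperator<R> reopen) {
            R newer = reopen.apply(current);
            if (newer == null || newer == current) {
                return; // nothing changed, keep using the current reader
            }
            R old = current;
            current = newer;
            closeUnchecked(old); // without this, old readers pile up
        }

        @Override public synchronized void close() { closeUnchecked(current); }

        private static void closeUnchecked(Closeable c) {
            try {
                c.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Reopens the reader several times and reports how many are still open. */
    static int openReadersAfter(int refreshes) {
        Holder<FakeReader> holder = new Holder<>(new FakeReader());
        for (int i = 0; i < refreshes; i++) {
            holder.refresh(old -> new FakeReader());
        }
        int open = FakeReader.OPEN.get();
        holder.close();
        return open;
    }

    public static void main(String[] args) {
        System.out.println("open readers: " + openReadersAfter(1000));
    }
}
```

If you remove the close of the old reader, the open count grows by one per 
refresh, which is the leak described above. Note that this sketch closes the 
old reader immediately; if other threads may still be searching on it, you 
need reference counting instead (IndexReader.incRef()/decRef()), or let 
Lucene's SearcherManager handle the reopen and release for you.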

To fully analyze your problem, we need more information. Please also provide:
- Lucene version
- Operating System version
- "ulimit -a" output (POSIX operating systems)
- Java version and vendor
- Crash report
- Source code to show what you are doing: just indexing (in which case your 
problem is impossible), indexing and searching in parallel, and whether you 
use NRT readers for realtime visibility of indexed content

Uwe

> From:   "Uwe Schindler" <[email protected]>
> To:     <[email protected]>
> Date:   02/24/2017 01:39 PM
> Subject:        RE: MappedByteBuffer duplicates
> 
> 
> 
> Hi,
> 
> that is not an issue, the duplicates are required for so-called IndexInput
> clones and slices. Every search request will create many of them. But
> there is no need to worry, they are just thin wrappers - they don't
> allocate any extra off-heap memory. They are just there to have a separate
> position(), limit() and other settings for each searcher thread.
> 
> Why do you worry?
> Uwe
> 
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: [email protected]
> 
> > -----Original Message-----
> > From: Kameron Cole [mailto:[email protected]]
> > Sent: Friday, February 24, 2017 7:19 PM
> > To: [email protected]
> > Subject: MappedByteBuffer duplicates
> >
> > We have a Lucene engine that creates MappedByteBuffer objects when
> > creating the Lucene index.  I don't know Lucene well enough to know if
> > this is standard behavior.
> >
> > The mapped files are being created by Lucene via the JRE's NIO APIs, with
> > a native file mapping underneath each MappedByteBuffer object. We see an
> > issue where duplicate MappedByteBuffer objects are being created.  Has
> > anyone seen this?
> >
> > Thank you!
> >
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 
> 
> 