Uwe, It's clear to me now - I guess that puts garbage collection out of the picture.
What is more confusing, then, is this: if, as you say, Apache Lucene forcefully unmaps all mapped byte buffers when it closes the IndexInputs, it must mean that for some reason the IndexInputs are not getting closed. Is there a way to see that?

You very clearly outlined these possible causes, which will require checking our code:

1) If you do not close IndexWriter and DirectoryReaders when required, the index files stay open.
2) If indexing goes on and you reopen the DirectoryReader (e.g. with the near realtime functions of IndexWriter to see the actual state), be sure to close the "old" reader. Otherwise it will open more and more files.

In our case, indexing (actually, re-indexing) happens a lot! The people managing this installation need to keep the large index updated. Is there some kind of fundamental "race condition" that comes from indexing back-to-back? Clearly, fewer rebuilds in a day lessen the danger of a machine crash. We can be fairly certain of that. Still, I don't see why it should be necessary to worry about too many index builds. The OS should be able to handle this.

I keep coming back to this, though - can this have anything to do with Windows virtual memory management? I specialized in OS-level functions way back in college, and Windows has a completely different paradigm than Unix for memory management. Throughout a pretty long career in IT development, I have seen time and time again - including in this case - that when you reboot Windows, the memory problems are gone. I have almost never seen or heard of rebooting Linux or AIX for this reason.

That said, I guess that any discussion of ulimits is moot, right?

From: "Uwe Schindler" <u...@thetaphi.de>
To: <java-user@lucene.apache.org>
Date: 02/24/2017 06:22 PM
Subject: RE: MappedByteBuffer duplicates

Hi,

You did not give us all the information. So I can only give some hints, because there could be multiple causes for your problems.
There is for sure no bug in Apache Lucene, as there are thousands of Solr and Elasticsearch instances running without such problems.

> Actually, at a certain point, they have crashed the machine. The native
> file mappings are deallocated (unmapped) by the JVM when the
> MappedByteBuffers are eligible for garbage collection. The problem we're
> seeing is that there are thousands of MappedByteBuffers which are not
> eligible for garbage collection. The native memory is retained because the
> Lucene code is still referencing the MappedByteBuffer objects on the Java
> heap. This isn't the fault of Windows or the JVM. It appears to be a fault
> in Lucene, but we can't diagnose it - we can't see why the MappedByteBuffer
> objects are being retained.

For Apache Lucene this is not true: Apache Lucene forcefully unmaps all mapped byte buffers when it closes the IndexInputs. Without that, we would need to wait for garbage collection for this to happen, which not only causes problems for virtual address space (your problem), but also for disk usage (files that have mapped contents cannot be deleted). So your statement is not true. Lucene does not need to wait for the garbage collector, it forces unmapping!

If forceful unmapping does not work (it requires Oracle JDK, OpenJDK or IBM J9 - version [7 for Lucene 5], Java 8, Java 9 b150+), MMapDirectory is not used by default. This happens on JVMs which do not expose the internal APIs that are needed to do that. To check this, print the contents of:

http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_SUPPORTED
http://lucene.apache.org/core/6_4_1/core/org/apache/lucene/store/MMapDirectory.html#UNMAP_NOT_SUPPORTED_REASON

If you use FSDirectory.open() to get a directory instance (factory method), it will not choose MMapDirectory if unmapping is not supported. So could it be that you are forcing the use of MMapDirectory although unmapping does not work on your JVM?
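For readers who want to run the check described above, here is a minimal sketch. It is not code from this thread. The class name UnmapCheck and the method describe() are made up for illustration; the two field names are the ones linked above. Reflection is used only so the snippet also compiles and runs where Lucene is not on the classpath. With Lucene available, reading MMapDirectory.UNMAP_SUPPORTED and MMapDirectory.UNMAP_NOT_SUPPORTED_REASON directly would be the simpler choice.

```java
// Sketch only: reports whether Lucene's MMapDirectory can forcefully unmap
// on the JVM that runs this code. UnmapCheck and describe() are hypothetical
// names; the field names come from the MMapDirectory javadoc linked above.
final class UnmapCheck {

    static String describe() {
        try {
            // Looked up reflectively so this compiles without Lucene present.
            Class<?> mmap = Class.forName("org.apache.lucene.store.MMapDirectory");
            boolean supported = mmap.getField("UNMAP_SUPPORTED").getBoolean(null);
            Object reason = mmap.getField("UNMAP_NOT_SUPPORTED_REASON").get(null);
            return "UNMAP_SUPPORTED=" + supported
                    + ", UNMAP_NOT_SUPPORTED_REASON=" + reason;
        } catch (ClassNotFoundException e) {
            return "Lucene core is not on the classpath";
        } catch (ReflectiveOperationException e) {
            // For example, an older Lucene version without one of the fields.
            return "Could not read MMapDirectory fields: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the same JVM and the same Lucene jar as the application, since the answer depends on both.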
Nevertheless, you say that you see many MappedByteBuffers that are not eligible for garbage collection. Of course Lucene will not unmap those, because they are still in use. The reason for this could be incorrect code on your side. If you do not close IndexWriter and DirectoryReaders when required, the index files stay open. If indexing goes on and you reopen the DirectoryReader (e.g. with the near realtime functions of IndexWriter to see the actual state), be sure to close the "old" reader. Otherwise it will open more and more files. Depending on the maximum open files limit, you can run out of file handles, or (if you have many file handles) it may crash the machine, because you use up all virtual address space.

To fully analyze your problem, we need more information. Please also provide:

- Lucene version
- Operating system version
- "ulimit -a" output (POSIX operating systems)
- Java version and vendor
- Crash report
- Source code to show what you are doing: just indexing (in which case your problem is impossible), indexing and searching in parallel, do you use NRT readers for realtime visibility of indexed content?

Uwe

> From: "Uwe Schindler" <u...@thetaphi.de>
> To: <java-user@lucene.apache.org>
> Date: 02/24/2017 01:39 PM
> Subject: RE: MappedByteBuffer duplicates
>
> Hi,
>
> that is not an issue, the duplicates are required for so-called IndexInput
> clones and slices. Every search request will create many of them. But
> there is no need to worry, they are just thin wrappers - they don't
> allocate any extra off-heap memory. They are just there to have a separate
> position(), limit() and other settings for each searcher thread.
>
> Why do you worry?
> Uwe
>
> -----
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
> > -----Original Message-----
> > From: Kameron Cole [mailto:kameronc...@us.ibm.com]
> > Sent: Friday, February 24, 2017 7:19 PM
> > To: java-user@lucene.apache.org
> > Subject: MappedByteBuffer duplicates
> >
> > We have a Lucene engine that creates MappedByteBuffer objects when
> > creating the Lucene index. I don't know Lucene well enough to know if
> > this is standard behavior.
> >
> > The mapped files are being created by Lucene, via the JRE's NIO APIs,
> > with a native file mapping underneath each MappedByteBuffer object. We
> > see an issue where duplicate MappedByteBuffer objects are being created.
> > Has anyone seen this?
> >
> > Thank you!
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
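
As an illustration of the point made in this thread that duplicates are thin wrappers with their own position() and limit(), here is a small JDK-only sketch. It does not use Lucene and is not how Lucene's IndexInput is implemented. The class name DuplicateDemo, the method readThroughDuplicate() and the temp-file contents are invented for the example. It maps a four-byte file once, calls duplicate() on the buffer, and reads through both views. duplicate() shares the content of the existing mapping, so no second mapping is created.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: DuplicateDemo and its method are hypothetical names.
final class DuplicateDemo {

    // Returns {original position, duplicate position,
    //          byte read via original, byte read via duplicate}.
    static int[] readThroughDuplicate() {
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            // Best-effort cleanup: a file that is still mapped may not be
            // deletable on Windows, which is the disk-usage issue from the thread.
            file.toFile().deleteOnExit();
            Files.write(file, new byte[] {10, 20, 30, 40});

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // One mapping of the whole file.
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());

                // A second view over the same mapped bytes, with its own position.
                ByteBuffer duplicate = mapped.duplicate();
                duplicate.position(2);

                byte viaDuplicate = duplicate.get(); // reads index 2
                byte viaOriginal = mapped.get();     // reads index 0, unaffected

                return new int[] {
                    mapped.position(), duplicate.position(), viaOriginal, viaDuplicate
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = readThroughDuplicate();
        System.out.println("original position=" + r[0] + ", duplicate position=" + r[1]);
        System.out.println("original read " + r[2] + ", duplicate read " + r[3]);
    }
}
```

The two views end at positions 1 and 3 and read the bytes 10 and 30, which shows that moving one view does not move the other. A heap dump would list both as separate buffer objects even though only one region of the file is mapped.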