So with MMapDir at defaults (unmap is enabled) you see old files, with no open file handles as reported by lsof, still existing in your index directory, taking lots of space.
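The comparison described above (files on disk versus files lsof reports as open) can be scripted rather than eyeballed. Below is a minimal sketch in plain Java, assuming you have already collected the two file-name lists yourself. The class OrphanedSegments and its methods are hypothetical helpers, not part of Lucene, and the sample names are taken from the listings quoted further down. It groups files by segment prefix (for example, _4gs5_Lucene50_0.dvd belongs to segment _4gs5) and reports segments that have files on disk but none open in the process.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not a Lucene API.
final class OrphanedSegments {

    // Returns the segment prefix of an index file name, e.g.
    // "_4gs5_Lucene50_0.dvd" -> "_4gs5" and "_3ou7.cfs" -> "_3ou7".
    // Returns null for files that are not per-segment files (write.lock, segments_N).
    public static String segmentOf(String fileName) {
        if (!fileName.startsWith("_")) {
            return null;
        }
        int end = fileName.length();
        int underscore = fileName.indexOf('_', 1);
        int dot = fileName.indexOf('.', 1);
        if (underscore >= 0) {
            end = Math.min(end, underscore);
        }
        if (dot >= 0) {
            end = Math.min(end, dot);
        }
        return fileName.substring(0, end);
    }

    // Segments that have at least one file on disk but no file in the "open" list.
    public static Set<String> unreferencedSegments(Collection<String> onDisk,
                                                   Collection<String> open) {
        Set<String> live = new TreeSet<>();
        for (String name : open) {
            String segment = segmentOf(name);
            if (segment != null) {
                live.add(segment);
            }
        }
        Set<String> orphaned = new TreeSet<>();
        for (String name : onDisk) {
            String segment = segmentOf(name);
            if (segment != null && !live.contains(segment)) {
                orphaned.add(segment);
            }
        }
        return orphaned;
    }

    public static void main(String[] args) {
        // Sample names copied from the ls and lsof output in this thread.
        Collection<String> onDisk = Arrays.asList(
                "_4gs5.fdt", "_4gs5_Lucene50_0.dvd",
                "_3ou7.cfs", "_3k3a_Lucene50_0.dvd", "write.lock");
        Collection<String> open = Arrays.asList(
                "_4gs5.fdt", "_4gs5_Lucene50_0.dvd", "write.lock");
        System.out.println(unreferencedSegments(onDisk, open));
    }
}
```

A segment showing up here is only a hint: files from a very recent flush or merge may legitimately be on disk without being open yet, so treat the result as a starting point rather than proof of a leak.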
But with NIOFSDirectory the issue doesn't happen? Are you sure?

I'll look at the 6.6 GB infoStream to see what it says about the ref counts.

Did you fix the issue in your app where you're not closing all opened NRT readers?

Mike McCandless

http://blog.mikemccandless.com


On Fri, Nov 13, 2015 at 12:22 PM, Rob Audenaerde <rob.audenae...@gmail.com> wrote:
> I haven't disabled unmapping, and I am running out-of-the-box
> FSDirectory.open(). As far as I can see it tries to pick MMap. For the test I
> explicitly constructed a NIOFSDirectory.
>
> OS is (off the top of my head) CentOS 6.x, Java 1.8.0u33. I can check
> later for more details.
>
> On Nov 13, 2015 18:07, "Uwe Schindler" <u...@thetaphi.de> wrote:
>
>> Hi,
>>
>> Lucene has the workaround, so it should not happen, UNLESS you explicitly
>> disable the hack using MMapDirectory#setEnableUnmap(false).
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> > -----Original Message-----
>> > From: will martin [mailto:wmartin...@gmail.com]
>> > Sent: Friday, November 13, 2015 6:04 PM
>> > To: java-user@lucene.apache.org
>> > Subject: Re: debugging growing index size
>> >
>> > Hi Rob:
>> >
>> > Doesn’t this look like known SE issue JDK-4724038 and discussed by Peter
>> > Levart and Uwe Schindler on a lucene-dev thread 9/9/2015?
>> >
>> > MappedByteBuffer …. what OS are you on Rob? What JVM?
>> >
>> > http://bugs.java.com/view_bug.do?bug_id=4724038
>> >
>> > http://mail-archives.apache.org/mod_mbox/lucene-dev/201509.mbox/%3c55f0461a.2070...@gmail.com%3E
>> >
>> > hth
>> > -will
>> >
>> > > On Nov 13, 2015, at 11:23 AM, Rob Audenaerde <rob.audenae...@gmail.com> wrote:
>> > >
>> > > I'm currently running using NIOFS. It seems to prevent the issue from
>> > > appearing.
>> > >
>> > > This is a second run (with applied deletes etc)
>> > >
>> > > raudenaerd@:/<6>index/index$sudo ls -lSra *.dvd
>> > > -rw-r--r--. 1 apache apache      7993 Nov 13 16:09 _y_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache  39048886 Nov 13 17:12 _xod_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache  53699972 Nov 13 17:17 _110e_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 _12r5_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 _y0s_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 _z20_Lucene50_0.dvd
>> > >
>> > > raudenaerde:/<6>index/index$sudo ls -lSaa *.dvd
>> > > -rw-r--r--. 1 apache apache 222062059 Nov 13 17:17 _z20_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache 151149886 Nov 13 17:13 _y0s_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache 112855516 Nov 13 17:19 _12r5_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache  53699972 Nov 13 17:17 _110e_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache  39048886 Nov 13 17:12 _xod_Lucene50_0.dvd
>> > > -rw-r--r--. 1 apache apache      7993 Nov 13 16:09 _y_Lucene50_0.dvd
>> > >
>> > > On Thu, Nov 12, 2015 at 3:40 PM, Michael McCandless <
>> > > luc...@mikemccandless.com> wrote:
>> > >
>> > >> Hi Rob,
>> > >>
>> > >> A couple more things:
>> > >>
>> > >> Can you print the value of MMapDirectory.UNMAP_SUPPORTED?
>> > >>
>> > >> Also, can you try your test using NIOFSDirectory instead? Curious if
>> > >> that changes things...
>> > >>
>> > >> Mike McCandless
>> > >>
>> > >> http://blog.mikemccandless.com
>> > >>
>> > >>
>> > >> On Thu, Nov 12, 2015 at 7:28 AM, Rob Audenaerde
>> > >> <rob.audenae...@gmail.com> wrote:
>> > >>> Curious indeed!
>> > >>>
>> > >>> I will turn on the IndexFileDeleter.VERBOSE_REF_COUNTS and recreate the
>> > >>> logs. Will get back with them in a day hopefully.
>> > >>>
>> > >>> Thanks for the extra logging!
>> > >>>
>> > >>> -Rob
>> > >>>
>> > >>> On Thu, Nov 12, 2015 at 11:34 AM, Michael McCandless <
>> > >>> luc...@mikemccandless.com> wrote:
>> > >>>
>> > >>>> Hmm, curious.
>> > >>>>
>> > >>>> I looked at the [large] infoStream output and I see segment _3ou7
>> > >>>> present on init of IW, a few getReader calls referencing it, then a
>> > >>>> forceMerge that indeed merges it away, yet I do NOT see IW attempting
>> > >>>> deletion of its files.
>> > >>>>
>> > >>>> And indeed I see plenty (too many: many times per second?) of commits
>> > >>>> after that, so the index itself is no longer referencing _3ou7.
>> > >>>>
>> > >>>> If you are failing to close all NRT readers then I would expect _3ou7
>> > >>>> to be in the lsof output, but it's not.
>> > >>>>
>> > >>>> The NRT reader's close method has logic that notifies IndexWriter when
>> > >>>> it's done "needing" the files, to emulate "delete on last close"
>> > >>>> semantics for filesystems like HDFS that don't do that ... it's
>> > >>>> possible something is wrong here.
>> > >>>>
>> > >>>> Can you set the (public, static) boolean
>> > >>>> IndexFileDeleter.VERBOSE_REF_COUNTS to true, and then re-generate this
>> > >>>> log? This causes IW to log the ref count of each file it's tracking
>> > >>>> ...
>> > >>>>
>> > >>>> I'll also add a bit more verbosity to IW when NRT readers are opened
>> > >>>> and closed, for 5.4.0.
>> > >>>>
>> > >>>> Mike McCandless
>> > >>>>
>> > >>>> http://blog.mikemccandless.com
>> > >>>>
>> > >>>>
>> > >>>> On Wed, Nov 11, 2015 at 6:09 AM, Rob Audenaerde
>> > >>>> <rob.audenae...@gmail.com> wrote:
>> > >>>>> Hi all,
>> > >>>>>
>> > >>>>> I'm still debugging the growing-index size. I think closing index readers
>> > >>>>> might help (work in progress), but I can't really see them holding on to
>> > >>>>> files (at least, using lsof). Restarting the application sheds some light,
>> > >>>>> I see logging on files that are no longer referenced.
>> > >>>>>
>> > >>>>> What I see is that there are files in the index-directory, that seem to
>> > >>>>> be no longer referenced..
>> > >>>>>
>> > >>>>> I put the output of the infoStream online, because it is rather big (30MB
>> > >>>>> gzipped): http://www.audenaerde.org/lucene/merges.log.gz
>> > >>>>>
>> > >>>>> Output of lsof: (executed 'sudo lsof *' in the index directory). This is
>> > >>>>> on a CentOS box (maybe that influences stuff as well?)
>> > >>>>>
>> > >>>>> COMMAND   PID   USER   FD TYPE DEVICE   SIZE/OFF     NODE NAME
>> > >>>>> java    30581 apache  mem  REG  253,0 3176094924 18880508 _4gs5_Lucene50_0.dvd
>> > >>>>> java    30581 apache  mem  REG  253,0  505758610 18880546 _4gs5.fdt
>> > >>>>> java    30581 apache  mem  REG  253,0  369563337 18880631 _4gs5_Lucene50_0.tim
>> > >>>>> java    30581 apache  mem  REG  253,0  176344058 18880623 _4gs5_Lucene50_0.pos
>> > >>>>> java    30581 apache  mem  REG  253,0  378055201 18880606 _4gs5_Lucene50_0.doc
>> > >>>>> java    30581 apache  mem  REG  253,0  372579599 18880400 _4i5a_Lucene50_0.dvd
>> > >>>>> java    30581 apache  mem  REG  253,0   82017447 18880748 _4g37.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   85376507 18880721 _4fb3.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0  363493917 18880533 _4ct1_Lucene50_0.dvd
>> > >>>>> java    30581 apache  mem  REG  253,0    9421892 18880806 _4gjc.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   76877461 18880553 _4ct1.fdt
>> > >>>>> java    30581 apache  mem  REG  253,0   46271330 18880661 _4ct1_Lucene50_0.tim
>> > >>>>> java    30581 apache  mem  REG  253,0   26911387 18880653 _4ct1_Lucene50_0.pos
>> > >>>>> java    30581 apache  mem  REG  253,0   54678249 18880568 _4ct1_Lucene50_0.doc
>> > >>>>> java    30581 apache  mem  REG  253,0   76556587 18880328 _4i5a.fdt
>> > >>>>> java    30581 apache  mem  REG  253,0   45032159 18880389 _4i5a_Lucene50_0.tim
>> > >>>>> java    30581 apache  mem  REG  253,0   26486772 18880388 _4i5a_Lucene50_0.pos
>> > >>>>> java    30581 apache  mem  REG  253,0   55411002 18880362 _4i5a_Lucene50_0.doc
>> > >>>>> java    30581 apache  mem  REG  253,0   70484185 18880340 _4hkn.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   10873921 18880324 _4gpz.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   17230506 18880524 _4i11.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    6706969 18880575 _4i0t.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   15135578 18880624 _4i0i.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0   15368310 18880717 _4hzp.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    5146140 18880583 _4hze.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    2917380 18880411 _4gs5.nvd
>> > >>>>> java    30581 apache  mem  REG  253,0    6871469 18880732 _4hod.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    2860341 18880495 _4i84.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     835726 18880660 _4i7z.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    1005595 18880648 _4i7w.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    5639672 18880401 _4i4o.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    4388371 18880440 _4i4a.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    1151845 18880512 _4i7v.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     941773 18880613 _4i7x.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     984023 18880588 _4i7o.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0    1790005 18880619 _4i7y.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     466371 18880515 _4ct1.nvd
>> > >>>>> java    30581 apache  mem  REG  253,0     723280 18880573 _4i7q.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     806289 18880517 _4i7h.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0      17362 18880520 _4i9s.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     698362 18880531 _4i9r.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0     483215 18880406 _4i5a.nvd
>> > >>>>> java    30581 apache  mem  REG  253,0      14110 18880416 _4i9v.cfs
>> > >>>>> java    30581 apache  mem  REG  253,0       6121 18880412 _4i9t.cfs
>> > >>>>> java    30581 apache 30wW  REG  253,0          0 18877901 write.lock
>> > >>>>>
>> > >>>>> Output of some of the biggest files in the index directory:
>> > >>>>>
>> > >>>>> -rw-r--r--. 1 apache apache  358684577 Nov 11 08:04 _4fjn.cfs
>> > >>>>> -rw-r--r--. 1 apache apache  363493917 Nov 11 07:54 _4ct1_Lucene50_0.dvd
>> > >>>>> -rw-r--r--. 1 apache apache  369563337 Nov 11 08:06 _4gs5_Lucene50_0.tim
>> > >>>>> -rw-r--r--. 1 apache apache  372579599 Nov 11 08:09 _4i5a_Lucene50_0.dvd
>> > >>>>> -rw-r--r--. 1 apache apache  378055201 Nov 11 08:06 _4gs5_Lucene50_0.doc
>> > >>>>> -rw-r--r--. 1 apache apache  427401813 Nov 10 08:14 _3ou7.cfs
>> > >>>>> -rw-r--r--. 1 apache apache  505758610 Nov 11 08:04 _4gs5.fdt
>> > >>>>> -rw-r--r--. 1 apache apache 1107391579 Nov 10 07:55 _3k3a_Lucene50_0.dvd
>> > >>>>> -rw-r--r--. 1 apache apache 3176094924 Nov 11 08:10 _4gs5_Lucene50_0.dvd
>> > >>>>>
>> > >>>>> Note that the 3ou7 and 3k3a segments no longer appear to be in use?
>> > >>>>
>> > >>>>
>> > >>>> ---------------------------------------------------------------------
>> > >>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > >>>> For additional commands, e-mail: java-user-h...@lucene.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org