Thanks Hannes, on my Fedora machine the maximum I can do is ulimit -n 1048576, which is 1M file descriptors. This should be enough for most sane cases, but it makes me uneasy. I assume the "deleted" file entries reported by lsof will be cleaned up eventually?
I can't believe this is really the only option and that there is no way
within Lucene to control the number of files opened. Hmm...

Thanks,

Nick.

Hannes Carl Meyer wrote:
> Hi Nick,
>
> use 'ulimit' on your *nix system to check whether it is set to unlimited.
>
> check:
> http://wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/2/ulimit
>
> You don't have to set it to unlimited; maybe increasing the number
> will help.
>
> later
>
> Hannes
>
> Nick Atkins schrieb:
>> Thanks Otis, I tried that but I still get the same problem at the
>> ulimit -n point. I assume you meant I should call
>> IndexWriter.setUseCompoundFile(true). According to the docs the
>> compound structure is the default anyway.
>>
>> Any further thoughts? Anything I can tweak in the OS (Linux), Java
>> (1.5.0) or Lucene (1.9.1)?
>>
>> Many thanks,
>>
>> Nick
>>
>> Otis Gospodnetic wrote:
>>
>>> The easiest first step to try is to go from the multi-file index
>>> structure to the compound one.
>>>
>>> Otis
>>>
>>> ----- Original Message ----
>>> From: Nick Atkins <[EMAIL PROTECTED]>
>>> To: [email protected]
>>> Sent: Thursday, March 16, 2006 3:00:59 PM
>>> Subject: Lucene and Tomcat, too many open files
>>>
>>> Hi,
>>>
>>> What's the best way to manage the number of open files used by Lucene
>>> when it's running under Tomcat? I have an indexing application running
>>> as a web app and I index a huge number of mail messages (upwards of
>>> 40000 in some cases). Lucene's merging routine always craps out
>>> eventually with "too many open files" regardless of how large I set
>>> ulimit. lsof tells me they are all "deleted" but they still seem to
>>> count as open files. I don't want to set ulimit to some enormous value
>>> just to solve this (because it will never be large enough). What's the
>>> best strategy here?
>>>
>>> I have tried setting various parameters on the IndexWriter, such as
>>> MergeFactor, MaxMergeDocs and MaxBufferedDocs, but they seem to only
>>> affect the merge timing algorithm wrt memory usage. The number of
>>> files used seems to be unaffected by anything I can set on the
>>> IndexWriter.
>>>
>>> Any hints much appreciated.
>>>
>>> Cheers,
>>>
>>> Nick.

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
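[Editor's note] On the "(deleted)" entries question above: on Linux, deleting a file only removes its directory entry; the inode, and any descriptor still open on it, survive until every holder calls close(). That is why lsof lists Lucene's merged-away segment files as "(deleted)" and why they still count against ulimit -n. A minimal stdlib-only sketch of the effect (the file name and contents are illustrative, not anything Lucene produces):

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class DeletedButOpen {
    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segment", ".demo");
        FileOutputStream out = new FileOutputStream(f);
        out.write("segment data".getBytes("UTF-8"));
        out.close();

        // Open a descriptor, then unlink the file while it is still open.
        FileInputStream in = new FileInputStream(f);
        boolean unlinked = f.delete();

        // The directory entry is gone, but the inode and the descriptor
        // survive; lsof would show this descriptor as "(deleted)", and it
        // still counts against the process's ulimit -n.
        byte[] buf = new byte[64];
        int n = in.read(buf);
        System.out.println("unlinked=" + unlinked
                + " stillReadable=" + new String(buf, 0, n, "UTF-8"));

        in.close(); // the descriptor (and disk space) is released only here
    }
}
```

So the entries are cleaned up eventually, but only when whatever holds them open (typically a long-lived IndexReader or IndexSearcher that still references the old segments) is closed.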
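[Editor's note] The settings discussed in the thread can be combined; a sketch against the Lucene 1.9.x API as documented (setUseCompoundFile, setMergeFactor and setMaxBufferedDocs are IndexWriter instance setters in 1.9). It needs the lucene-core jar on the classpath, and the index path and tuning values are illustrative assumptions, not recommendations from the thread:

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TunedIndexing {
    public static void main(String[] args) throws IOException {
        // true = create a new index at this (illustrative) path
        IndexWriter writer = new IndexWriter("/tmp/mail-index",
                new StandardAnalyzer(), true);

        // Compound format packs each segment's many files into a single
        // .cfs file, cutting the per-segment descriptor count. It is the
        // default in 1.9, but setting it explicitly does no harm.
        writer.setUseCompoundFile(true);

        // A lower mergeFactor keeps fewer segments on disk at once, so a
        // merge touches fewer files simultaneously (at the cost of more
        // frequent merging). The 1.9 default is 10.
        writer.setMergeFactor(5);
        writer.setMaxBufferedDocs(100); // docs buffered in RAM per segment

        // ... addDocument() calls for the mail messages go here ...

        writer.optimize(); // collapse the index down to a single segment
        writer.close();    // release the writer's own descriptors
    }
}
```

In a Tomcat webapp the bigger lever is often not the writer at all: a searcher kept open across merges pins every old segment it was opened against, which is exactly the "(deleted)" buildup lsof reports. Closing and reopening the IndexSearcher after indexing batches lets those descriptors go.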
