Yes, indexing only right now, although I can issue the odd search to
test that it's being built properly.

My test (indexing 40000+ messages in a user's mailbox) causes the
BatchUpdater thread to write everything to the index approximately
every 15-17 seconds.  The logs say:

[EMAIL PROTECTED] bin]# tail -f ../logs/scalix-sis-indexer.log | grep close
2006-03-16 15:50:30,591 DEBUG [BatchUpdater.processMods:122] Writer optimized and closed
2006-03-16 15:50:47,049 DEBUG [BatchUpdater.processMods:122] Writer optimized and closed

and the index grows, which tells me it's working OK.  My indexes are in
/tmp/lucene and my parent java (Tomcat) process is pid 639.  I get this:

[EMAIL PROTECTED] bin]# lsof -p 639 | grep \/tmp\/lucene | wc -l
51422

and growing.  When it's done I'll try another, smaller mailbox with the
default ulimit and your suggestions and see if that helps.

Cheers,

Nick.

Otis Gospodnetic wrote:
> This happens when you are doing indexing only!?  Wow, I've never seen that.
> Try posting your code in the form of a unit test.
>
> Otis
>
> ----- Original Message ----
> From: Nick Atkins <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Thursday, March 16, 2006 6:28:52 PM
> Subject: Re: Lucene and Tomcat, too many open files
>
> Hi Doug,
>
> I have experimented with a mergeFactor of 5 or 10 (default) but it
> didn't help matters once I reached the ulimit.  I understand how the
> mergeFactor affects Lucene's performance.
>
> I am actually not doing any searches with IndexReader right now, just
> indexing.  Yes, I do store and reuse the IndexReader per index, but my
> test right now is purely indexing.
>
> Setting my ulimit to 1M seems to do the trick for now, but I'm sure I
> could do better by tweaking Lucene's API.  I will continue to investigate.
>
> Thanks,
>
> Nick.
>
> Doug Cutting wrote:
>
>> Are you changing the default mergeFactor or other settings?  If so,
>> how?  Large mergeFactors are generally a bad idea: they don't make
>> things faster in the long run and they chew up file handles.
>>
>> Are all searches reusing a single IndexReader?  They should.  This is
>> the other most common reason folks run out of file handles: they open
>> too many IndexReaders.  The exception may be thrown when merging, but
>> the root cause might be something else.
>>
>> Doug
>>
>> Nick Atkins wrote:
>>
>>> Hi,
>>>
>>> What's the best way to manage the number of open files used by Lucene
>>> when it's running under Tomcat?  I have an indexing application running
>>> as a web app and I index a huge number of mail messages (upwards of
>>> 40000 in some cases).  Lucene's merging routine always craps out
>>> eventually with the "too many open files" regardless of how large I set
>>> ulimit to.  lsof tells me they are all "deleted" but they still seem to
>>> count as open files.  I don't want to set ulimit to some enormous value
>>> just to solve this (because it will never be large enough).  What's the
>>> best strategy here?
>>>
>>> I have tried setting various parameters on the IndexWriter such as the
>>> MergeFactor, MaxMergeDocs and MaxBufferedDocs but they seem to only
>>> affect the merge timing algorithm wrt memory usage.  The number of files
>>> used seems to be unaffected by anything I can set on the IndexWriter.
>>>
>>> Any hints much appreciated.
>>>
>>> Cheers,
>>>
>>> Nick.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
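
On the observation in the thread that lsof lists the files as "deleted" yet they still count as open: on Linux an unlinked file keeps its descriptor until the process closes it, so it still counts against the ulimit. The following is only a rough sketch of that behaviour, not anything taken from Lucene or the indexer discussed here. It assumes Linux (it reads /proc), and the names count_open_under and demo are made up for illustration. The count is roughly what "lsof -p <pid> | grep <directory> | wc -l" reports, minus non-descriptor entries such as memory maps.

```python
import os
import tempfile


def count_open_under(directory, pid="self"):
    """Count descriptors of a process whose target lies under `directory`.

    Linux-only assumption: each entry in /proc/<pid>/fd is a symlink to
    the file the descriptor refers to. Unlinked files show up with a
    " (deleted)" suffix but are still listed, so they are still counted.
    """
    fd_dir = f"/proc/{pid}/fd"
    prefix = os.path.realpath(directory) + os.sep
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith(prefix):
            count += 1
    return count


def demo():
    """Open five files, delete them, then close them, counting each time."""
    workdir = tempfile.mkdtemp()
    handles = [open(os.path.join(workdir, f"seg{i}"), "w") for i in range(5)]
    while_open = count_open_under(workdir)

    for handle in handles:
        os.unlink(handle.name)  # removes the name, not the descriptor
    after_delete = count_open_under(workdir)

    for handle in handles:
        handle.close()  # only now are the descriptors released
    after_close = count_open_under(workdir)

    os.rmdir(workdir)
    return while_open, after_delete, after_close


if __name__ == "__main__":
    print(demo())
```

Passing a numeric pid instead of "self" inspects another process (such as the Tomcat JVM), provided you have permission to read its /proc entry. A count that stays high after files are deleted suggests something is still holding them open rather than that the limit is simply too low.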
