What version of Lucene are you using? Are you removing the index
completely and rebuilding it from scratch with the compound flag
enabled (by default since 1.4)? You really shouldn't have massive
numbers of files created when using the compound format, so I suspect
something is fishy with how you're indexing.
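Something along these lines (just a sketch; the class name, index path and
analyzer are placeholders) rebuilds the index from scratch with compound
files on:

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class RebuildIndex {
        public static void main(String[] args) throws IOException {
            // create=true wipes any existing index and starts a fresh one
            IndexWriter writer = new IndexWriter("/path/to/index",
                    new StandardAnalyzer(), true);
            writer.setUseCompoundFile(true); // one .cfs per segment instead of several files
            // ... addDocument() calls ...
            writer.optimize(); // merge everything down to a single segment
            writer.close();
        }
    }

Done that way you should end up with only a handful of files per index.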
Erik
On Mar 16, 2006, at 5:46 PM, Nick Atkins wrote:
Thanks Hannes, on my Fedora machine the maximum I can do is ulimit -n
1048576, which is 1M files. This should be enough for most sane cases
but it makes me uneasy. I assume the "deleted" file entries reported
by lsof will be cleared up eventually?
I can't believe this is really the only option and there is no way
within Lucene to control the number of files opened. Hmm...
Thanks,
Nick.
Hannes Carl Meyer wrote:
Hi Nick,
use 'ulimit' on your *nix system to check whether it's set to unlimited.
check:
http://wwwcgi.rdg.ac.uk:8081/cgi-bin/cgiwrap/wsi14/poplog/man/2/ulimit
You don't have to set it to unlimited; maybe just increasing the number
will help.
later
Hannes
Nick Atkins schrieb:
Thanks Otis, I tried that but I still get the same problem at the
ulimit -n point. I assume you meant I should call
IndexWriter.setUseCompoundFile(true). According to the docs the
compound structure is the default anyway.
Any further thoughts? Anything I can tweak in the OS (Linux), Java
(1.5.0) or Lucene (1.9.1)?
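For what it's worth, a quick check like the following (just a sketch,
assuming the IndexWriter is already open as 'writer') should confirm
what the writer is actually set to:

    // sanity-check that the compound format is really in effect
    System.out.println("compound: " + writer.getUseCompoundFile());
    System.out.println("mergeFactor: " + writer.getMergeFactor());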
Many thanks,
Nick
Otis Gospodnetic wrote:
The easiest first step to try is to go from the multi-file index
structure to the compound one.
Otis
----- Original Message ----
From: Nick Atkins <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, March 16, 2006 3:00:59 PM
Subject: Lucene and Tomcat, too many open files
Hi,
What's the best way to manage the number of open files used by Lucene
when it's running under Tomcat? I have an indexing application running
as a web app and I index a huge number of mail messages (upwards of
40000 in some cases). Lucene's merging routine always craps out
eventually with a "too many open files" error regardless of how large
I set ulimit to. lsof tells me they are all "deleted" but they still
seem to count as open files. I don't want to set ulimit to some
enormous value just to solve this (because it will never be large
enough). What's the best strategy here?
I have tried setting various parameters on the IndexWriter such as
MergeFactor, MaxMergeDocs and MaxBufferedDocs, but they only seem to
affect the merge timing and memory usage. The number of files used
seems to be unaffected by anything I can set on the IndexWriter.
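Roughly what I've been trying looks like this (the class name, path and
parameter values below are just placeholders, not a recommendation):

    import java.io.IOException;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class MailIndexer {
        public static void main(String[] args) throws IOException {
            // open (or append to) the existing index
            IndexWriter writer = new IndexWriter("/path/to/index",
                    new StandardAnalyzer(), false);
            writer.setMergeFactor(10);       // segments that accumulate before a merge kicks in
            writer.setMaxMergeDocs(100000);  // cap on docs per merged segment
            writer.setMaxBufferedDocs(1000); // docs buffered in RAM before a new segment is flushed
            // ... addDocument() calls, one per mail message ...
            writer.optimize();
            writer.close();
        }
    }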
Any hints much appreciated.
Cheers,
Nick.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]