You're right. I guess I misunderstood the term "hard limit"
when talking about file descriptor limits.
Still, why is Nutch opening so many file descriptors during
merge or reparse? 2000+ open file descriptors doesn't seem
intentional. Plus, my DB is not that big (~1M pages).
Has anyone seen this with 0.8?
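In case it helps anyone dig into this, a quick way to see what those
descriptors actually point at while the merge runs (assuming a Linux box,
with <pid> standing in for the Nutch JVM's process id) is something like:

    ls /proc/<pid>/fd | wc -l    # total open descriptors for the JVM
    lsof -p <pid>                # list them with the file names they map to

If most of them turn out to be segment or index part files, that would at
least show whether lots of small files are being held open at once.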
You can use ulimit -n to increase the limit on Unix/Linux systems.
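For example (4096 is just an illustrative value):

    ulimit -Sn          # show the current soft limit
    ulimit -Hn          # show the hard limit
    ulimit -n 4096      # raise the limit for this shell and its children
                        # (a non-root user can only go up to the hard limit)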
Rgrds, Thomas
On 6/13/06, Howie Wang <[EMAIL PROTECTED]> wrote:
Hi,
I think I remember seeing some messages about "Too many open files"
when merging a while ago. I recently started getting this on Nutch 0.7
using JDK 1.4.2 on WinXP while I was trying to reparse a segment.
I looked around and found this Java bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4189011
It looks like on Windows you can work around this by switching to
JDK 1.5. I didn't recompile, since I thought 0.7 didn't compile against
JDK 1.5, but it seemed to run fine for me just using the 1.4-compiled
jar file with the 1.5 Java executable.
It still seems that this will be an issue on other platforms that have
hard limits on the number of file descriptors. Is there a file descriptor
leak in the merge or parsing code?
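One rough way to check, at least on Linux, would be to poll the JVM's
descriptor count while the merge or parse runs (<pid> being the Nutch
process id); a count that climbs steadily across segments, instead of
rising and falling with each one, would point to a leak:

    while true; do
        ls /proc/<pid>/fd | wc -l    # open descriptors right now
        sleep 10
    done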
Howie