Hi,
We ran into a problem in Nutch with
MapFileOutputFormat#getReaders and getEntry.
In detail, this happens during summary generation, where for each
segment we open as many readers as there are parts (part-0000 to
part-n).
With 80 tasktrackers and 80 segments this means:
80 x 80 x 4 (parseData, parseText, content, crawl) = 25,600 readers.
A search server additionally needs to open as many files as the index
searcher requires.
The result is a FileNotFoundException ("Too many open files").
Opening and closing readers for each detail request makes no sense. We
could limit the number of open readers somehow and close the ones
that have been unused the longest.
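A minimal sketch of that idea, assuming a plain LinkedHashMap in
access order with a cap (the real value type would be MapFile.Reader;
AutoCloseable stands in here, and the class name ReaderCache is made
up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Keeps at most maxOpen readers open; when the cap is exceeded,
 * the least-recently-used reader is closed and evicted.
 */
class ReaderCache<K, V extends AutoCloseable> extends LinkedHashMap<K, V> {
    private final int maxOpen;

    ReaderCache(int maxOpen) {
        super(16, 0.75f, true); // accessOrder = true -> LRU iteration order
        this.maxOpen = maxOpen;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxOpen) {
            try {
                eldest.getValue().close(); // close the evicted reader
            } catch (Exception ignored) {
            }
            return true; // remove the eldest entry from the map
        }
        return false;
    }
}
```

On a cache miss the caller would open the reader for that part file
and put it in the map; get() then counts as a use and keeps it alive.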
But I'm not entirely happy with this solution, so any thoughts on how
we can solve this problem more generally?
Thanks.
Stefan
P.S. We also noticed that the .crc files double the number of open files.