Andrzej Bialecki wrote:
[EMAIL PROTECTED] wrote:
Interesting to know. However, I never had such luck: every time, I got an unexpected EOF exception.

Yeah, that's the symptom of missing index.

I thought I'd fixed this some time ago. One still might get an EOF when iterating through entries from a truncated segment, but no longer when opening it. So it should always be possible to read all the entries that were flushed: an index file should always be present, and EOF on the index file should be trapped, generating only a warning.


Maybe this would be one of the useful improvements to make Nutch more error-resistant.


Actually, it is possible to make it more resilient to crashes in two steps. First, set MapFile.Writer.setIndexInterval() to a smaller value (the default is 128; most likely it should be read from the config). Second, make the BufferedRandomAccessFile.flushBuffer() method public, so that SequenceFile.Writer can call it after each index append. This way the index is not only written out immediately (as if it were unbuffered) but also more frequently, resulting in smaller "chunks" of possibly lost data.
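
To make the idea above concrete, here is a minimal, self-contained sketch of "smaller index interval plus a flush after every index append". It does not use the Nutch classes: IndexedWriter, dataDisk, indexDisk and durableIndexEntries() are hypothetical names invented for illustration, and in-memory byte arrays stand in for the files on disk. The 12-byte index entry layout is also an assumption of the sketch, not the MapFile format.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class IndexIntervalSketch {

    /** Hypothetical writer for illustration only; NOT Nutch's MapFile.Writer. */
    static final class IndexedWriter {
        // The byte arrays play the role of the files; the buffered streams
        // play the role of the in-memory write buffers.
        private final ByteArrayOutputStream dataDisk = new ByteArrayOutputStream();
        private final ByteArrayOutputStream indexDisk = new ByteArrayOutputStream();
        private final DataOutputStream data =
                new DataOutputStream(new BufferedOutputStream(dataDisk));
        private final DataOutputStream index =
                new DataOutputStream(new BufferedOutputStream(indexDisk));
        private final int indexInterval;
        private int count = 0;

        IndexedWriter(int indexInterval) {
            this.indexInterval = indexInterval;
        }

        void append(String value) throws IOException {
            if (count % indexInterval == 0) {
                // Push buffered records out first, so the index never points
                // past what has actually reached the data file.
                data.flush();
                index.writeInt(count);        // record number (4 bytes)
                index.writeLong(data.size()); // offset of the next record (8 bytes)
                index.flush();                // flush right after the index append
            }
            data.writeUTF(value);
            count++;
        }

        /** Index entries that have reached the "disk" (12 bytes each). */
        int durableIndexEntries() {
            return indexDisk.size() / 12;
        }
    }

    static int durableEntriesAfter(int records, int interval) throws IOException {
        IndexedWriter writer = new IndexedWriter(interval);
        for (int i = 0; i < records; i++) {
            writer.append("record-" + i);
        }
        // No close(): this simulates a crash after the last append.
        return writer.durableIndexEntries();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("interval 128: " + durableEntriesAfter(300, 128));
        System.out.println("interval 16:  " + durableEntriesAfter(300, 16));
    }
}
```

With 300 records and no close(), the interval of 128 leaves index entries for records 0, 128 and 256, while an interval of 16 leaves one for every 16th record up to 288, so the tail that sits after the last flushed index entry is correspondingly shorter.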

Are you certain that the index is the problem?

Perhaps instead one could just trap EOF in MapFile.Reader.next() to generate a warning and return null?
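
A rough sketch of what trapping EOF in next() could look like, using only java.io. RecordReader, writeRecords and readAll are hypothetical stand-ins, not the actual MapFile.Reader code, and the record format (DataOutputStream.writeUTF) is chosen only to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TruncatedReaderSketch {

    /** Hypothetical record reader for illustration; NOT Nutch's MapFile.Reader. */
    static final class RecordReader {
        private final DataInputStream in;

        RecordReader(byte[] bytes) {
            this.in = new DataInputStream(new ByteArrayInputStream(bytes));
        }

        /** Returns the next record, or null at end of data or on a truncated record. */
        String next() throws IOException {
            if (in.available() == 0) {
                return null; // clean end of data, nothing to warn about
            }
            try {
                return in.readUTF();
            } catch (EOFException e) {
                // The last record was cut off: warn and stop instead of failing.
                System.err.println("WARNING: unexpected EOF, file appears truncated");
                return null;
            }
        }
    }

    static byte[] writeRecords(String... records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String record : records) {
            out.writeUTF(record);
        }
        out.flush();
        return bytes.toByteArray();
    }

    static List<String> readAll(byte[] bytes) throws IOException {
        RecordReader reader = new RecordReader(bytes);
        List<String> result = new ArrayList<>();
        String record;
        while ((record = reader.next()) != null) {
            result.add(record);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] full = writeRecords("alpha", "beta", "gamma");
        // Drop the last two bytes to simulate a crash in the middle of a write.
        byte[] truncated = Arrays.copyOf(full, full.length - 2);
        System.out.println(readAll(truncated));
    }
}
```

Reading the truncated array yields the two complete records and a warning on stderr, so a caller that loops until next() returns null gets everything that was fully written.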

Doug


