On Dec 20, 2007, at 10:55 AM, Yonik Seeley wrote:

On Dec 20, 2007 9:41 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
I'm wondering if the IndexWriter should throw an explicit exception in
this case as opposed to a RuntimeException,

RuntimeExceptions can happen in analysis components during indexing
anyway, so it seems like indexing code should deal with exceptions
just to be safe.  As long as exceptions happening during indexing
don't mess up the indexing code, everything should be OK.

as it seems to me really
long tokens should be handled more gracefully.  It seems strange that
the message says the terms were skipped (which the code does in fact
do), but then there is a RuntimeException thrown which usually
indicates to me the issue is not recoverable.

It does seem like the document shouldn't be added at all if it caused
an exception.
Is that what happens if one of the analyzers causes an exception to be thrown?
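
As a rough illustration of indexing code that "deals with exceptions just to be safe", here is a minimal sketch in plain Java. It assumes the indexing call either adds a document completely or throws without adding it, which is exactly the open question above. The names SafeIndexingSketch, indexAll and the addDocument callback are hypothetical stand-ins, not Lucene's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: the Consumer stands in for whatever call adds a
// document to the index. It is NOT Lucene's IndexWriter API.
final class SafeIndexingSketch {

    /** Outcome of a batch: documents that were added and documents that failed. */
    static final class Result {
        final List<String> indexed = new ArrayList<>();
        final List<String> rejected = new ArrayList<>();
    }

    /**
     * Tries to add every document. A RuntimeException from one document
     * (for example from an analysis component) is recorded, and the loop
     * continues with the next document instead of aborting the batch.
     */
    static Result indexAll(List<String> docs, Consumer<String> addDocument) {
        Result result = new Result();
        for (String doc : docs) {
            try {
                addDocument.accept(doc);
                result.indexed.add(doc);
            } catch (RuntimeException e) {
                // Assumption: the failed document left no partial state behind.
                result.rejected.add(doc);
            }
        }
        return result;
    }
}
```

The caller can then inspect the rejected list and decide whether to retry, log, or give up, rather than treating the first exception as fatal.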

The other option is to simply ignore tokens above 16K... I'm not sure
what's right here.

+1. The code already does ignore them, that is why the exception seems so weird. DocsWriter gracefully handles the problem, but then throws up after the fact. I would vote to just log it or let the user decide somehow.
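
To make the "ignore the token and let the user decide" option concrete, here is a minimal sketch in plain Java. It is not how DocumentsWriter is implemented: LongTokenSketch, keepIndexable and the onSkipped callback are hypothetical names, and the length limit is a parameter because the thread only gives it as roughly 16K.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a "skip and report" policy for over-long tokens.
final class LongTokenSketch {

    /**
     * Returns the tokens whose length does not exceed maxLength.
     * Each over-long token is passed to onSkipped instead of causing
     * an exception, so the caller chooses whether to log it, count it,
     * or fail the document.
     */
    static List<String> keepIndexable(List<String> tokens,
                                      int maxLength,
                                      Consumer<String> onSkipped) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() > maxLength) {
                onSkipped.accept(token);   // user-supplied policy
            } else {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

A caller that wants the stricter behaviour can pass a callback that throws, while a caller that only wants a warning can pass one that logs the token length.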


