Re: DocumentsWriter.checkMaxTermLength issues

Grant Ingersoll Thu, 20 Dec 2007 08:37:04 -0800


On Dec 20, 2007, at 10:55 AM, Yonik Seeley wrote:

On Dec 20, 2007 9:41 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:

I'm wondering if the IndexWriter should throw an explicit exception in
this case as opposed to a RuntimeException,


RuntimeExceptions can happen in analysis components during indexing
anyway, so it seems like indexing code should deal with exceptions
just to be safe.  As long as exceptions happening during indexing
don't mess up the indexing code, everything should be OK.

as it seems to me really
long tokens should be handled more gracefully.  It seems strange that
the message says the terms were skipped (which the code does in fact
do), but then there is a RuntimeException thrown which usually
indicates to me the issue is not recoverable.


It does seem like the document shouldn't be added at all if it caused
an exception.

Is that what happens if one of the analyzers causes an exception to be
thrown?


The other option is to simply ignore tokens above 16K... I'm not sure
what's right here.

+1. The code already does ignore them, which is why the exception seems
so weird. DocumentsWriter gracefully handles the problem, but then
throws an exception after the fact. I would vote to just log it or let
the user decide somehow.
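
To make the "let the user decide" option concrete, here is a minimal
sketch in plain Java. It is not Lucene code and not a proposed patch:
the class LongTermPolicySketch, the Policy enum, and filterTerms are
hypothetical names, and the 16 * 1024 limit is only a stand-in for the
"16K" mentioned above. It shows the three behaviours discussed in this
thread: skip quietly, skip and report, or fail up front.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class LongTermPolicySketch {

    // Assumed limit, standing in for the "16K" figure from the thread.
    static final int ASSUMED_MAX_TERM_LENGTH = 16 * 1024;

    // What the caller wants to happen when a term is too long.
    enum Policy { SKIP_SILENTLY, SKIP_AND_REPORT, FAIL }

    // Terms that survived, plus any warnings collected along the way.
    static final class Result {
        final List<String> kept = new ArrayList<>();
        final List<String> warnings = new ArrayList<>();
    }

    static Result filterTerms(List<String> terms, int maxLength, Policy policy) {
        Result result = new Result();
        for (String term : terms) {
            if (term.length() <= maxLength) {
                result.kept.add(term);
                continue;
            }
            switch (policy) {
                case FAIL:
                    // Fail before anything is dropped, so the caller
                    // never sees a half-handled document.
                    throw new IllegalArgumentException(
                        "term of length " + term.length()
                            + " exceeds max " + maxLength);
                case SKIP_AND_REPORT:
                    // Drop the term but tell the caller about it.
                    result.warnings.add(
                        "skipped term of length " + term.length());
                    break;
                case SKIP_SILENTLY:
                    // Drop the term and say nothing.
                    break;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("short", "muchtoolongterm");
        Result r = filterTerms(terms, 8, Policy.SKIP_AND_REPORT);
        System.out.println("kept=" + r.kept + " warnings=" + r.warnings);
    }
}
```

The point of the sketch is that skipping and throwing are alternatives
chosen by the caller, rather than both happening for the same term.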


