Yonik Seeley wrote:
On Dec 20, 2007 9:41 AM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
I'm wondering if the IndexWriter should throw an explicit exception in
this case as opposed to a RuntimeException,
RuntimeExceptions can happen in analysis components during indexing
anyway, so it seems like indexing code should deal with exceptions
just to be safe. As long as exceptions happening during indexing
don't mess up the indexing code, everything should be OK.
as it seems to me really
long tokens should be handled more gracefully. It seems strange that
the message says the terms were skipped (which the code does in fact
do), but then there is a RuntimeException thrown which usually
indicates to me the issue is not recoverable.
It does seem like the document shouldn't be added at all if it caused
an exception.
Is that what happens if one of the analyzers causes an exception to
be thrown?
The other option is to simply ignore tokens above 16K... I'm not sure
what's right here.
Though ... we could simply immediately delete the document when any
exception occurs during its processing. So if we think whenever any
doc hits an exception, then it should be deleted, it's not so hard to
implement that policy...
Mike
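
For illustration only, the two policies discussed above (skip over-long
tokens, or delete the whole document when any exception occurs while
processing it) could be sketched as below. This is a toy in-memory
model, not Lucene's IndexWriter: the class RollbackOnErrorIndexer, its
methods, and the skipLongTerms flag are all hypothetical, and the 16K
limit is taken from the figure mentioned in the thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy in-memory indexer (hypothetical; this is not Lucene code). */
final class RollbackOnErrorIndexer {
    /** Assumed limit, standing in for the 16K token cap mentioned in the thread. */
    static final int MAX_TERM_LENGTH = 16 * 1024;

    private final List<List<String>> docs = new ArrayList<>();
    private final boolean skipLongTerms;

    /** @param skipLongTerms true = drop over-long tokens; false = fail the document */
    RollbackOnErrorIndexer(boolean skipLongTerms) {
        this.skipLongTerms = skipLongTerms;
    }

    /** Adds a document; if processing throws, the document is removed again. */
    void addDocument(List<String> tokens) {
        List<String> doc = new ArrayList<>();
        docs.add(doc); // the document is registered before its tokens are processed
        try {
            for (String token : tokens) {
                if (token.length() > MAX_TERM_LENGTH) {
                    if (skipLongTerms) {
                        continue; // policy 1: ignore the over-long token
                    }
                    throw new IllegalArgumentException(
                        "token too long: " + token.length() + " chars");
                }
                doc.add(token);
            }
        } catch (RuntimeException e) {
            // policy 2: any exception deletes the partially indexed document
            docs.remove(docs.size() - 1);
            throw e; // the caller still sees the failure
        }
    }

    int docCount() {
        return docs.size();
    }

    int termCount(int docId) {
        return docs.get(docId).size();
    }
}
```

With skipLongTerms set to false, a document containing an over-long
token raises an exception and leaves docCount() unchanged; with it set
to true, the same document is kept and only the over-long token is
dropped.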