I am getting the following exception when running against trunk:
java.lang.IllegalArgumentException: at least one term (length 20079) exceeds max term length 16383; these terms were skipped
    at org.apache.lucene.index.IndexWriter.checkMaxTermLength(IndexWriter.java:1545)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1451)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
....
I'm wondering whether IndexWriter should throw an explicit exception in
this case, as opposed to a RuntimeException (here an
IllegalArgumentException), since it seems to me that really long tokens
should be handled more gracefully. It seems strange that the message
says the terms were skipped (which the code does in fact do), yet a
RuntimeException is then thrown, which to me usually indicates that the
issue is not recoverable. I am using the StandardTokenizer, but I don't
think that matters much.
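
To illustrate what I mean by "more gracefully", here is a minimal
sketch in plain Java. It is not Lucene code: TermLengthGuard, Result,
and split are hypothetical names, and the 16383 limit is simply copied
from the exception message. It also measures length with
String.length(), which may not match how IndexWriter counts. The idea
is only that over-long terms get set aside and reported to the caller
instead of aborting the whole addDocument call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene: sets over-long terms aside instead of throwing. */
class TermLengthGuard {
    // Assumption: limit copied from the exception message, not read from Lucene.
    static final int MAX_TERM_LENGTH = 16383;

    /** Terms that fit within the limit, and terms that were skipped. */
    static final class Result {
        final List<String> kept = new ArrayList<>();
        final List<String> skipped = new ArrayList<>();
    }

    /** Partitions terms by length; nothing is thrown for over-long terms. */
    static Result split(List<String> terms, int maxLength) {
        Result result = new Result();
        for (String term : terms) {
            if (term.length() > maxLength) {
                result.skipped.add(term);
            } else {
                result.kept.add(term);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Build one term of the same length as the one in the report.
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 20079; i++) {
            big.append('x');
        }
        List<String> terms = new ArrayList<>();
        terms.add("lucene");
        terms.add(big.toString());
        terms.add("index");

        Result result = split(terms, MAX_TERM_LENGTH);
        System.out.println("kept=" + result.kept
                + ", skipped " + result.skipped.size() + " term(s)");
    }
}
```

With something like this, the caller could decide whether a skipped
term is worth a warning or a failure, rather than having that decided
by an unchecked exception.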
Any thoughts on this?
-Grant