I am getting the following exception when running against trunk:
java.lang.IllegalArgumentException: at least one term (length 20079) exceeds max term length 16383; these terms were skipped
    at org.apache.lucene.index.IndexWriter.checkMaxTermLength(IndexWriter.java:1545)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1451)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
....

I'm wondering whether IndexWriter should throw a more explicit exception in this case rather than a RuntimeException (IllegalArgumentException is unchecked), since it seems to me that really long tokens should be handled more gracefully. It seems strange that the message says the terms were skipped (which the code does in fact do), yet a RuntimeException is then thrown, which to me usually indicates the issue is not recoverable. I am using the StandardTokenizer, but I don't think that matters much.
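
To make "handled more gracefully" a bit more concrete, here is a rough sketch in plain Java of the behavior I have in mind: skip the oversized terms, keep the rest, and report what was skipped instead of throwing. This is not Lucene code. LongTermFilter, Result and filter are hypothetical names made up for illustration, the 16383 limit is simply copied from the message above, and I am assuming the length is measured in chars.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates "skip and report"
// instead of "skip and throw" for terms that exceed a maximum length.
final class LongTermFilter {

    // Limit copied from the exception message; assumed to be a char count.
    static final int MAX_TERM_LENGTH = 16383;

    // Outcome of filtering: the surviving terms plus a summary of what was dropped.
    static final class Result {
        final List<String> kept = new ArrayList<String>();
        int skipped = 0;         // number of terms dropped
        int longestSkipped = 0;  // length of the longest dropped term
    }

    // Keeps every term whose length is at most maxLen and counts the rest.
    static Result filter(List<String> terms, int maxLen) {
        Result result = new Result();
        for (String term : terms) {
            if (term.length() > maxLen) {
                result.skipped++;
                result.longestSkipped = Math.max(result.longestSkipped, term.length());
            } else {
                result.kept.add(term);
            }
        }
        return result;
    }
}
```

The caller can then inspect skipped and longestSkipped and decide for itself whether to log a warning or fail, rather than having that decision forced by an unchecked exception.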

Any thoughts on this?

-Grant

