[ 
https://issues.apache.org/jira/browse/LUCENE-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548435
 ] 

Michael Busch commented on LUCENE-1072:
---------------------------------------

Thanks for the quick fix, Mike. All unit tests, incl. the new one, pass.

I also applied this patch to the Lucene version in our app, and it
works fine now: even after the TokenStream throws a RuntimeException,
the DocumentsWriter is still usable for subsequent docs.
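
For illustration, here is a minimal sketch of the caller-side pattern this fix is meant to make safe: catch the exception for the offending document and keep adding the rest. It deliberately does not use Lucene; DocSink, ToySink and addAll are hypothetical stand-ins invented for this example, and only the 16383 limit and the 37944 term length are taken from the report quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SkipBadDocs {
    // Limit quoted in the exception message of the report.
    static final int MAX_TERM_LENGTH = 16383;

    /** Hypothetical stand-in for IndexWriter.addDocument; not Lucene API. */
    interface DocSink {
        void add(String text);
    }

    /** Toy sink that rejects documents containing an over-long term. */
    static class ToySink implements DocSink {
        final List<String> indexed = new ArrayList<>();

        @Override
        public void add(String text) {
            // Split on whitespace and check every term before accepting the doc.
            for (String term : text.split("\\s+")) {
                if (term.length() > MAX_TERM_LENGTH) {
                    throw new IllegalArgumentException("term length " + term.length()
                            + " exceeds max term length " + MAX_TERM_LENGTH);
                }
            }
            indexed.add(text);
        }
    }

    /** Adds every document, skipping those that throw; returns the number skipped. */
    static int addAll(DocSink sink, List<String> docs) {
        int skipped = 0;
        for (String doc : docs) {
            try {
                sink.add(doc);
            } catch (RuntimeException e) {
                // The sink is expected to remain usable for the next document.
                skipped++;
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        char[] big = new char[37944]; // term length from the report
        Arrays.fill(big, 'x');
        ToySink sink = new ToySink();
        int skipped = addAll(sink,
                Arrays.asList("first doc", new String(big), "third doc"));
        System.out.println("skipped=" + skipped + " indexed=" + sink.indexed.size());
    }
}
```

With a real IndexWriter the same loop would wrap addDocument; whether the writer survives the exception is exactly what the patch addresses.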

+1 for committing this soon!!

> NullPointerException during indexing in DocumentsWriter$ThreadState$FieldData.addPosition
> -----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1072
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1072
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.3
>         Environment: Linux CentOS 5 x86_64 running on 2-core Pentium D, Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_01-b06, mixed mode), using lucene-core-2007-11-29_02-49-31
>            Reporter: Alexei Dets
>            Assignee: Michael McCandless
>             Fix For: 2.3
>
>         Attachments: LUCENE-1072.patch, LUCENE-1072.take2.patch
>
>
> In my case, documents with unusually large "words" (in fact,
> text-encoded images) sometimes appear during indexing.
> An attempt to add a document that contains a field with such a token
> produces a java.lang.IllegalArgumentException:
> java.lang.IllegalArgumentException: term length 37944 exceeds max term length 16383
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.addPosition(DocumentsWriter.java:1492)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.invertField(DocumentsWriter.java:1321)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1247)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:972)
>         at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2202)
>         at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2186)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1432)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
> This is expected; the exception is caught and ignored. The problem is that
> afterwards the IndexWriter is left somewhat corrupted, and subsequent attempts
> to add documents to the index fail as well, this time with an NPE:
> java.lang.NullPointerException
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.addPosition(DocumentsWriter.java:1497)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.invertField(DocumentsWriter.java:1321)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1247)
>         at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:972)
>         at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2202)
>         at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2186)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1432)
>         at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1411)
> This is 100% reproducible.


