[ http://issues.apache.org/jira/browse/LUCENE-480?page=comments#action_12359755 ]
Jeremy Calvert commented on LUCENE-480:
---------------------------------------

A little more data:

  int fieldNumber = fieldsStream.readVInt();

on line 68 of FieldsReader.java results in fieldNumber = 221997 for my particular fieldsStream, so it would seem that my proposed patch would indeed just gloss over a larger problem wherein the fieldsStream is getting corrupted.

On the other hand, having this cause an NPE seems less than ideal. Is there some way to throw an exception that's more indicative of the stream corruption?

In any case, I'm tracing back how this happened in the first place. I would simply give you the code and data to reproduce it, but the data is ~500M worth. Stay tuned!

> NullPointerException during IndexWriter.mergeSegments
> -----------------------------------------------------
>
>          Key: LUCENE-480
>          URL: http://issues.apache.org/jira/browse/LUCENE-480
>      Project: Lucene - Java
>         Type: Bug
>   Components: Index
>     Versions: CVS Nightly - Specify date in submission, 1.9
>  Environment: 64bit, ubuntu, Java 5 SE
>     Reporter: Jeremy Calvert
>
> Last commit on culprit org.apache.lucene.index.FieldsReader: Sun Oct 30
> 05:38:46 2005.
> ---------------------------------------------------------
> Offending code in FieldsReader.java:
> ...
>   final Document doc(int n) throws IOException {
>     indexStream.seek(n * 8L);
>     long position = indexStream.readLong();
>     fieldsStream.seek(position);
>     Document doc = new Document();
>     int numFields = fieldsStream.readVInt();
>     for (int i = 0; i < numFields; i++) {
>       int fieldNumber = fieldsStream.readVInt();
>       FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
>       //
>       // This apparently returns null, presumably either as a result of:
>       //   catch (IndexOutOfBoundsException ioobe) {
>       //     return null;
>       //   }
>       // in fieldInfos.fieldInfo(int fieldNumber)
>       // - or -
>       // because there's a null member of member ArrayList byNumber of FieldInfos
>       byte bits = fieldsStream.readByte();
>
>       boolean compressed = (bits & FieldsWriter.FIELD_IS_COMPRESSED) != 0;
>       ....
>       Field.Store store = Field.Store.YES;
>       //
>       // Here --v is where the NPE is thrown.
>       if (fi.isIndexed && tokenize)
>         index = Field.Index.TOKENIZED;
> ...
> ---------------------------------------------------------
> Proposed Patch:
> I'm not sure what the behavior should be in this case, but if it's no big
> deal that there's null field info for an index and we should just ignore that
> index, an obvious patch could be:
> In FieldsReader.java:
> ...
>     for (int i = 0; i < numFields; i++) {
>       int fieldNumber = fieldsStream.readVInt();
>       FieldInfo fi = fieldInfos.fieldInfo(fieldNumber);
>       // vvvPatchvvv
>       if(fi == null) {continue;}
>       byte bits = fieldsStream.readByte();
> ...
> ---------------------------------------------------------
> Other observations:
> In my search prior to submitting this issue, I found LUCENE-168, which looks
> similar, and is perhaps related, but if so, I'm not sure exactly how.

--
This message is automatically generated by JIRA.
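On the question in the comment about raising something more indicative than an NPE, here is a minimal standalone sketch. It is not Lucene's code: the class FieldNumberCheck, the helper checkFieldNumber and the field count of 12 are all made up for illustration. readVInt follows the usual variable-length int layout (7 data bits per byte, low-order group first, high bit set on every byte but the last). The three input bytes are simply one sequence that decodes to 221997; the actual bytes in the reporter's stream are not known.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class FieldNumberCheck {

    // Decodes a variable-length int: 7 data bits per byte, low-order group
    // first, high bit set on every byte except the last.
    static int readVInt(InputStream in) throws IOException {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("stream ended inside a VInt");
            }
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IOException("VInt longer than 5 bytes");
    }

    // Hypothetical check (not a Lucene API): reject a field number that no
    // known field can have, with a message that points at the stream.
    static int checkFieldNumber(int fieldNumber, int fieldCount) throws IOException {
        if (fieldNumber < 0 || fieldNumber >= fieldCount) {
            throw new IOException("Corrupt fields stream? Read field number "
                    + fieldNumber + " but only " + fieldCount + " fields are defined");
        }
        return fieldNumber;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative bytes only: 45 + (70 << 7) + (13 << 14) = 221997.
        byte[] bytes = {(byte) 0xAD, (byte) 0xC6, (byte) 0x0D};
        int fieldNumber = readVInt(new ByteArrayInputStream(bytes));
        System.out.println(fieldNumber); // prints 221997

        try {
            checkFieldNumber(fieldNumber, 12); // 12 is an arbitrary field count
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at the point where the field number is read would name the bad value directly instead of surfacing later as a null FieldInfo. Whether IOException or a more specific type is the right choice for FieldsReader is left open here.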