Mulugeta Mammo created LUCENE-8624:
--------------------------------------

             Summary: ByteBuffersDataOutput Integer Overflow
                 Key: LUCENE-8624
                 URL: https://issues.apache.org/jira/browse/LUCENE-8624
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/store
    Affects Versions: 7.5
            Reporter: Mulugeta Mammo
             Fix For: 7.5

Hi,

When indexing large data sets with ByteBuffersDirectory, an exception like the one below is thrown:

Caused by: java.lang.IllegalArgumentException: cannot write negative vLong (got: -4294888321)
    at org.apache.lucene.store.DataOutput.writeVLong(DataOutput.java:225)
    at org.apache.lucene.codecs.lucene50.Lucene50SkipWriter.writeSkipData(Lucene50SkipWriter.java:182)
    at org.apache.lucene.codecs.MultiLevelSkipListWriter.bufferSkip(MultiLevelSkipListWriter.java:143)
    at org.apache.lucene.codecs.lucene50.Lucene50SkipWriter.bufferSkip(Lucene50SkipWriter.java:162)
    at org.apache.lucene.codecs.lucene50.Lucene50PostingsWriter.startDoc(Lucene50PostingsWriter.java:228)
    at org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:148)
    at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:865)
    at org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:344)
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:169)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:244)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4453)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4075)

The exception is caused by an integer overflow while calling getFilePointer() in Lucene50PostingsWriter, which eventually calls the size() method in ByteBuffersDataOutput.

{code:title=ByteBuffersDataOutput.java|borderStyle=solid}
public long size() {
  long size = 0;
  int blockCount = blocks.size();
  if (blockCount >= 1) {
    int fullBlockSize = (blockCount - 1) * blockSize(); // overflows: multiplication is done in int
    int lastBlockSize = blocks.getLast().position();
    size = fullBlockSize + lastBlockSize;
  }
  return size;
}
{code}

In my case, blockCount was 65 and blockSize() was 33554432, so (65 - 1) * 33554432 = 2147483648, which is one more than Integer.MAX_VALUE and therefore overflows fullBlockSize.

The fix:

{code:title=ByteBuffersDataOutput.java|borderStyle=solid}
public long size() {
  long size = 0;
  int blockCount = blocks.size();
  if (blockCount >= 1) {
    long fullBlockSize = 1L * (blockCount - 1) * blockSize(); // widened to long before multiplying
    int lastBlockSize = blocks.getLast().position();
    size = fullBlockSize + lastBlockSize;
  }
  return size;
}
{code}

Thanks
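
For anyone who wants to reproduce the arithmetic without building an index, here is a minimal stand-alone sketch. It uses no Lucene classes: the class name SizeOverflowDemo, both method names, and the lastBlockPosition parameter (standing in for blocks.getLast().position(), set to 0 here) are hypothetical and only mirror the two variants of size() quoted in this report.

```java
// Hypothetical stand-alone demo; it does not use or modify any Lucene code.
final class SizeOverflowDemo {

    // Mirrors the reported computation: the product is evaluated in 32-bit
    // int arithmetic and only widened to long afterwards, so it can wrap.
    static long sizeWithIntMath(int blockCount, int blockSize, int lastBlockPosition) {
        long size = 0;
        if (blockCount >= 1) {
            int fullBlockSize = (blockCount - 1) * blockSize; // may overflow
            size = fullBlockSize + lastBlockPosition;
        }
        return size;
    }

    // Mirrors the proposed fix: the leading 1L forces the whole product
    // to be evaluated in 64-bit long arithmetic.
    static long sizeWithLongMath(int blockCount, int blockSize, int lastBlockPosition) {
        long size = 0;
        if (blockCount >= 1) {
            long fullBlockSize = 1L * (blockCount - 1) * blockSize;
            size = fullBlockSize + lastBlockPosition;
        }
        return size;
    }

    public static void main(String[] args) {
        int blockCount = 65;        // value from the report
        int blockSize = 33554432;   // value from the report (2^25)
        int lastBlockPosition = 0;  // assumption: last block is empty

        // (65 - 1) * 2^25 = 2^31, which does not fit in an int.
        System.out.println(sizeWithIntMath(blockCount, blockSize, lastBlockPosition));  // -2147483648
        System.out.println(sizeWithLongMath(blockCount, blockSize, lastBlockPosition)); // 2147483648
    }
}
```

If failing fast is preferred over silently wrapping, Math.multiplyExact(int, int) throws an ArithmeticException on overflow and could be used in a test to catch this kind of bug; that is a suggestion, not part of the proposed fix.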