Lukasz Antoniak created CASSANALYTICS-147:
---------------------------------------------
Summary: BufferingInputStream fails to read last unaligned chunk
Key: CASSANALYTICS-147
URL: https://issues.apache.org/jira/browse/CASSANALYTICS-147
Project: Apache Cassandra Analytics
Issue Type: Bug
Components: Reader
Reporter: Lukasz Antoniak
Reading a BTI partition index fails when the file trailer is not aligned to a
4096-byte page boundary.
{code:java}
Caused by: FSReadError
    at org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:157)
    at org.apache.cassandra.io.util.SimpleChunkReader.readChunk(SimpleChunkReader.java:52)
    at org.apache.cassandra.io.util.BufferManagingRebufferer.rebuffer(BufferManagingRebufferer.java:88)
    at org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:82)
    at org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:67)
    at org.apache.cassandra.io.util.RebufferingInputStream.readByte(RebufferingInputStream.java:185)
    at org.apache.cassandra.io.util.RebufferingInputStream.readBigEndianPrimitiveSlowly(RebufferingInputStream.java:149)
    at org.apache.cassandra.io.util.RebufferingInputStream.readLong(RebufferingInputStream.java:243)
    at org.apache.cassandra.io.sstable.format.bti.PartitionIndex.load(PartitionIndex.java:226)
    at org.apache.cassandra.spark.reader.ReaderUtils.keysFromIndex(ReaderUtils.java:256)
    at org.apache.cassandra.spark.reader.ReaderUtils.keysFromIndex(ReaderUtils.java:231)
    at org.apache.cassandra.spark.reader.SSTableCache.lambda$keysFromIndex$1(SSTableCache.java:123)
    at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4903)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3574)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2316)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2190)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2080)
    ... 13 more
{code}
Assume an index file of size 2527234 bytes. {{PartitionIndex}} first reads 4096
bytes at position 2523136, and then the remaining 2 bytes starting at position
2527232. {{BufferingInputStream}} fails to read the last byte due to a +1 shift
in the computed read range. As a consequence, the FINISH marker is added too
soon and an EOF error is raised.
The unit test implemented in the PR demonstrates the faulty behaviour.
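For reference, the read positions quoted above follow from plain page arithmetic. The sketch below only reproduces that arithmetic for the example file size; the class and method names are hypothetical and are not taken from {{BufferingInputStream}} or the PR, and it does not attempt to model the faulty range computation itself.

```java
// Illustrative only: hypothetical helper names, not Cassandra Analytics code.
final class UnalignedTailExample {
    // Page size mentioned in the report.
    static final int PAGE_SIZE = 4096;

    // Start offset of the page that contains the given file position.
    static long pageStart(long position) {
        return position - (position % PAGE_SIZE);
    }

    // Number of bytes available in the page starting at pageStart,
    // capped so the read never goes past the end of the file.
    static int bytesInPage(long pageStart, long fileSize) {
        return (int) Math.min(PAGE_SIZE, fileSize - pageStart);
    }

    public static void main(String[] args) {
        long fileSize = 2527234L;                 // example size from the report
        long lastPage = pageStart(fileSize - 1);  // page holding the final byte
        long prevPage = lastPage - PAGE_SIZE;     // preceding full page

        // Full 4096-byte page, then the unaligned 2-byte tail.
        System.out.println(prevPage + " len=" + bytesInPage(prevPage, fileSize));
        System.out.println(lastPage + " len=" + bytesInPage(lastPage, fileSize));
    }
}
```

With a file size of 2527234, the tail page starts at 2527232 and holds only 2 bytes, so any read range that is off by one at this point either drops a byte or runs past the end of the file.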