Liu, Linhong created SPARK-26068:
------------------------------------

             Summary: ChunkedByteBufferInputStream is truncated by empty chunk
                 Key: SPARK-26068
                 URL: https://issues.apache.org/jira/browse/SPARK-26068
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.0.0
            Reporter: Liu, Linhong

If a ChunkedByteBuffer contains an empty chunk in the middle, the ChunkedByteBufferInputStream is truncated: none of the data after the empty chunk is read.

The problematic code:
{code:java}
// ChunkedByteBuffer.scala
// Assume chunks.next() returns an empty chunk. We then reach the else
// branch whether or not chunks.hasNext is true, so the remaining data is lost.
override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
  if (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
    currentChunk = chunks.next()
  }
  if (currentChunk != null && currentChunk.hasRemaining) {
    val amountToGet = math.min(currentChunk.remaining(), length)
    currentChunk.get(dest, offset, amountToGet)
    amountToGet
  } else {
    close()
    -1
  }
}
{code}
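
For illustration, here is a minimal, self-contained sketch of one possible way to avoid the truncation: keep advancing while the current chunk is exhausted, instead of advancing only once. This is not Spark's code and not necessarily how the issue should be fixed. The names `SkippingChunkedInputStream`, `EmptyChunkDemo`, and `readAll` are hypothetical, and the stream is assumed to wrap a plain `Seq[ByteBuffer]` rather than a real ChunkedByteBuffer.

```scala
import java.io.{ByteArrayOutputStream, InputStream}
import java.nio.ByteBuffer

// Hypothetical stand-in for ChunkedByteBufferInputStream (not Spark's class).
class SkippingChunkedInputStream(buffers: Seq[ByteBuffer]) extends InputStream {
  private val chunks: Iterator[ByteBuffer] = buffers.iterator
  private var currentChunk: ByteBuffer =
    if (chunks.hasNext) chunks.next() else null

  // Skip every exhausted or empty chunk, not just one, before reading.
  private def advance(): Unit = {
    while (currentChunk != null && !currentChunk.hasRemaining && chunks.hasNext) {
      currentChunk = chunks.next()
    }
  }

  override def read(): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      currentChunk.get() & 0xFF // return the byte as an unsigned value
    } else {
      -1
    }
  }

  override def read(dest: Array[Byte], offset: Int, length: Int): Int = {
    advance()
    if (currentChunk != null && currentChunk.hasRemaining) {
      val amountToGet = math.min(currentChunk.remaining(), length)
      currentChunk.get(dest, offset, amountToGet)
      amountToGet
    } else {
      -1
    }
  }
}

object EmptyChunkDemo {
  // Drain a stream through the bulk read method that the report is about.
  def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](4)
    var n = in.read(buf, 0, buf.length)
    while (n != -1) {
      out.write(buf, 0, n)
      n = in.read(buf, 0, buf.length)
    }
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    // An empty chunk sits between two non-empty ones, as in the report.
    val chunks = Seq(
      ByteBuffer.wrap(Array[Byte](1, 2)),
      ByteBuffer.allocate(0),
      ByteBuffer.wrap(Array[Byte](3, 4, 5)))
    println(readAll(new SkippingChunkedInputStream(chunks)).mkString(","))
    // prints 1,2,3,4,5
  }
}
```

With the single `if` from the reported code in place of the `while` loop, the same input would stop after the bytes 1 and 2, because the empty chunk becomes the current chunk and the else branch returns -1. An alternative with the same effect would be to drop empty chunks before iterating over them.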