Romster commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r544083679
##########
File path:
sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlSource.java
##########
@@ -281,7 +283,15 @@ private long getFirstOccurenceOfRecordElement(
break outer;
} else {
// Matching was unsuccessful. Reset the buffer to include bytes read for the char.
- ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
+ int bytesToWrite = buf.remaining() + charBytes.length;
Review comment:
`charBytes` is a 4-byte array holding bytes we have already read from the buffer.
The code below caused the buffer overflow, because `charBytes` can contain input
left over from the previous read from the channel.
```
newbuf.put(charBytes);
offsetInFileOfCurrentByte -= charBytes.length;
while (buf.hasRemaining()) {
newbuf.put(buf.get());
}
```
So when we read `BUF_SIZE` bytes (or `BUF_SIZE - n`, where `n < charBytes.length`)
from the channel into `buf` and reach this branch, we have more bytes to write to
`newbuf` than its capacity allows.
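
To make the arithmetic concrete, here is a minimal standalone sketch rather than the actual `XmlSource` code. The class name `BufferResetSketch`, the helpers `resetFixed`, `resetSized` and `filled`, and the tiny `BUF_SIZE` of 8 are all hypothetical. The use of `Math.max(BUF_SIZE, bytesToWrite)` is also an assumption about how the new buffer might be sized, since the diff above only shows the `bytesToWrite` computation.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

class BufferResetSketch {
  // Hypothetical small value so the overflow is easy to trigger.
  static final int BUF_SIZE = 8;

  // Mirrors the old reset logic: newbuf always has capacity BUF_SIZE.
  static ByteBuffer resetFixed(ByteBuffer buf, byte[] charBytes) {
    ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
    newbuf.put(charBytes);
    while (buf.hasRemaining()) {
      // Throws BufferOverflowException once newbuf is full.
      newbuf.put(buf.get());
    }
    newbuf.flip();
    return newbuf;
  }

  // Assumed fix: size newbuf to hold charBytes plus everything left in buf.
  static ByteBuffer resetSized(ByteBuffer buf, byte[] charBytes) {
    int bytesToWrite = buf.remaining() + charBytes.length;
    ByteBuffer newbuf = ByteBuffer.allocate(Math.max(BUF_SIZE, bytesToWrite));
    newbuf.put(charBytes);
    newbuf.put(buf); // bulk copy of the remaining bytes
    newbuf.flip();
    return newbuf;
  }

  // Returns a buffer in read mode with `remaining` unread bytes,
  // as if it had just been filled from the channel and flipped.
  static ByteBuffer filled(int remaining) {
    ByteBuffer buf = ByteBuffer.allocate(BUF_SIZE);
    buf.put(new byte[remaining]);
    buf.flip();
    return buf;
  }

  public static void main(String[] args) {
    byte[] charBytes = new byte[4];
    try {
      // 7 remaining bytes + 4 charBytes = 11 > BUF_SIZE
      resetFixed(filled(BUF_SIZE - 1), charBytes);
    } catch (BufferOverflowException e) {
      System.out.println("fixed-capacity reset overflowed");
    }
    System.out.println(resetSized(filled(BUF_SIZE - 1), charBytes).remaining());
  }
}
```

With `BUF_SIZE - 1` unread bytes and a 4-byte `charBytes`, the fixed-capacity version needs 11 bytes in an 8-byte buffer and overflows, while the resized version holds all 11.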