Romster commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r544083679



##########
File path: sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlSource.java
##########
@@ -281,7 +283,15 @@ private long getFirstOccurenceOfRecordElement(
               break outer;
             } else {
               // Matching was unsuccessful. Reset the buffer to include bytes read for the char.
-              ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
+              int bytesToWrite = buf.remaining() + charBytes.length;

Review comment:
   `charBytes` is a 4-byte array holding bytes we have already read from the buffer.
   The code below caused the buffer overflow, because `charBytes` can hold input 
from the previous read from the channel:
   ```
                 newbuf.put(charBytes);
                 offsetInFileOfCurrentByte -= charBytes.length;
                 while (buf.hasRemaining()) {
                   newbuf.put(buf.get());
                 }
   ```
   So when the next read puts `BUF_SIZE` bytes (or `BUF_SIZE - n`, where 
`n < charBytes.length`) from the channel into `buf` and we reach this branch, 
there are more bytes to write to `newbuf` than its capacity.
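
   To make the failure concrete, here is a minimal standalone sketch of the 
reset step. It is not the actual `XmlSource` code: the class name, the helper 
methods, the tiny `BUF_SIZE` of 8, and the choice to allocate 
`Math.max(BUF_SIZE, bytesToWrite)` are all assumptions made for illustration. 
Only `bytesToWrite = buf.remaining() + charBytes.length` comes from the diff.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of Beam.
class ResetBufferSketch {
  // Deliberately tiny so the overflow is easy to trigger (assumption).
  static final int BUF_SIZE = 8;

  // Returns a buffer of capacity BUF_SIZE with n unread bytes (n <= BUF_SIZE),
  // standing in for `buf` right after a read from the channel.
  static ByteBuffer filled(int n) {
    ByteBuffer b = ByteBuffer.allocate(BUF_SIZE);
    for (int i = 0; i < n; i++) {
      b.put((byte) i);
    }
    b.flip();
    return b;
  }

  // Old behaviour: the new buffer always has capacity BUF_SIZE.
  static ByteBuffer resetFixed(ByteBuffer buf, byte[] charBytes) {
    ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
    newbuf.put(charBytes);
    while (buf.hasRemaining()) {
      newbuf.put(buf.get()); // throws once newbuf is full
    }
    newbuf.flip();
    return newbuf;
  }

  // Sketch of a fix: size the new buffer from what must actually be written.
  static ByteBuffer resetSized(ByteBuffer buf, byte[] charBytes) {
    int bytesToWrite = buf.remaining() + charBytes.length;
    ByteBuffer newbuf = ByteBuffer.allocate(Math.max(BUF_SIZE, bytesToWrite));
    newbuf.put(charBytes);
    while (buf.hasRemaining()) {
      newbuf.put(buf.get());
    }
    newbuf.flip();
    return newbuf;
  }

  public static void main(String[] args) {
    byte[] charBytes = new byte[4];
    try {
      // 7 unread bytes + 4 char bytes do not fit into 8.
      resetFixed(filled(BUF_SIZE - 1), charBytes);
      System.out.println("fixed-size reset: no overflow");
    } catch (BufferOverflowException e) {
      System.out.println("fixed-size reset: BufferOverflowException");
    }
    ByteBuffer ok = resetSized(filled(BUF_SIZE - 1), charBytes);
    System.out.println("sized reset holds " + ok.remaining() + " bytes");
  }
}
```

   With a nearly full `buf`, the fixed-size version runs out of room after 
copying `BUF_SIZE - charBytes.length` bytes, while the sized version has room 
for everything.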





