Romster commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r544083679



##########
File path: sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlSource.java
##########
@@ -281,7 +283,15 @@ private long getFirstOccurenceOfRecordElement(
               break outer;
             } else {
               // Matching was unsuccessful. Reset the buffer to include bytes read for the char.
-              ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
+              int bytesToWrite = buf.remaining() + charBytes.length;

Review comment:
   `charBytes` is a 4-byte array holding bytes we have already read from the buffer.
   This code caused the buffer overflow, because `charBytes` can hold input from
   the previous read from the channel:
   ```
                 newbuf.put(charBytes);
                 offsetInFileOfCurrentByte -= charBytes.length;
                 while (buf.hasRemaining()) {
                   newbuf.put(buf.get());
                 }
   ```
   So when we read `BUF_SIZE` bytes (or `BUF_SIZE - n` bytes, where
   `n < charBytes.length`) from the channel into `buf` and then reach this branch,
   we have more bytes to write to `newbuf` than its capacity.
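
   A minimal standalone sketch of the overflow and of sizing the new buffer from
   `bytesToWrite`, for illustration only. The class name, the two helper methods,
   the small `BUF_SIZE` value, the `Math.max` floor, and the final `flip()` are
   assumptions made for this example and are not taken from `XmlSource.java`:

   ```java
   import java.nio.BufferOverflowException;
   import java.nio.ByteBuffer;

   public class BufferResetSketch {
     // Hypothetical small size so the overflow is easy to trigger.
     static final int BUF_SIZE = 8;

     // Fixed-size allocation: overflows when
     // charBytes.length + buf.remaining() > BUF_SIZE.
     static ByteBuffer resetFixed(ByteBuffer buf, byte[] charBytes) {
       ByteBuffer newbuf = ByteBuffer.allocate(BUF_SIZE);
       newbuf.put(charBytes);
       while (buf.hasRemaining()) {
         newbuf.put(buf.get());
       }
       newbuf.flip();
       return newbuf;
     }

     // Allocation sized from the bytes that actually need to be written.
     static ByteBuffer resetSized(ByteBuffer buf, byte[] charBytes) {
       int bytesToWrite = buf.remaining() + charBytes.length;
       // Assumption: never allocate less than BUF_SIZE.
       ByteBuffer newbuf = ByteBuffer.allocate(Math.max(BUF_SIZE, bytesToWrite));
       newbuf.put(charBytes);
       newbuf.put(buf); // bulk copy of everything remaining in buf
       newbuf.flip();
       return newbuf;
     }

     public static void main(String[] args) {
       byte[] charBytes = new byte[4];
       // A buffer with BUF_SIZE readable bytes, as after a full channel read.
       ByteBuffer full = ByteBuffer.allocate(BUF_SIZE);

       try {
         resetFixed(full.duplicate(), charBytes);
         System.out.println("fixed: no overflow");
       } catch (BufferOverflowException e) {
         System.out.println("fixed: overflow");
       }

       ByteBuffer resized = resetSized(full.duplicate(), charBytes);
       System.out.println("sized: " + resized.remaining() + " bytes");
     }
   }
   ```

   With 8 readable bytes in `buf` and 4 bytes in `charBytes`, the fixed-size
   variant has to write 12 bytes into an 8-byte buffer, while the sized variant
   allocates room for all 12.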





