Romster commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r544307401
##########
File path:
sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/XmlSourceTest.java
##########
@@ -873,6 +881,46 @@ public void testSplitAtFractionExhaustiveSingleByte()
throws Exception {
assertSplitAtFractionExhaustive(source, options);
}
+ @Test
+ public void testNoBufferOverflowThrown() throws IOException {
+ // The magicNumber was found empirically and will be different for different XML content.
+ // Test with the current setup causes BufferOverflow in
+ // XMLReader#getFirstOccurenceOfRecordElement method,
+ // if the specific corner case is not handled
+ final int magicNumber = 183;
+ StringBuilder sb = new StringBuilder();
Review comment:
I've updated the test; it now uses input like:
```
<trains>
<train>
<trainTags>
<trainTag>0</trainTag>
<trainTag>1</trainTag>
<trainTag>2</trainTag>
...
```
So the input format is not the cause of the issue.
To reproduce the error I had to use TestPipeline, so that the input is split
into bundles:
> INFO: Splitting filepattern /var/folders/j5/2qx0r7453tvd56zpjbstv6fw4m_zm3/T/junit946409719265813413/trainXMLWithTags into bundles of size 126 took 1 ms and produced 1 files and 20 bundles
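
For illustration, input of the shape shown above could be generated with a small helper like the one below. This is only a sketch and not the code from the PR: the class name `TrainXmlSketch`, the method `buildTrainXml`, and the closing tags are assumptions, since the sample above is truncated.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not taken from the pull request.
final class TrainXmlSketch {

  // Builds a document with the nesting quoted in the comment:
  // trains > train > trainTags > trainTag. The closing tags are assumed.
  static String buildTrainXml(int tagCount) {
    StringBuilder sb = new StringBuilder();
    sb.append("<trains>\n<train>\n<trainTags>\n");
    for (int i = 0; i < tagCount; i++) {
      sb.append("<trainTag>").append(i).append("</trainTag>\n");
    }
    sb.append("</trainTags>\n</train>\n</trains>\n");
    return sb.toString();
  }

  public static void main(String[] args) {
    // The tag count here is arbitrary; pick whatever file size is needed.
    String xml = buildTrainXml(100);
    int bytes = xml.getBytes(StandardCharsets.UTF_8).length;
    System.out.println("Generated " + bytes + " bytes of XML");
  }
}
```

A larger `tagCount` yields a larger file, which for a fixed bundle size should result in more bundles when the file is split.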