chamikaramj commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r543921078



##########
File path: sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/XmlSourceTest.java
##########
@@ -873,6 +881,46 @@ public void testSplitAtFractionExhaustiveSingleByte() throws Exception {
     assertSplitAtFractionExhaustive(source, options);
   }
 
+  @Test
+  public void testNoBufferOverflowThrown() throws IOException {
+    // The magicNumber was found imperatively and will be different for different xml content.
+    // Test with the current setup causes BufferOverflow in
+    // XMLReader#getFirstOccurenceOfRecordElement method,
+    // if the specific corner case is not handled
+    final int magicNumber = 183;
+    StringBuilder sb = new StringBuilder();

Review comment:
       I think your input here does not conform to the format required by the 
XML source, which is defined here: 
https://github.com/apache/beam/blob/master/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/XmlIO.java#L68
   
   Specifically, the input has to be in the following format:
   ```
   <root>
   <record> ... </record>
   <record> ... </record>
   <record> ... </record>
   ...
   <record> ... </record>
   </root>
   ```
   But your input has an additional element: `<trainTags><trainTag></trainTag> ... <trainTag></trainTag></trainTags>`
   
   Were you able to reproduce the issue when the input conforms to the required 
format?
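
   For illustration, here is a minimal sketch of how a test could build an input in the format above and check that the root element contains nothing but record elements. The class `ConformingXmlInput`, its methods, and the `<id>` child inside each record are hypothetical names chosen for this example; they are not part of XmlIO or of this PR. The check uses only the JDK's DOM parser.

   ```java
   import java.io.ByteArrayInputStream;
   import java.nio.charset.StandardCharsets;
   import javax.xml.parsers.DocumentBuilderFactory;
   import org.w3c.dom.Document;
   import org.w3c.dom.Element;
   import org.w3c.dom.Node;
   import org.w3c.dom.NodeList;

   // Hypothetical helper for illustration only; not part of Beam.
   public class ConformingXmlInput {

     // Builds <root> with numRecords <record> children and no other elements.
     // The <id> payload inside each record is an assumption for this sketch.
     static String build(int numRecords) {
       StringBuilder sb = new StringBuilder();
       sb.append("<root>\n");
       for (int i = 0; i < numRecords; i++) {
         sb.append("<record><id>").append(i).append("</id></record>\n");
       }
       sb.append("</root>");
       return sb.toString();
     }

     // Returns true if the document element is rootName and every element
     // directly under it is named recordName. Text nodes are ignored.
     static boolean hasOnlyRecordChildren(String xml, String rootName, String recordName) {
       try {
         Document doc =
             DocumentBuilderFactory.newInstance()
                 .newDocumentBuilder()
                 .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
         Element root = doc.getDocumentElement();
         if (!root.getTagName().equals(rootName)) {
           return false;
         }
         NodeList children = root.getChildNodes();
         for (int i = 0; i < children.getLength(); i++) {
           Node child = children.item(i);
           if (child.getNodeType() == Node.ELEMENT_NODE
               && !child.getNodeName().equals(recordName)) {
             return false;
           }
         }
         return true;
       } catch (Exception e) {
         throw new IllegalArgumentException("Input is not well-formed XML", e);
       }
     }

     public static void main(String[] args) {
       // Conforming input: only <record> elements under <root>.
       System.out.println(hasOnlyRecordChildren(build(3), "root", "record"));
       // Non-conforming input: an extra <trainTags> element under <root>.
       String extra =
           "<root><trainTags><trainTag></trainTag></trainTags><record></record></root>";
       System.out.println(hasOnlyRecordChildren(extra, "root", "record"));
     }
   }
   ```

   If the overflow still reproduces with an input built this way, that would confirm the corner case is in the reader rather than in the shape of the test data.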





