Romster commented on a change in pull request #13513:
URL: https://github.com/apache/beam/pull/13513#discussion_r544087349



##########
File path: 
sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/XmlSourceTest.java
##########
@@ -873,6 +881,46 @@ public void testSplitAtFractionExhaustiveSingleByte() throws Exception {
     assertSplitAtFractionExhaustive(source, options);
   }
 
+  @Test
+  public void testNoBufferOverflowThrown() throws IOException {
+    // The magicNumber was found imperatively and will be different for different xml content.
+    // Test with the current setup causes BufferOverflow in
+    // XMLReader#getFirstOccurenceOfRecordElement method,
+    // if the specific corner case is not handled
+    final int magicNumber = 183;
+    StringBuilder sb = new StringBuilder();

Review comment:
   This is an artificial example, but the main condition is that the input contains some number of `<recordBlahBlah>` tags, i.e. tags whose names begin with the record element's name.
   The real case looks like this:
   ```
   <root>
    <record> 
     <recordSomething>
     </recordSomething>
    </record>
    <record> 
     <recordSomething>
     </recordSomething>
    </record>
   ...
    <record> 
     <recordSomething>
     </recordSomething>
    </record>
   </root>
   ```
   The behaviour seems to be environment-dependent, so I'm not sure my example will even reproduce in another environment (it also depends on how many bytes we read from the channel).
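
   For anyone who wants to try this locally, here is a minimal sketch that generates input in the shape shown above. The class `PrefixedTagXml` and its methods are hypothetical helpers, not part of this PR or of Beam, and the record count you need is an assumption you would have to tune for your own environment.

   ```java
   public class PrefixedTagXml {
     // Hypothetical helper (not from the PR): builds XML where every <record>
     // contains a child tag whose name starts with "record".
     static String build(int recordCount) {
       StringBuilder sb = new StringBuilder("<root>\n");
       for (int i = 0; i < recordCount; i++) {
         sb.append(" <record>\n")
             .append("  <recordSomething>\n")
             .append("  </recordSomething>\n")
             .append(" </record>\n");
       }
       return sb.append("</root>\n").toString();
     }

     // Counts (possibly overlapping) occurrences of needle in text.
     static int countOccurrences(String text, String needle) {
       int count = 0;
       for (int i = text.indexOf(needle); i >= 0; i = text.indexOf(needle, i + 1)) {
         count++;
       }
       return count;
     }

     public static void main(String[] args) {
       String xml = build(3);
       System.out.println(xml);
       // A plain search for "<record" hits both the real records and the
       // prefixed children, which is the situation described above.
       System.out.println("\"<record\" matches:  " + countOccurrences(xml, "<record"));
       System.out.println("\"<record>\" matches: " + countOccurrences(xml, "<record>"));
     }
   }
   ```

   The two counts differ because every `<recordSomething>` also matches the `<record` prefix, so a reader looking for the record element has to skip those false candidates. Generating this input does not by itself guarantee the overflow, since that also depends on how the bytes are read from the channel.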




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

