[ https://issues.apache.org/jira/browse/BEAM-10883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238249#comment-17238249 ]
Brian Hulette commented on BEAM-10883:
--------------------------------------
Good find, thanks [~nielsbasjes]. That comment dates back to 2015, and we're
using a version of woodstox (woodstox-asl 4.4.1) from Sept 2014. Maybe we can
resolve this by upgrading woodstox (BEAM-8720)? Or, better yet, maybe the bug in
OpenJDK is resolved by now and we don't need woodstox at all?
> XmlIO parsing of multibyte characters
> -------------------------------------
>
> Key: BEAM-10883
> URL: https://issues.apache.org/jira/browse/BEAM-10883
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Duncan Lew
> Priority: P1
> Labels: Clarified
>
> We are running into issues with parsing multi-byte characters that result in
> duplicates and/or data loss as described in this document:
> https://beam.apache.org/releases/javadoc/2.0.0/org/apache/beam/sdk/io/xml/XmlIO.html