Péter Gergő Barna created BEAM-2094:
---------------------------------------
Summary: WordCount examples produce garbage for non-English input text
Key: BEAM-2094
URL: https://issues.apache.org/jira/browse/BEAM-2094
Project: Beam
Issue Type: Bug
Components: examples-java
Reporter: Péter Gergő Barna
Assignee: Frances Perry
Priority: Trivial
Fix For: First stable release
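WordCount examples produce garbage for non-English input text.
The cause is the split pattern used throughout the WordCount examples. It treats every character other than an ASCII letter or an apostrophe as a word separator, so words containing accented or non-Latin letters are broken apart or dropped entirely: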
word.split("[^a-zA-Z']+")
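A minimal sketch of the behaviour, for illustration only. The class name SplitDemo, the tokenize helper, and the Unicode-aware pattern are assumptions made here and are not part of the issue or of the Beam examples; the empty-token filter is included so the output contains only words. The alternative pattern relies on \p{L}, which java.util.regex matches against any Unicode letter.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    // Pattern quoted in the issue: everything except ASCII letters and
    // the apostrophe is treated as a delimiter.
    static final String ASCII_PATTERN = "[^a-zA-Z']+";

    // Hypothetical alternative (an assumption, not a fix taken from the issue):
    // \p{L} matches any Unicode letter, so accented letters stay inside words.
    static final String UNICODE_PATTERN = "[^\\p{L}']+";

    // Splits a line with the given pattern and drops empty tokens.
    static List<String> tokenize(String line, String pattern) {
        List<String> words = new ArrayList<>();
        for (String word : line.split(pattern)) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // "ni\u00f1o" is the Spanish word written with n-tilde (U+00F1).
        String line = "El ni\u00f1o come";
        System.out.println(tokenize(line, ASCII_PATTERN));   // [El, ni, o, come]
        System.out.println(tokenize(line, UNICODE_PATTERN)); // three words, n-tilde kept
    }
}
```

With the ASCII pattern the single word is counted as the two fragments "ni" and "o". Note that \p{L} does not match combining marks, so text in decomposed Unicode form would still be split unless it is normalized first or the pattern also admits \p{M}.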