Ahmet Altay created BEAM-2386:
---------------------------------
Summary: Change regex used for splitting words
Key: BEAM-2386
URL: https://issues.apache.org/jira/browse/BEAM-2386
Project: Beam
Issue Type: Bug
Components: sdk-py
Reporter: Ahmet Altay
Priority: Minor
The regex used for splitting words ({{[A-Za-z\']+}}) only matches Latin
letters. Change it so that it also works on non-Latin input.
For example, see the Java version:
https://github.com/apache/beam/blob/367fcb28d544934797d25cb34d54136b2d7d6e99/examples/java/src/main/java/org/apache/beam/examples/common/ExampleUtils.java#L75
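A minimal sketch of one possible fix on the Python side, assuming Python 3 and the standard {{re}} module. The names {{ASCII_WORDS}}, {{UNICODE_WORDS}} and {{split_words}} are hypothetical and are not taken from the Beam code base; the Java version linked above may use a different pattern.

```python
import re

# Pattern quoted in this issue: ASCII letters and apostrophes only.
ASCII_WORDS = re.compile(r"[A-Za-z']+")

# Hypothetical replacement (an assumption, not the committed fix):
# \w is Unicode-aware for str patterns, and re.UNICODE makes that explicit.
# Caveat: \w also matches digits and the underscore.
UNICODE_WORDS = re.compile(r"[\w']+", re.UNICODE)


def split_words(text, pattern=UNICODE_WORDS):
    """Return the list of words found in text using the given pattern."""
    return pattern.findall(text)


if __name__ == "__main__":
    sample = "привет мир"
    print(split_words(sample, ASCII_WORDS))  # no Latin letters to match
    print(split_words(sample))               # Cyrillic words are matched
```

If digits and underscores should not count as word characters, a stricter character class such as {{[^\W\d_]}} could be used in place of {{\w}}.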