Adrien Bidault created PIG-4507:
-----------------------------------
Summary: Problem with REGEX which just match for the first word
Key: PIG-4507
URL: https://issues.apache.org/jira/browse/PIG-4507
Project: Pig
Issue Type: Bug
Affects Versions: 0.12.0
Environment: IBM Infosphere BigInsights v3.0.0.1
Reporter: Adrien Bidault
I am trying to strip punctuation and special symbols from a string using
a regular expression of the form "(\\w+)". The problem is that the pattern is
applied only to the first word of the string.
Example:
clean3 = FOREACH clean1 GENERATE id, REGEX_EXTRACT_ALL('toto, likes ... to play ', '(\\w+)');
It just returns "toto" instead of "toto likes to play".
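To make the expected result concrete, here is a minimal sketch in plain Java of the behaviour I am after. It assumes only that the pattern follows java.util.regex semantics; the class and method names (WordExtract, extractWords) are made up for illustration and are not part of Pig. A single capturing group yields one capture per match, so the sketch calls find() repeatedly to collect every word:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: collects every match of (\w+).
class WordExtract {
    static String extractWords(String input) {
        Matcher m = Pattern.compile("(\\w+)").matcher(input);
        List<String> words = new ArrayList<>();
        // Each find() call advances to the next match of the pattern.
        while (m.find()) {
            words.add(m.group(1));
        }
        // Join the captured words with single spaces.
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        // Prints: toto likes to play
        System.out.println(extractWords("toto, likes ... to play "));
    }
}
```

A single match attempt with the same pattern captures only "toto", which is the result I am seeing from the Pig statement above.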
Would you guys have any ideas?