[ 
https://issues.apache.org/jira/browse/PIG-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13217819#comment-13217819
 ] 

Romain Rigaux commented on PIG-2514:
------------------------------------

I can add it to REGEX_EXTRACT_ALL as well; the defaults will just be reversed, 
since REGEX_EXTRACT_ALL uses matches() by default (and REGEX_EXTRACT uses 
find() by default).
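
To illustrate the difference between the two defaults, here is a minimal 
standalone sketch that uses only java.util.regex, not Pig's UDF code. The 
class name FindVsMatches, the extract helper, and its useMatches flag are 
hypothetical names chosen for this example; they only mimic the idea of a 
switchable default and are not the signature of the actual patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Pig.
class FindVsMatches {

    // Returns capture group 1, or null when the pattern does not match.
    // useMatches = true  -> Matcher.matches(): the whole input must match.
    // useMatches = false -> Matcher.find(): the first matching subsequence wins.
    public static String extract(String input, String regex, boolean useMatches) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean matched = useMatches ? m.matches() : m.find();
        return matched ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String a = "hdfs://mygrid.com/projects/";
        String regex = "(.+?)/?";

        // find(): the reluctant group stops after a single character.
        System.out.println(extract(a, regex, false)); // prints "h"

        // matches(): the group must grow until the whole input is consumed.
        System.out.println(extract(a, regex, true));  // prints "hdfs://mygrid.com/projects"
    }
}
```

With find(), the non-greedy group is satisfied by the shortest possible 
prefix, which is why the same regex gives different results depending on the 
default.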
                
> REGEX_EXTRACT not returning correct group with non greedy regex
> ---------------------------------------------------------------
>
>                 Key: PIG-2514
>                 URL: https://issues.apache.org/jira/browse/PIG-2514
>             Project: Pig
>          Issue Type: Bug
>          Components: internal-udfs
>    Affects Versions: 0.11
>            Reporter: Romain Rigaux
>            Assignee: Romain Rigaux
>            Priority: Minor
>             Fix For: 0.11
>
>         Attachments: PIG-2514-doc.patch, PIG-2514.patch
>
>
> Hello,
> REGEX_EXTRACT uses Matcher.find() instead of Matcher.matches(), so it 
> does not work with some non-greedy regular expressions.
> Is this the intended behavior?
> Thanks,
> Romain
> http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html
> The matches method attempts to match the entire input sequence against the 
> pattern.
> The find method scans the input sequence looking for the next subsequence 
> that matches the pattern.
>     System.out.println("Pig's way with m.find()");
>     String a = "hdfs://mygrid.com/projects/";
>     Matcher m = Pattern.compile("(.+?)/?").matcher(a);
>     System.out.println(m.find());
>     System.out.println(m.group(1));
>     System.out.println(m.start());
>     System.out.println(m.end());
>     System.out.println("\nm.matches()");
>     a = "hdfs://mygrid.com/projects/";
>     m = Pattern.compile("(.+?)/?").matcher(a);
>     System.out.println(m.matches());
>     System.out.println(m.group(1));
>     System.out.println(m.start());
>     System.out.println(m.end());
>     System.out.println("\nREGEX_EXTRACT m.find()");
>     Tuple t = TupleFactory.getInstance().newTuple();
>     t.append(a);
>     t.append("(.+?)/?");
>     t.append(1);
>     System.out.println(new TestPigExtractAll().new REGEX_EXTRACT().exec(t));

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
