[ https://issues.apache.org/jira/browse/PIG-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-2514.
-----------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Unit tests pass. test-patch:
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or
modified tests.
[exec]
[exec] -1 javadoc. The javadoc tool appears to have generated 1
warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number
of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs
warnings.
[exec]
[exec] -1 release audit. The applied patch generated 535 release
audit warnings (more than the trunk's current 530 warnings).
The javadoc and release audit warnings are unrelated to this patch.
Patch committed to trunk.
Thanks Romain!
> REGEX_EXTRACT not returning correct group with non greedy regex
> ---------------------------------------------------------------
>
> Key: PIG-2514
> URL: https://issues.apache.org/jira/browse/PIG-2514
> Project: Pig
> Issue Type: Bug
> Components: internal-udfs
> Affects Versions: 0.11
> Reporter: Romain Rigaux
> Assignee: Romain Rigaux
> Priority: Minor
> Fix For: 0.11
>
> Attachments: PIG-2514-doc.patch, PIG-2514.2.patch, PIG-2514.patch
>
>
> Hello,
> REGEX_EXTRACT uses Matcher.find() instead of Matcher.matches(), so it
> does not return the expected group for some non-greedy regular expressions.
> Is this the intended behavior?
> Thanks,
> Romain
> http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html
> The matches method attempts to match the entire input sequence against the
> pattern.
> The find method scans the input sequence looking for the next subsequence
> that matches the pattern.
> System.out.println("Pig's way with m.find()");
> String a = "hdfs://mygrid.com/projects/";
> Matcher m = Pattern.compile("(.+?)/?").matcher(a);
> System.out.println(m.find());    // true
> System.out.println(m.group(1));  // h (lazy group stops after one character)
> System.out.println(m.start());   // 0
> System.out.println(m.end());     // 1
> System.out.println("\nm.matches()");
> a = "hdfs://mygrid.com/projects/";
> m = Pattern.compile("(.+?)/?").matcher(a);
> System.out.println(m.matches()); // true
> System.out.println(m.group(1));  // hdfs://mygrid.com/projects
> System.out.println(m.start());   // 0
> System.out.println(m.end());     // 27
> System.out.println("\nREGEX_EXTRACT m.find()");
> Tuple t = TupleFactory.getInstance().newTuple();
> t.append(a);
> t.append("(.+?)/?");
> t.append(1);
> System.out.println(new TestPigExtractAll().new REGEX_EXTRACT().exec(t));
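
For anyone who wants to reproduce the difference outside of Pig, here is a minimal self-contained sketch of the snippet above. The class name FindVsMatches and the two helper methods are hypothetical and exist only for illustration; they are not part of Pig and do not represent the committed patch. They only contrast what java.util.regex returns for the same pattern under find() and under matches().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Pig.
class FindVsMatches {

    // Returns the requested group from the first subsequence that matches,
    // or null when nothing matches (find() semantics).
    static String extractWithFind(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    // Returns the requested group only when the whole input matches,
    // or null otherwise (matches() semantics).
    static String extractWithMatches(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String input = "hdfs://mygrid.com/projects/";
        String regex = "(.+?)/?";

        // find() accepts the shortest match, so the lazy group is one character.
        System.out.println(extractWithFind(input, regex, 1));    // prints "h"

        // matches() must consume the whole input, so the lazy group expands
        // until only the optional trailing slash is left.
        System.out.println(extractWithMatches(input, regex, 1)); // prints "hdfs://mygrid.com/projects"
    }
}
```

With find(), anchoring the pattern explicitly as "^(.+?)/?$" yields the same group as matches() for this input.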