We debated this a bit internally here. I'm not extracting a group per se; I'm parsing a ZIP code from the end of a postal address field as follows, and it works fine:
REGEX_EXTRACT(address,'[\\d-]+$',0) AS zip

-----Original Message-----
From: Vitalii Tymchyshyn [mailto:tiv...@gmail.com]
Sent: Wednesday, October 09, 2013 6:33 PM
To: user@pig.apache.org
Subject: Re: Documentation bug in REGEX_EXTRACT

Well, usually for regexps, index 0 is the whole match and groups start from 1. Are you sure you are getting a group (the thing in brackets) with 0?

On Oct 8, 2013 at 13:04, "Steve Bernstein" <steve.bernst...@deem.com> wrote:
> Apologies if this is captured elsewhere. In the Pig 0.11.1
> documentation for the builtin function REGEX_EXTRACT
> (http://pig.apache.org/docs/r0.11.1/func.html#regex-extract), the third
> parameter is the index of the matched group to return. The
> documentation says this is a "1-based parameter". That's incorrect; it's
> zero-based.
> E.g., to get the first match instance I used:
> REGEX_EXTRACT(string,'regex',0)
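
To illustrate the numbering convention discussed in this thread, here is a minimal sketch using Python's re module rather than Pig. The sample address is made up, and the sketch assumes only the usual convention (shared by Python and java.util.regex) that index 0 denotes the whole match and parenthesized groups are numbered from 1. It does not show how Pig's REGEX_EXTRACT itself behaves.

```python
import re

# Hypothetical sample address, used only for illustration.
address = "123 Main St, Springfield, IL 62704-1234"

# Same pattern as in the message above; it contains no capturing group.
# (The Pig string literal doubles the backslash; a Python raw string does not.)
no_group = re.search(r"[\d-]+$", address)
whole_match = no_group.group(0)  # index 0 is the entire matched text

# Variant with explicit parentheses: the ZIP is now capturing group 1.
with_group = re.search(r"([\d-]+)$", address)
first_group = with_group.group(1)

print(whole_match, first_group)
```

Under this convention, a pattern with no parentheses has index 0 as its only valid index; in Python, asking for group 1 on such a match raises IndexError.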