Which version of Pig? Are you using a special loader? I just tried with 8.1:

n = load 'tmp/numbers.txt' as (num:chararray);
f = foreach n generate REGEX_EXTRACT($0, '(\\d)', 1);
dump f;

(1)
(2)
(3)
(4)
(5)

-D

On Wed, Jun 22, 2011 at 3:59 PM, Michael May <[email protected]> wrote:
> Hello All,
>
> I'm having an issue where I get a 'ClassCastException:
> org.apache.pig.data.DataByteArray cannot be cast to java.lang.String' when
> passing in something of type chararray to REGEX_EXTRACT.
>
> e.g.
> A = load '/path/to/some/data' ....
> where A has a schema of something like ( f1:chararray, .... )
>
> B = foreach A generate REGEX_EXTRACT( f1, <the regex>, 1 ) as
> regex_extract;
>
> This gives me the above error.
>
> Now, the kicker is that if f1 is of type bytearray, (i.e. the schema is (
> f1:bytearray, ..... ) this works as expected.
>
>
> What gives? Am I using REGEX_EXTRACT wrong? Is this a bug?
> My understanding is that chararray is supposed to be used for things that
> are Strings, which is why I find the 'cannot cast to String' exception a bit
> funky. I've looked through the REGEX_EXTRACT source and looked over the
> JavaDoc's pertaining to DataTypes without being able to crack this.
>
> Any help and information is appreciated!
> Thanks for you time,
>
> Michael
