I'm on Pig 0.8.0, using a custom loader that extends LoadFunc and implements LoadMetadata. I think my custom loader is essentially attempting to do what PigStorageSchema does in PiggyBank. After reading through the PigStorageSchema source, it was pretty obvious that I had overlooked several things in my implementation, so I'm going to go ahead and try PigStorageSchema instead.
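
For reference, here is a rough, untested sketch of what that switch might look like. The jar location is a placeholder, the regex is simply borrowed from Dmitriy's example below, and as far as I can tell the loader expects the data directory to carry the schema file that PigStorageSchema writes when it stores data:

```
-- Placeholder jar location; adjust to wherever piggybank.jar lives.
REGISTER /path/to/piggybank.jar;

-- No AS clause: the loader is expected to supply the schema itself.
A = LOAD '/path/to/some/data'
    USING org.apache.pig.piggybank.storage.PigStorageSchema();

-- Check that f1 really comes back as chararray before using it.
DESCRIBE A;

B = FOREACH A GENERATE REGEX_EXTRACT(f1, '(\\d)', 1) AS regex_extract;
DUMP B;
```

If DESCRIBE shows f1 as chararray and the dump succeeds, that would suggest the ClassCastException came from my loader's schema handling rather than from REGEX_EXTRACT.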

Thanks for the help,
Michael

On Jun 23, 2011, at 3:35 AM, Dmitriy Ryaboy wrote:

> Which version of pig? Are you using a special loader?
> I just tried with 8.1:
>
> n = load 'tmp/numbers.txt' as (num:chararray);
> f = foreach n generate REGEX_EXTRACT($0, '(\\d)', 1);
> dump f;
> (1)
> (2)
> (3)
> (4)
> (5)
>
>
> -D
>
> On Wed, Jun 22, 2011 at 3:59 PM, Michael May <[email protected]> wrote:
>
>> Hello All,
>>
>> I'm having an issue where I get a 'ClassCastException:
>> org.apache.pig.data.DataByteArray cannot be cast to java.lang.String' when
>> passing in something of type chararray to REGEX_EXTRACT.
>>
>> e.g.
>> A = load '/path/to/some/data' ....
>> where A has a schema of something like ( f1:chararray, .... )
>>
>> B = foreach A generate REGEX_EXTRACT( f1, <the regex>, 1 ) as
>> regex_extract;
>>
>> This gives me the above error.
>>
>> Now, the kicker is that if f1 is of type bytearray, (i.e. the schema is (
>> f1:bytearray, ..... ) this works as expected.
>>
>>
>> What gives? Am I using REGEX_EXTRACT wrong? Is this a bug?
>> My understanding is that chararray is supposed to be used for things that
>> are Strings, which is why I find the 'cannot cast to String' exception a bit
>> funky. I've looked through the REGEX_EXTRACT source and looked over the
>> JavaDocs pertaining to DataTypes without being able to crack this.
>>
>> Any help and information is appreciated!
>> Thanks for your time,
>>
>> Michael
