I'm on Pig 0.8.0.

I am using a custom loader that extends LoadFunc and implements 
LoadMetadata. I think my custom loader is essentially attempting to do what 
PigStorageSchema does in PiggyBank. 
After reading through the PigStorageSchema source it was pretty obvious that I 
had overlooked several things in my implementation.  I'm going to go ahead and 
try to use PigStorageSchema.
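
For anyone following along, here is a minimal sketch of what I expect the switch to look like. The jar path and data path are placeholders, the regex is just the one from Dmitriy's example, and I'm assuming the data was stored with PigStorageSchema so that the schema file it reads at load time already sits next to the data. I haven't run this yet.

```pig
-- Placeholder path; point this at wherever piggybank.jar actually lives.
register /path/to/piggybank.jar;

-- No "as (...)" clause: the loader is expected to supply the schema
-- (including f1:chararray) from the metadata stored alongside the data.
A = load '/path/to/some/data'
    using org.apache.pig.piggybank.storage.PigStorageSchema();

-- Same call as in my original mail, with a stand-in regex.
B = foreach A generate REGEX_EXTRACT(f1, '(\\d)', 1) as regex_extract;
dump B;
```

Running "describe A;" after the load should show whether f1 really comes back as chararray.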

Thanks for the help,
Michael 
  
On Jun 23, 2011, at 3:35 AM, Dmitriy Ryaboy wrote:

> Which version of pig? Are you using a special loader?
> I just tried with 0.8.1:
> 
> n = load 'tmp/numbers.txt' as (num:chararray);
> f = foreach n generate REGEX_EXTRACT($0, '(\\d)', 1);
> dump f;
> (1)
> (2)
> (3)
> (4)
> (5)
> 
> 
> -D
> 
> On Wed, Jun 22, 2011 at 3:59 PM, Michael May <[email protected]> wrote:
> 
>> Hello All,
>> 
>> I'm having an issue where I get a 'ClassCastException:
>> org.apache.pig.data.DataByteArray cannot be cast to java.lang.String' when
>> passing in something of type chararray to REGEX_EXTRACT.
>> 
>> e.g.
>> A = load '/path/to/some/data' ....
>> where A has a schema of something like ( f1:chararray, .... )
>> 
>> B = foreach A generate REGEX_EXTRACT( f1, <the regex>, 1 ) as
>> regex_extract;
>> 
>> This gives me the above error.
>> 
>> Now, the kicker is that if f1 is of type bytearray (i.e. the schema is
>> ( f1:bytearray, ..... )), this works as expected.
>> 
>> 
>> What gives? Am I using REGEX_EXTRACT wrong? Is this a bug?
>> My understanding is that chararray is supposed to be used for things that
>> are Strings, which is why I find the 'cannot cast to String' exception a bit
>> funky. I've looked through the REGEX_EXTRACT source and looked over the
>> Javadocs pertaining to DataTypes without being able to crack this.
>> 
>> Any help and information is appreciated!
>> Thanks for your time,
>> 
>> Michael
