The problem is the use of "," as the field delimiter.
PigStorage splits the whole record on the delimiter, so the commas inside the second field (the bag) split that field as well, and what is left of it cannot be converted to the declared bag type. If you use some other delimiter for your data (e.g., tab or ^A), it should work fine.
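
As a rough sketch of the change, assuming testdata.txt has been rewritten so that a single tab (rather than a comma) separates the key from the bag, while the commas inside the bag stay as they are, the load could look like the following. The schema is the one from your script; only the delimiter argument differs.

```pig
-- Assumption: testdata.txt now uses a tab between the key and the bag,
-- e.g. <key><TAB>{(customer,...),(customerClient,...),...}
-- Commas inside the bag are left untouched; they are no longer the field delimiter.
A = LOAD 'testdata.txt' USING PigStorage('\t') AS (key:chararray,
    columns:bag {column:tuple (name:chararray, value:chararray)});
DUMP A;
```

Tab is also PigStorage's default delimiter, so PigStorage() with no argument should behave the same way.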


Thanks,
Thejas

On 1/26/12 7:31 AM, Sandopolus wrote:
Hi there

I am trying to load some data using PigStorage with a schema, but I
can't seem to get the schema right and was hoping someone could point out
my mistake.

Here is the data being loaded in:
2ec00769-dc02-47dc-b2a5-35b6fb1d8e90,{(customer,27651a7d-0871-49a6-8df4-90305f7e840b),(customerClient,b57f9d15-6de7-486b-9761-46246be4abfe),(clientBuild,7376807c-7448-4785-8e2c-49814f6ce2f9),(country,FR)}

Commands used:
A = LOAD 'testdata.txt' USING PigStorage(',') as (key:chararray,
columns:bag {column:tuple (name:chararray, value:chararray)});
DUMP A;

This results in the following warning and output:
2012-01-26 15:27:51,860 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).

(2ec00769-dc02-47dc-b2a5-35b6fb1d8e90,)

From the output it doesn't seem to be picking up the bag structure, but if I
remove the schema it dumps the data out correctly.
Any help would be much appreciated.

Ta

Sandy

