Re: Seeing DataByteArray values for chararray field in 0.8.0

Thejas M Nair Fri, 21 Jan 2011 11:41:29 -0800


On 1/19/11 3:18 AM, "Kaluskar, Sanjay" <[email protected]> wrote:

> I have a script as follows:
> 
> 
> 
> register lookup.jar;
> 
> a = load 'lookupfile.dat' as(emp_id: chararray);
> 
> b = foreach a generate flatten(com.mycompany.pig.lookup());

The UDF in the above statement is called without an argument. I assume you meant:
"b = foreach a generate flatten(com.mycompany.pig.lookup(emp_id));"

> My UDF works as expected in versions 0.5.0, 0.6.0 and 0.7.0. In version
> 0.8.0, I notice that the input tuple "input" has 1 field with value of
> type DataByteArray, whereas in earlier versions the value is of type
> String (as expected). Why is this different? I am assuming this is an
> intentional change in 0.8.0. Is there some way to force conversion from
> the raw data before the UDF is invoked, i.e., the old behaviour? What is
> the recommended approach in 0.8.0 for EvalFunc UDFs?

The tuple should contain a field of type CHARARRAY in 0.8 as well. I looked at
the explain plan of a similar query and it seemed to be correct.
Can you please open a JIRA and attach a simplified form of your UDF that
reproduces this problem?
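
In the meantime, something like the sketch below may help narrow it down. It
reuses the names from your script and prints the declared schema and the plan,
so you can see where (or whether) the cast to chararray is applied. The explicit
(chararray) cast on the UDF argument is only a possible stopgap; I have not
confirmed that it changes what the UDF receives in 0.8.0.

```
register lookup.jar;

a = load 'lookupfile.dat' as (emp_id: chararray);

-- Print the declared schema of a; emp_id should be reported as chararray.
describe a;

-- Assumption: an explicit cast on the argument as a possible stopgap.
b = foreach a generate flatten(com.mycompany.pig.lookup((chararray)emp_id));

-- Print the logical, physical and MapReduce plans for b.
explain b;
```

If the plan from explain shows no cast ahead of the UDF call, please include
that output in the JIRA as well.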


Thanks,
Thejas
