I agree with Prashant. I am hard pressed to find a case where it would be
useful, and I would much rather it fail on parse than while running.
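
For reference, the flatten-and-cast route mentioned further down the thread might look roughly like the sketch below. It assumes each record also carries a key to regroup on; the `id` field and the relation names F, C, G and R are hypothetical and not part of the original script. Note that flatten emits no rows for a record whose bag is empty, and that the regrouping adds an extra group step, which is the inefficiency mentioned below.

```pig
-- Sketch only: 'id' is a hypothetical key field, not in the original script.
A = load 'input' as (id:chararray, b:bag{t:(x:chararray, y:int)});
-- Unnest the bag: one output row per tuple in b.
F = foreach A generate id, flatten(b);
-- x is now a scalar chararray, so the cast applies to the value, not the bag.
C = foreach F generate id, (int)x as x;
-- Regroup by the key and average the int column.
G = group C by id;
R = foreach G generate group as id, AVG(C.x) as avg_x;
```

If the input can simply be declared as x:int at load time, that remains the simpler option.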

2012/2/15 Prashant Kommireddi <[email protected]>

> AVG over chararrays is not a usual case, simply because it does not make
> sense in most cases. For example, what would the average be for a bag of
> first or last names? AVG would fail if it tried to convert a String to an
> Integer or Double.
>
> In your case it's best to declare it as int/long if you know the data type
> beforehand.
>
> Thanks,
> Prashant
>
> 2012/2/15 Haitao Yao <[email protected]>
>
> > I solved this problem by extending the built-in AVG function to accept a
> > chararray bag as input and calculate the result.
> >
> > Why can't the built-in AVG accept a chararray bag, convert the values to
> > double, and calculate the result?
> >
> >
> >
> > On 2012-2-15, at 4:04 PM, Jonathan Coveney wrote:
> >
> > > The issue is that doing (int)b.x does not cast each column to an int,
> > > but rather tries to cast the bag itself. Short of flattening out the
> > > bag and projecting it as an int, which is inefficient, I suppose you
> > > could make a UDF that calculates the average of chararrays by casting
> > > to an int... but then that raises the question of why you couldn't
> > > just load it as x:int in the first place.
> > >
> > > So generally, you need to do something like "foreach rel generate
> > > (int)x". In this case that doesn't work as efficiently, but this is
> > > kind of a weird case.
> > >
> > > 2012/2/14 Haitao Yao <[email protected]>
> > >
> > > >> Hi all,
> > > >>       here's my Pig script:
> > >>
> > >> A = load 'input' as (b:bag{t:(x:int, y:int)});
> > >> B = foreach A generate AVG(b.x);
> > >> describe B;
> > >>
> > > >> It works well.
> > > >> But if b.x is a chararray, problems arise:
> > >> A = load 'input' as (b:bag{t:(x:chararray, y:int)});
> > >> B = foreach A generate AVG((int)b.x);
> > >> 2012-02-15 14:17:17,937 [main] ERROR org.apache.pig.tools.grunt.Grunt
> -
> > >> ERROR 1052:
> > >> <line 4, column 28> Cannot cast bag with schema
> > :bag{:tuple(x:chararray)}
> > >> to int
> > >> Details at logfile: /tmp/pig_1329286634873.log
> > >>
> > >> Why?  How can I calculate the avg of b.x if b.x must be a chararray?
> > >>
> > >>
> > >> here's the running snapshot in Grunt:
> > >>
> > >> grunt> A = load 'input' as (b:bag{t:(x:int, y:int)});
> > >> grunt> B = foreach A generate AVG(b.x);
> > >> grunt> describe B;
> > >> B: {double}
> > >> grunt> A = load 'input' as (b:bag{t:(x:chararray, y:int)});
> > >> grunt> B = foreach A generate AVG((int)b.x);
> > >> 2012-02-15 14:17:17,937 [main] ERROR org.apache.pig.tools.grunt.Grunt
> -
> > >> ERROR 1052:
> > >> <line 4, column 28> Cannot cast bag with schema
> > :bag{:tuple(x:chararray)}
> > >> to int
> > >> Details at logfile: /tmp/pig_1329286634873.log
> > >> grunt>
> > >>
> > >> thanks.
> > >>
> > >>
> >
> >
>
