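I am not sure this is the cause, but Pig may not like it when you reuse an
alias in its own definition. That is to say, try assigning the result to a
new alias: rows2 = FOREACH rows GENERATE etc. The error text also suggests
Pig is reading the rows inside Set(...) as a scalar reference to the whole
relation, so passing the fields themselves (for example Set(*)) may be what
it actually wants.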
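
On the UDF itself, here is a minimal sketch of the de-duplication step in
plain Java, so it can be tried without Pig on the classpath. The class and
method names (TupleDedup, dedup) are made up for illustration and are not
part of Pig. It assumes you want to keep the first occurrence of each value
in its original position, which a LinkedHashSet does and a plain HashSet
does not guarantee.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper (not a Pig class): the core of a "unique values" UDF.
final class TupleDedup {
    private TupleDedup() {}

    // Returns the distinct values of fields, in first-seen order.
    // LinkedHashSet drops repeats while remembering insertion order.
    static <T> List<T> dedup(Collection<? extends T> fields) {
        return new ArrayList<T>(new LinkedHashSet<T>(fields));
    }
}
```

With the example from the thread, TupleDedup.dedup(Arrays.asList(5, 5, 4,
7, 2, 3, 4, 9)) gives [5, 4, 7, 2, 3, 9]. Inside exec() the same call could
presumably be applied to tuple.getAll(), with the resulting List handed to
TupleFactory.getInstance().newTuple(...).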

2011/4/3 Mark <static.void....@gmail.com>

> I have a simple EvalFunc as so:
>
> public class Set extends EvalFunc<Tuple> {
>  public Tuple exec(Tuple tuple) throws IOException {
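>    // Qualify java.util.Set: the bare name Set would resolve to this class.
>    java.util.Set<Object> unique = new HashSet<Object>(tuple.getAll());
>    // newTuple expects a List of fields, so copy the set into one.
>    return TupleFactory.getInstance().newTuple(new ArrayList<Object>(unique));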
>  }
> }
>
> How can I apply this to a result set though?  When I try:
>
> rows = LOAD 'foo';
> rows = FOREACH rows GENERATE com.mycompany.piggybank.Set(rows);
> 2011-04-03 09:16:25,423 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1000: Error during parsing. Scalars can be only used with projections
>
> I get the above error. Should I be using something other than an EvalFunc?
>
> Thanks
>
>
>
> On 4/3/11 8:53 AM, Bill Graham wrote:
>
>> You could add all the values to a set in a UDF and then return its
>> contents.
>>
>> On Sunday, April 3, 2011, Mark<static.void....@gmail.com>  wrote:
>>
>>> If I have a tuple of values, is there a way to eliminate duplicate values
>>> per tuple?
>>>
>>> Example:
>>> (5,5,4,7,2,3,4,9) = (5,4,7,2,3,9)
>>>
>>> Thanks
>>>
>>>
>>>
>>>
>>>
