I'm not sure this is the cause, but Pig may not like it when you reuse the result alias in its own definition. That is to say, try assigning to a new alias: rows2 = FOREACH rows GENERATE etc.
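Separately, on the UDF itself: a HashSet does not keep insertion order, so the values may not come back in the order shown in your example. Below is a minimal sketch of an order-preserving de-duplication in plain Java, with no Pig classes involved. The names TupleDedup and dedupe are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper, not part of Pig: removes duplicate values while
// keeping the first occurrence of each value in its original position.
public class TupleDedup {

    public static List<Object> dedupe(List<Object> values) {
        // LinkedHashSet drops repeats but iterates in insertion order,
        // unlike HashSet, whose iteration order is unspecified.
        return new ArrayList<Object>(new LinkedHashSet<Object>(values));
    }

    public static void main(String[] args) {
        List<Object> fields = Arrays.<Object>asList(5, 5, 4, 7, 2, 3, 4, 9);
        System.out.println(dedupe(fields)); // prints [5, 4, 7, 2, 3, 9]
    }
}
```

Inside exec() you would presumably pass tuple.getAll() to dedupe and hand the resulting List to TupleFactory.getInstance().newTuple(...). I have not tested that wiring against Pig.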
2011/4/3 Mark <static.void....@gmail.com>

> I have a simple EvalFunc as so:
>
> public class Set extends EvalFunc<Tuple> {
>     public Tuple exec(Tuple tuple) throws IOException {
>         Set<Object> unique = new HashSet<Object>();
>         unique.addAll(tuple.getAll());
>         return TupleFactory.getInstance().newTuple(unique);
>     }
> }
>
> How can I apply this to a result set though? When I try:
>
> rows = LOAD 'foo';
> rows = FOREACH rows GENERATE com.mycompany.piggybank.Set(rows);
> 2011-04-03 09:16:25,423 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1000: Error during parsing. Scalars can be only used with projections
>
> I get the above error? Should I be using something other than a EvalFunc?
>
> Thanks
>
> On 4/3/11 8:53 AM, Bill Graham wrote:
>
>> You could add all the values to a set in a udf and the return it's
>> contents.
>>
>> On Sunday, April 3, 2011, Mark<static.void....@gmail.com> wrote:
>>
>>> If I have a tuple of values, is there a way to eliminate duplicate values
>>> per tuple?
>>>
>>> Example:
>>> (5,5,4,7,2,3,4,9) = (5,4,7,2,3,9)
>>>
>>> Thanks