How many mappers and reducers do you have? Skimming the Rank code, it looks like it creates at least N counters per task, which would be a scalability bug.
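If the counter count really tracks the number of tasks, one possible mitigation (just a sketch, untested against your job, and assuming your input is many smallish files that PigStorage can combine) would be to shrink the map-task count by combining input splits before the rank:

-- hypothetical workaround, not verified:
-- fewer map tasks should mean fewer rank counters.
SET pig.splitCombination true;
SET pig.maxCombinedSplitSize 1073741824; -- 1 GB per combined split; tune to your data

a0 = load '<filename>' using PigStorage('\t','-schema');
a1 = rank a0;
a2 = foreach a1 generate col1 .. col16, rank_a0 as sequence_number;
a3 = order a2 by sequence_number;
store a3 into 'outputfile' using PigStorage('\t','-schema');

If the input is a handful of large files this won't buy you much, and since raising the counter limit (mapreduce.job.counters.max on Hadoop 2) is off the table per your policy, the real fix probably has to land in the Rank implementation itself.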
On Friday, April 5, 2013, Lauren Blau wrote:

> this is definitely caused by the RANK operator. Is there some way to reduce
> the number of counters generated by this operator when working with large
> data?
> thanks
>
> On Thu, Apr 4, 2013 at 7:01 PM, Lauren Blau <[email protected]> wrote:
>
> > I can think of only 2 things that have changed since this script last ran
> > successfully. Switched to using the range specification of the schema for
> > a2, and the input data has grown considerably.
> >
> > Lauren
> >
> > On Thu, Apr 4, 2013 at 7:00 PM, Lauren Blau <[email protected]> wrote:
> >
> >> no
> >>
> >> On Thu, Apr 4, 2013 at 4:54 PM, Dmitriy Ryaboy <[email protected]> wrote:
> >>
> >>> Do you have any special properties set?
> >>> Like the pig.udf.profile one maybe..
> >>> D
> >>>
> >>> On Thu, Apr 4, 2013 at 6:25 AM, Lauren Blau <[email protected]> wrote:
> >>>
> >>> > I'm running a simple script to add a sequence_number to a relation,
> >>> > sort the result, and store to a file:
> >>> >
> >>> > a0 = load '<filename>' using PigStorage('\t','-schema');
> >>> > a1 = rank a0;
> >>> > a2 = foreach a1 generate col1 .. col16, rank_a0 as sequence_number;
> >>> > a3 = order a2 by sequence_number;
> >>> > store a3 into 'outputfile' using PigStorage('\t','-schema');
> >>> >
> >>> > I get the following error:
> >>> > org.apache.hadoop.mapreduce.counters.LimitExceededException: Too many
> >>> > counters: 241 max=240
> >>> >   at org.apache.hadoop.mapreduce.counters.Limits.checkCounters(Limits.java:61)
> >>> >   at org.apache.hadoop.mapreduce.counters.Limits.incrCounters(Limits.java:68)
> >>> >   at org.apache.hadoop.mapreduce.counters.AbstractCounterGroup.readFields(AbstractCounterGroup.java:174)
> >>> >   at org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java:278)
> >>> >   at org.apache.hadoop.mapreduce.counters.AbstractCounters.readFields(AbstractCounters.java:303)
> >>> >   at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
> >>> >   at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
> >>> >   at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:951)
> >>> >   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:835)
> >>> >
> >>> > we aren't able to up our counters any higher (policy), and I don't
> >>> > understand why I should need so many counters for such a simple script
> >>> > anyway?
> >>> > running Apache Pig version 0.11.1-SNAPSHOT (r: unknown)
> >>> > compiled Mar 22 2013, 10:19:19
> >>> >
> >>> > Can someone help?
> >>> >
> >>> > Thanks,
> >>> > Lauren

--
Sent from Gmail Mobile
