Yup, the hash code was negative, and with this change it now seems to be working fine.
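For anyone who finds this thread later, here is a minimal standalone sketch of the fix. It is not the actual job code: it uses plain `String` keys instead of Hadoop's `Text`/`HCatRecord` so it runs without Hadoop on the classpath, and the class name `PartitionSketch`, its method names, and the second sample key are made up for illustration. One thing to watch: in Java `%` binds more tightly than `&`, so the mask needs its own parentheses.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; in the real job this logic would live in
// the getPartition(...) method of a Partitioner subclass.
class PartitionSketch {

    // The group key is everything before the first pipe in the composite key.
    static String groupKey(String compositeKey) {
        return compositeKey.split("\\|")[0];
    }

    // Original approach: the result is negative whenever hashCode() is negative,
    // which Hadoop rejects as an illegal partition.
    static int naivePartition(String compositeKey, int numParts) {
        return groupKey(compositeKey).hashCode() % numParts;
    }

    // Fixed approach: clear the sign bit first, then take the remainder.
    // The parentheses matter because % has higher precedence than &.
    static int safePartition(String compositeKey, int numParts) {
        return (groupKey(compositeKey).hashCode() & Integer.MAX_VALUE) % numParts;
    }

    public static void main(String[] args) {
        // The first key is from the stack trace; the second is an invented example.
        List<String> keys = Arrays.asList(
                "Atlanta:GA|Atlanta:GA:1:Adeel",
                "Dallas:TX|Dallas:TX:2:Sam");
        for (String key : keys) {
            System.out.println(key
                    + " -> naive=" + naivePartition(key, 4)
                    + ", safe=" + safePartition(key, 4));
        }
    }
}
```

Because only the group key is hashed, every record with the same group key lands on the same reducer regardless of its sort key, which is what secondary sort needs.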

On Fri, Aug 30, 2013 at 3:09 AM, Shekhar Sharma <[email protected]> wrote:

> Is the hash code of that key negative?
> Do something like this:
>
>     return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts;
>
> Regards,
> Som Shekhar Sharma
> +91-8197243810
>
>
> On Fri, Aug 30, 2013 at 6:25 AM, Adeel Qureshi <[email protected]> wrote:
> > Okay, so when I specify the number of reducers (in my example I'm using 4,
> > for a much smaller data set), it works if I use a single column in my
> > composite key. But if I add multiple columns to the composite key,
> > separated by a delimiter, it throws the illegal partition error. (Keys
> > before the pipe are group keys and keys after the pipe are sort keys; my
> > partitioner only uses the group keys.)
> >
> > java.io.IOException: Illegal partition for Atlanta:GA|Atlanta:GA:1:Adeel (-1)
> >     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1073)
> >     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
> >     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
> >     at com.att.hadoop.hivesort.HSMapper.map(HSMapper.java:39)
> >     at com.att.hadoop.hivesort.HSMapper.map(HSMapper.java:1)
> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:396)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
> >     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> >
> >
> > public int getPartition(Text key, HCatRecord record, int numParts) {
> >     // extract the group key from the composite key
> >     String groupKey = key.toString().split("\\|")[0];
> >     return groupKey.hashCode() % numParts;
> > }
> >
> >
> > On Thu, Aug 29, 2013 at 8:31 PM, Shekhar Sharma <[email protected]> wrote:
> >>
> >> No. The partitioner decides which keys go to which reducer; the
> >> number of reducers is something you need to decide. It depends on
> >> factors like the number of key-value pairs, the use case, etc.
> >> Regards,
> >> Som Shekhar Sharma
> >> +91-8197243810
> >>
> >>
> >> On Fri, Aug 30, 2013 at 5:54 AM, Adeel Qureshi <[email protected]> wrote:
> >> > So it can't figure out an appropriate number of reducers as it does for
> >> > mappers? In my case Hadoop is using 2100+ mappers and then only 1
> >> > reducer. Since I'm overriding the partitioner class, shouldn't that
> >> > decide how many reducers there should be, based on how many different
> >> > partition values are returned by the custom partitioner?
> >> >
> >> >
> >> > On Thu, Aug 29, 2013 at 7:38 PM, Ian Wrigley <[email protected]> wrote:
> >> >>
> >> >> If you don't specify the number of Reducers, Hadoop will use the
> >> >> default -- which, unless you've changed it, is 1.
> >> >>
> >> >> Regards
> >> >>
> >> >> Ian.
> >> >>
> >> >> On Aug 29, 2013, at 4:23 PM, Adeel Qureshi <[email protected]> wrote:
> >> >>
> >> >> I have implemented secondary sort in my MR job, and for some reason,
> >> >> if I don't specify the number of reducers, it uses 1. That doesn't
> >> >> seem right, because I'm working with 800M+ records and one reducer
> >> >> slows things down significantly. Is this some kind of limitation of
> >> >> secondary sort, that it has to use a single reducer? That would
> >> >> defeat the purpose of having a scalable solution such as secondary
> >> >> sort. I would appreciate any help.
> >> >>
> >> >> Thanks
> >> >> Adeel
> >> >>
> >> >>
> >> >> ---
> >> >> Ian Wrigley
> >> >> Sr. Curriculum Manager
> >> >> Cloudera, Inc
> >> >> Cell: (323) 819 4075
