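Is the hash code of that key negative? Try something like this:

return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts;

The parentheses matter: in Java, % binds tighter than &, so without them the expression is evaluated as hashCode() & (Integer.MAX_VALUE % numParts), which is not what you want.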
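
To make that concrete, here is a small standalone sketch of the same calculation. It assumes the pipe-delimited composite key from your message; the class name and the partitionFor helper are made up for illustration, and the Hadoop Partitioner class is left out so it runs on its own. It prints the partition with and without the mask for the key in your stack trace.

```java
public class GroupKeyPartitionSketch {

    // Hypothetical helper mirroring the body of getPartition(); not a Hadoop API.
    static int partitionFor(String compositeKey, int numParts) {
        // The group key is everything before the first pipe.
        String groupKey = compositeKey.split("\\|")[0];
        // Clear the sign bit first, then take the remainder, so the
        // result always falls in the range [0, numParts).
        return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts;
    }

    public static void main(String[] args) {
        String key = "Atlanta:GA|Atlanta:GA:1:Adeel";
        int numParts = 4;
        String groupKey = key.split("\\|")[0];

        // Plain remainder: negative whenever hashCode() is negative.
        System.out.println("unmasked: " + (groupKey.hashCode() % numParts));
        // Masked remainder: never negative.
        System.out.println("masked:   " + partitionFor(key, numParts));
    }
}
```

Math.abs() looks like a simpler alternative, but Math.abs(Integer.MIN_VALUE) is still negative, so the mask is the safer choice. Hadoop's own HashPartitioner uses the same mask-then-remainder form.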

Regards,
Som Shekhar Sharma
+91-8197243810

On Fri, Aug 30, 2013 at 6:25 AM, Adeel Qureshi <[email protected]> wrote:
> okay so when i specify the number of reducers e.g. in my example i m using 4
> (for a much smaller data set) it works if I use a single column in my
> composite key .. but if I add multiple columns in the composite key
> separated by a delimi .. it then throws the illegal partition error (keys
> before the pipe are group keys and after the pipe are the sort keys and my
> partioner only uses the group keys
>
> java.io.IOException: Illegal partition for Atlanta:GA|Atlanta:GA:1:Adeel (-1)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1073)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>     at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>     at com.att.hadoop.hivesort.HSMapper.map(HSMapper.java:39)
>     at com.att.hadoop.hivesort.HSMapper.map(HSMapper.java:1)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> public int getPartition(Text key, HCatRecord record, int numParts) {
>     //extract the group key from composite key
>     String groupKey = key.toString().split("\\|")[0];
>     return groupKey.hashCode() % numParts;
> }
>
> On Thu, Aug 29, 2013 at 8:31 PM, Shekhar Sharma <[email protected]> wrote:
>>
>> No...partitionr decides which keys should go to which reducer...and
>> number of reducers you need to decide...No of reducers depends on
>> factors like number of key value pair, use case etc
>> Regards,
>> Som Shekhar Sharma
>> +91-8197243810
>>
>> On Fri, Aug 30, 2013 at 5:54 AM, Adeel Qureshi <[email protected]> wrote:
>> > so it cant figure out an appropriate number of reducers as it does for
>> > mappers .. in my case hadoop is using 2100+ mappers and then only 1
>> > reducer
>> > .. since im overriding the partitioner class shouldnt that decide how
>> > manyredeucers there should be based on how many different partition
>> > values
>> > being returned by the custom partiotioner
>> >
>> > On Thu, Aug 29, 2013 at 7:38 PM, Ian Wrigley <[email protected]> wrote:
>> >>
>> >> If you don't specify the number of Reducers, Hadoop will use the
>> >> default
>> >> -- which, unless you've changed it, is 1.
>> >>
>> >> Regards
>> >>
>> >> Ian.
>> >>
>> >> On Aug 29, 2013, at 4:23 PM, Adeel Qureshi <[email protected]> wrote:
>> >>
>> >> I have implemented secondary sort in my MR job and for some reason if i
>> >> dont specify the number of reducers it uses 1 which doesnt seems right
>> >> because im working with 800M+ records and one reducer slows things down
>> >> significantly. Is this some kind of limitation with the secondary sort
>> >> that
>> >> it has to use a single reducer .. that kind of would defeat the purpose
>> >> of
>> >> having a scalable solution such as secondary sort. I would appreciate
>> >> any
>> >> help.
>> >>
>> >> Thanks
>> >> Adeel
>> >>
>> >> ---
>> >> Ian Wrigley
>> >> Sr. Curriculum Manager
>> >> Cloudera, Inc
>> >> Cell: (323) 819 4075
