So it looks like a List<Range> is what I need, so that I can have one mapper per range. However, a more interesting scenario is when, given one big range, I want to split it into multiple ranges. In other words, suppose my row IDs were 1_hello, 2_hello, ..., 9_hello, 10_hello and the range given was 2 to 5, but I want one mapper per integer, so 4 mappers in this case. Any ideas on how I can accomplish that?
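For what it's worth, here is a minimal sketch of one way the splitting might be done, assuming the integer bounds are known up front and every row ID has the form <integer>_<suffix>. The class and method names (RangeSplitter, rowPrefixes) are made up for illustration. Since Accumulo sorts rows lexicographically, "10_hello" sorts before "2_hello", so the sketch builds one row prefix per integer instead of relying on numeric order. The Accumulo calls appear only in comments and are assumptions to check against the version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitter {

    /**
     * Returns one row prefix per integer in [first, last], inclusive.
     * For example, (2, 5) yields "2_", "3_", "4_", "5_".
     * The trailing underscore keeps "1_" from also matching "10_hello".
     */
    public static List<String> rowPrefixes(int first, int last) {
        List<String> prefixes = new ArrayList<String>();
        for (int i = first; i <= last; i++) {
            prefixes.add(i + "_");
        }
        return prefixes;
    }

    public static void main(String[] args) {
        for (String prefix : rowPrefixes(2, 5)) {
            System.out.println(prefix);
        }
        // With Accumulo on the classpath, each prefix could become its own
        // Range (assumed usage, not verified against a specific release):
        //
        //   ArrayList<Range> ranges = new ArrayList<Range>();
        //   for (String prefix : rowPrefixes(2, 5)) {
        //       ranges.add(Range.prefix(prefix));
        //   }
        //   AccumuloInputFormat.setRanges(job, ranges);
        //
        // plus disableAutoAdjustRanges(), as suggested in the thread, so the
        // ranges are not adjusted to tablet boundaries. Its argument differs
        // between Accumulo versions, so it is left out here.
    }
}
```

For the 2 to 5 example this gives four prefixes, so four ranges and, with auto-adjust disabled, presumably four mappers.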
Thanks all for the suggestions.

On Fri, Mar 8, 2013 at 7:02 PM, Keith Turner wrote:

> On Fri, Mar 8, 2013 at 4:17 PM, Aji Janis wrote:
> > Thank you. Follow up question.
> >
> > Would this enforce one mapper per range even if all the data (from three
> > ranges) is on one node/tablet?
>
> Look at disableAutoAdjustRanges(). This determines whether it creates a
> mapper per tablet per range OR per range.
>
> >
> > On Fri, Mar 8, 2013 at 1:17 PM, Mike Hugo wrote:
> >>
> >> See AccumuloInputFormat
> >>
> >> ArrayList<Range> ranges = new ArrayList<Range>();
> >> // populate array list of row ranges ...
> >> AccumuloInputFormat.setRanges(job, ranges);
> >>
> >> You should get one mapper per range.
> >>
> >> On Fri, Mar 8, 2013 at 12:11 PM, Aji Janis wrote:
> >>>
> >>> Hello,
> >>>
> >>> I am trying to figure out how I can configure the number of mappers (if
> >>> it's even possible) based on an Accumulo row range. My Accumulo rowid
> >>> uses the format:
> >>>
> >>> abc/1
> >>> abc/2
> >>> ...
> >>> def/3
> >>> ....
> >>> xyz/13...
> >>>
> >>> If I want to specify three ranges: [abc/1 to abc/3], [def/1 to def 5],
> >>> [jkl/13 to klm 15], and have one mapper work on one range, is there a
> >>> way I can do this? How do I even set up my mapreduce job to accept these
> >>> ranges? Thank you for all feedback.
