If you are using TableInputFormat, the number of map tasks will equal the number of regions in your table. A region is the natural unit of work for a map task when HBase is the MR job source. If there is little data in your table, splitting this way makes little sense (you have only one region in your table, is that right?). You could force splits of your region to make more, via the UI or the shell. Otherwise, you need to write your own splitter, one that has some knowledge of the key space and can partition on something other than region boundaries.

See below...

On Wed, Jun 16, 2010 at 10:36 PM, Raghava Mutharaju <[email protected]> wrote:
> Hi all,
>
> I checked the size of the InputSplit in Map and it gave out 0. I was
> expecting some number indicating the size of split in bytes, that this Map
> has received. Is this normal behavior?
>

Where are you seeing this (so I can be sure I'm following along properly)?

St.Ack

> Another issue I am having is even though I set the mapred.map.tasks to a
> specific number (no of nodes*10), during execution, the no of map tasks is
> always 1. I think this is related to the above issue.
>
> I am using HBase as the data source and sink. Previously, when I used HDFS
> as data source, the no of map tasks were same as the one I used to set. I am
> using HBase 0.20.4
>
> Thank you.
>
> Regards,
> Raghava.
>
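
On the "make your own splitter" suggestion above, here is a rough sketch of the key-space part only: cutting one row-key range into n contiguous sub-ranges. The class KeyRangeSplitter and its boundaries method are hypothetical names invented for this illustration, not part of HBase. The sketch assumes both the start and end row keys are known and non-empty, and that keys are spread roughly evenly when compared as unsigned bytes. Wiring the resulting boundaries into the job's input format (so that each sub-range becomes its own split) is not shown.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not an HBase class: computes evenly spaced
// boundary keys between a known start key and a known end key.
public class KeyRangeSplitter {

    // Returns n + 1 boundary keys; sub-range i is [result[i], result[i + 1]).
    // If the range holds fewer than n distinct keys, boundaries may repeat.
    public static List<byte[]> boundaries(byte[] start, byte[] end, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // Right-pad the shorter key with zero bytes so both have the same width.
        int width = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, width));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, width));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start key must sort before end key");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[]> result = new ArrayList<>();
        for (int i = 0; i <= n; i++) {
            // Linear interpolation: lo + span * i / n
            BigInteger key = lo.add(
                span.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n)));
            result.add(toFixedWidth(key, width));
        }
        return result;
    }

    // Converts a non-negative value back to exactly `width` bytes,
    // dropping BigInteger's sign byte or left-padding with zeros as needed.
    private static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }
}
```

With a single region this would let you hand each map task one sub-range of the region's keys. Whether that helps depends on how evenly your real keys are distributed, which is why the splitter needs some knowledge of the key space.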
