Hi, As each row of my hbase table can take a lot of time to process (waiting on answeres from other hosts), I would like to create a few threads to process that data in parallel. I would then use the last call to the map function to wait for all threads to finish their job and only return the last call to the map function when everything is done and all threads exited.
How do I know when the last row is passed to my mapper function? (I'm extending TableMap for my mapper as also done in the wiki examples) I didn't find any function to check this. Another possibility would be to create more mapper jobs and let the hadoop framework do the processing in parallel. However I read somewhere that each mapper get's an entire region. In my case, the data in each row is very small, so each mapper could get millions of rows (with the default region/block size). What would you do? Thanks, Thibaut -- View this message in context: http://www.nabble.com/How-to-detect-when-the-mapper-is-called-the-last-time--tp20528861p20528861.html Sent from the HBase User mailing list archive at Nabble.com.
