Hey, yes, Hadoop does try to assign map tasks data-locally, but why would you be worried about this for 5 values? The maximum value size in HBase is Integer.MAX_VALUE bytes, so it's not like you have much data to shuffle. Once your blobs grow past ~64 MB or so, it may make more sense to store them in HDFS directly and keep only the metadata in HBase (including things like the location of the data blob).
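As a rough illustration, that split looks something like the following (just a sketch using the Java client API; the table name "blob_index", the column family "meta", and the paths are made-up placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class BlobStoreSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // 1. Write the large blob straight to HDFS.
        FileSystem fs = FileSystem.get(conf);
        Path blobPath = new Path("/blobs/item-42.bin");
        byte[] blob = new byte[128 * 1024 * 1024];  // stand-in for your >64MB payload
        FSDataOutputStream out = fs.create(blobPath);
        try {
          out.write(blob);
        } finally {
          out.close();
        }

        // 2. Keep only the small metadata (location, size, ...) in HBase.
        HTable table = new HTable(conf, "blob_index");
        Put put = new Put(Bytes.toBytes("item-42"));
        put.add(Bytes.toBytes("meta"), Bytes.toBytes("hdfs_path"),
            Bytes.toBytes(blobPath.toString()));
        put.add(Bytes.toBytes("meta"), Bytes.toBytes("size"),
            Bytes.toBytes((long) blob.length));
        table.put(put);
        table.close();
      }
    }

A MapReduce job can then scan "blob_index" and open the referenced HDFS files itself, so only small cells ever move through HBase.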
I think people are confused about how optimal MapReduce jobs have to be. Keeping all the data super-local on each machine does not always help you, since you have to read via a socket anyway. Going remote doesn't actually make things that much slower, since on a modern LAN ping times are < 0.1 ms. If your entire cluster is hanging off a single switch, there is nearly unlimited bandwidth between all nodes (certainly much higher than any single system could push). Only once you go multi-switch does switch locality (aka rack locality) become important.

Remember, Hadoop isn't about the instantaneous speed of any one job, but about running jobs in a highly scalable manner that works on anywhere from tens to tens of thousands of nodes. You end up blocking on single-machine limits anyway, and the 3x replication (r=3) of HDFS helps you transcend a single machine's read speed for large files. Keeping the data transfer local in this case results in lower performance. If you want max local speed, I suggest looking at CUDA.

On Thu, Aug 20, 2009 at 9:09 PM, bharath vissapragada <[email protected]> wrote:
> Amandeep, Gray and Purtell, thanks for your replies. I have found them
> very useful.
>
> You said to increase the number of reduce tasks. Suppose the number of
> reduce tasks is more than the number of distinct map output keys; would some of
> the reduce processes then go to waste? Is that the case?
>
> Also I have one more doubt. I have 5 values for a corresponding key on one
> region and the other 2 values on 2 different region servers.
> Does Hadoop MapReduce take care of moving those 2 values to the region
> with the 5 values, instead of moving the 5 values to the other system, to
> minimize the data flow? Is this what is happening inside?
>
> On Fri, Aug 21, 2009 at 9:03 AM, Andrew Purtell <[email protected]> wrote:
>
>> The behavior of TableInputFormat is to schedule one mapper for every table
>> region.
>>
>> In addition to what others have said already, if your reducer is doing
>> little more than storing data back into HBase (via TableOutputFormat), then
>> you can consider writing results back to HBase directly from the mapper to
>> avoid incurring the overhead of sort/shuffle/merge which happens within the
>> Hadoop job framework as map outputs are input into reducers. For that type
>> of use case -- using the Hadoop mapreduce subsystem as essentially a grid
>> scheduler -- something like job.setNumReducers(0) will do the trick.
>>
>> Best regards,
>>
>>    - Andy
>>
>> ________________________________
>> From: john smith <[email protected]>
>> To: [email protected]
>> Sent: Friday, August 21, 2009 12:42:36 AM
>> Subject: Doubt in HBase
>>
>> Hi all,
>>
>> I have one small doubt. Kindly answer it even if it sounds silly.
>>
>> I am using MapReduce in HBase in distributed mode. I have a table which
>> spans across 5 region servers. I am using TableInputFormat to read the
>> data from the tables in the map. When I run the program, by default how
>> many map tasks are created? Is it one per region server or more?
>>
>> Also, after the map task is over, the reduce task is taking a bit more
>> time. Is it due to moving the map output across the region servers, i.e.
>> moving the values of the same key to a particular reduce phase to start
>> the reducer? Is there any way I can optimize the code (e.g. by storing
>> data of the same reducer nearby)?
>>
>> Thanks :)
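P.S. For anyone following along, here is roughly what Andy's reduce-free pattern looks like with the org.apache.hadoop.hbase.mapreduce API. This is only a sketch: "source_table", "target_table", the column families, and the transform in the mapper are placeholders, and note the actual Job method is setNumReduceTasks(0).

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.mapreduce.Job;

    public class MapOnlyHBaseJob {

      // One mapper per region; each writes its results straight back to HBase.
      public static class MyMapper extends TableMapper<ImmutableBytesWritable, Put> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context)
            throws IOException, InterruptedException {
          // Placeholder transform: copy one column into a "derived" family.
          byte[] v = value.getValue(Bytes.toBytes("cf"), Bytes.toBytes("col"));
          if (v != null) {
            Put put = new Put(row.get());
            put.add(Bytes.toBytes("derived"), Bytes.toBytes("col"), v);
            context.write(row, put);
          }
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "map-only-hbase");
        job.setJarByClass(MapOnlyHBaseJob.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // fewer RPCs per mapper
        scan.setCacheBlocks(false);  // don't pollute the block cache from MR scans

        // One map task per region of "source_table".
        TableMapReduceUtil.initTableMapperJob("source_table", scan,
            MyMapper.class, ImmutableBytesWritable.class, Put.class, job);
        // Null reducer: map output goes straight to "target_table" via TableOutputFormat.
        TableMapReduceUtil.initTableReducerJob("target_table", null, job);
        job.setNumReduceTasks(0);    // skip sort/shuffle/merge entirely

        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }

Because there is no reducer, the map output never goes through sort/shuffle/merge; each Put is sent directly to the region server hosting the target row.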
