JG, in one of your earlier replies you said that data locality was not considered in older versions of HBase. Has there been any development on this in 0.20 RC1/2 or 0.19.x? If not, can you tell me where that patch is available so that I can test my programs?
Thanks in advance.

On Sat, Aug 22, 2009 at 12:12 AM, Jonathan Gray <[email protected]> wrote:
> I really couldn't be specific.
>
> The more data that has to be moved across the wire, the more network i/o.
>
> For example, if you have very large values, and a very large table, and you
> have that as the input to your MR, you could potentially be network i/o
> bound.
>
> It should be very easy to test how your own jobs run on your own cluster
> using Ganglia and hadoop/mr logging/output.
>
> bharath vissapragada wrote:
>
>> JG,
>>
>> Can you please elaborate on the last statement "for some" by giving an
>> example, or some kind of scenario in which it can take place where MR jobs
>> involve huge amounts of data?
>>
>> Thanks.
>>
>> On Fri, Aug 21, 2009 at 11:24 PM, Jonathan Gray <[email protected]> wrote:
>>
>>> Ryan,
>>>
>>> In older versions of HBase, when we did not attempt any data locality, we
>>> had a few users running jobs that became network i/o bound. It wasn't a
>>> latency issue, it was a bandwidth issue.
>>>
>>> That's actually when/why an attempt at better data locality for HBase MR
>>> was made in the first place.
>>>
>>> I hadn't personally experienced it, but I recall two users who had. After
>>> they made a first-stab patch, I ran some comparisons and noticed a
>>> significant reduction in network i/o for data-intensive MR jobs. They
>>> also were no longer network i/o bound on their jobs, if I recall, and
>>> became disk i/o bound (as one would expect/hope).
>>>
>>> For a majority of use cases, it doesn't matter in a significant way at
>>> all. But I have seen it make a measurable difference for some.
>>>
>>> JG
>>>
>>> bharath vissapragada wrote:
>>>
>>>> Thanks Ryan,
>>>>
>>>> I was just explaining with an example.
>>>> I have TBs of data to work with. I just wanted to know whether the
>>>> scheduler TRIES to assign the reduce phase so as to keep the data local
>>>> (i.e., TRYING to assign it to the machine with the greater number of key
>>>> values). I was just explaining it with an example.
>>>>
>>>> Thanks for your reply (following you on twitter :))
>>>>
>>>> On Fri, Aug 21, 2009 at 12:13 PM, Ryan Rawson <[email protected]> wrote:
>>>>
>>>>> hey,
>>>>>
>>>>> Yes the hadoop system attempts to assign map tasks to data local, but
>>>>> why would you be worried about this for 5 values? The max value size
>>>>> in hbase is Integer.MAX_VALUE, so it's not like you have much data to
>>>>> shuffle. Once your blobs > ~64mb or so, it might make more sense to
>>>>> use HDFS directly and keep only the metadata in hbase (including
>>>>> things like the location of the data blob).
>>>>>
>>>>> I think people are confused about how optimal map reduces have to be.
>>>>> Keeping all the data super-local on each machine is not always helping
>>>>> you, since you have to read via a socket anyway. Going remote doesn't
>>>>> actually make things that much slower, since on a modern lan ping
>>>>> times are < 0.1ms. If your entire cluster is hanging off a single
>>>>> switch, there is nearly unlimited bandwidth between all nodes
>>>>> (certainly much higher than any single system could push). Only once
>>>>> you go multi-switch does switch-locality (aka rack locality) become
>>>>> important.
>>>>>
>>>>> Remember, hadoop isn't about the instantaneous speed of any job, but
>>>>> about running jobs in a highly scalable manner that works on tens or
>>>>> tens of thousands of nodes. You end up blocking on single machine
>>>>> limits anyway, and the r=3 of HDFS helps you transcend a single
>>>>> machine's read speed for large files. Keeping the data transfer local
>>>>> in this case results in lower performance.
>>>>>
>>>>> If you want max local speed, I suggest looking at CUDA.
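[Editor's note: Ryan's suggestion above, keep large blobs in HDFS and store only a pointer in HBase, can be sketched roughly as below. This is a hedged sketch against the 0.20-era client API, not a definitive implementation: the table name "docs", family "meta", qualifier "blob_path", and the "/blobs/" path layout are all made up for illustration, and error handling is omitted.]

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BlobStore {
    // Write the blob itself into HDFS (replicated, streamed), then record
    // only its path in HBase so the table stays small and scan-friendly.
    public static void store(Configuration conf, String rowKey, byte[] blob)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path blobPath = new Path("/blobs/" + rowKey);  // hypothetical layout
        FSDataOutputStream out = fs.create(blobPath);
        out.write(blob);
        out.close();

        // "docs" / "meta:blob_path" are illustrative names, not HBase defaults.
        HTable table = new HTable(new HBaseConfiguration(conf), "docs");
        Put put = new Put(Bytes.toBytes(rowKey));
        put.add(Bytes.toBytes("meta"), Bytes.toBytes("blob_path"),
                Bytes.toBytes(blobPath.toString()));
        table.put(put);
    }
}
```

A reader would then Get the row, read `meta:blob_path`, and open that file from HDFS directly, so the >64MB payload never flows through a region server.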
>>>>>
>>>>> On Thu, Aug 20, 2009 at 9:09 PM, bharath vissapragada <[email protected]> wrote:
>>>>>
>>>>>> Aamandeep, Gray and Purtell, thanks for your replies. I have found
>>>>>> them very useful.
>>>>>>
>>>>>> You said to increase the number of reduce tasks. Suppose the number
>>>>>> of reduce tasks is more than the number of distinct map output keys;
>>>>>> will some of the reduce processes go to waste? Is that the case?
>>>>>>
>>>>>> Also I have one more doubt. I have 5 values for a corresponding key
>>>>>> on one region and the other 2 values on 2 different region servers.
>>>>>> Does hadoop Map Reduce take care of moving these 2 diff values to the
>>>>>> region with 5 values, instead of moving those 5 values to the other
>>>>>> system, to minimize the dataflow? Is this what is happening inside?
>>>>>>
>>>>>> On Fri, Aug 21, 2009 at 9:03 AM, Andrew Purtell <[email protected]> wrote:
>>>>>>
>>>>>>> The behavior of TableInputFormat is to schedule one mapper for every
>>>>>>> table region.
>>>>>>>
>>>>>>> In addition to what others have said already, if your reducer is
>>>>>>> doing little more than storing data back into HBase (via
>>>>>>> TableOutputFormat), then you can consider writing results back to
>>>>>>> HBase directly from the mapper to avoid incurring the overhead of
>>>>>>> sort/shuffle/merge which happens within the Hadoop job framework as
>>>>>>> map outputs are input into reducers. For that type of use case --
>>>>>>> using the Hadoop mapreduce subsystem as essentially a grid
>>>>>>> scheduler -- something like job.setNumReducers(0) will do the trick.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>    - Andy
>>>>>>>
>>>>>>> ________________________________
>>>>>>> From: john smith <[email protected]>
>>>>>>> To: [email protected]
>>>>>>> Sent: Friday, August 21, 2009 12:42:36 AM
>>>>>>> Subject: Doubt in HBase
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have one small doubt. Kindly answer it even if it sounds silly.
>>>>>>>
>>>>>>> I am using Map Reduce in HBase in distributed mode. I have a table
>>>>>>> which spans across 5 region servers. I am using TableInputFormat to
>>>>>>> read the data from the tables in the map. When I run the program, by
>>>>>>> default how many map tasks are created? Is it one per region server
>>>>>>> or more?
>>>>>>>
>>>>>>> Also, after the map task is over, the reduce task is taking a bit
>>>>>>> more time. Is it due to moving the map output across the
>>>>>>> regionservers, i.e., moving the values of the same key to a
>>>>>>> particular reduce phase to start the reducer? Is there any way I can
>>>>>>> optimize the code (e.g. by storing data of the same reducer nearby)?
>>>>>>>
>>>>>>> Thanks :)
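[Editor's note: Andy's map-only pattern above can be sketched as follows, assuming the 0.20 `org.apache.hadoop.hbase.mapreduce` API, where the driver-side call is `Job.setNumReduceTasks(0)`. This is a hedged sketch: the "results" table and the "f"/"q" family/qualifier are made-up names, and the mapper simply copies the first cell of each input row.]

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;

// Mapper that writes its output straight back to HBase, so the job can run
// with zero reducers and skip sort/shuffle/merge entirely: mapreduce is used
// purely as a grid scheduler, as Andy describes.
public class WriteBackMapper extends TableMapper<ImmutableBytesWritable, Put> {
    private HTable out;

    @Override
    protected void setup(Context ctx) throws IOException {
        out = new HTable(new HBaseConfiguration(), "results"); // hypothetical table
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
            throws IOException {
        Put p = new Put(row.get());
        // Copy the row's first cell value into a made-up family/qualifier.
        p.add(Bytes.toBytes("f"), Bytes.toBytes("q"), value.value());
        out.put(p);  // write directly, bypassing the reduce phase
    }

    @Override
    protected void cleanup(Context ctx) throws IOException {
        out.close();  // flush any buffered puts
    }
}
```

In the driver, after `TableMapReduceUtil.initTableMapperJob(...)`, the key line is `job.setNumReduceTasks(0);` so that map output is never sorted or shuffled.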
