JG, in one of your earlier replies you said that data locality was not considered in older versions of HBase. Has there been any development on this in 0.20 RC1/2 or 0.19.x? If not, can you tell me where that patch is available so that I can test my programs?
Thanks in advance.

On Sat, Aug 22, 2009 at 12:12 AM, Jonathan Gray <[email protected]> wrote:
> I really couldn't be specific.
>
> The more data that has to be moved across the wire, the more network i/o.
>
> For example, if you have very large values, and a very large table, and you
> have that as the input to your MR, you could potentially be network i/o
> bound.
>
> It should be very easy to test how your own jobs run on your own cluster
> using Ganglia and hadoop/mr logging/output.
>
> bharath vissapragada wrote:
>
>> JG,
>>
>> Can you please elaborate on the last statement "for some" by giving an
>> example, or some kind of scenario in which it can take place where MR jobs
>> involve huge amounts of data?
>>
>> Thanks.
>>
>> On Fri, Aug 21, 2009 at 11:24 PM, Jonathan Gray <[email protected]> wrote:
>>
>>> Ryan,
>>>
>>> In older versions of HBase, when we did not attempt any data locality, we
>>> had a few users running jobs that became network i/o bound. It wasn't a
>>> latency issue, it was a bandwidth issue.
>>>
>>> That's actually when/why an attempt at better data locality for HBase MR
>>> was made in the first place.
>>>
>>> I hadn't personally experienced it, but I recall two users who had. After
>>> they made a first-stab patch, I ran some comparisons and noticed a
>>> significant reduction in network i/o for data-intensive MR jobs. They
>>> also were no longer network i/o bound on their jobs, if I recall, and
>>> became disk i/o bound (as one would expect/hope).
>>>
>>> For a majority of use cases, it doesn't matter in a significant way at
>>> all. But I have seen it make a measurable difference for some.
>>>
>>> JG
>>>
>>> bharath vissapragada wrote:
>>>
>>>> Thanks Ryan,
>>>>
>>>> I was just explaining with an example.
>>>> I have TBs of data to work with. I just wanted to know whether the
>>>> scheduler TRIES to assign the reduce phase so as to keep the data local
>>>> (i.e., TRYING to assign it to the machine with the greater number of key
>>>> values). I was just explaining it with an example.
>>>>
>>>> Thanks for your reply (following you on twitter :))
>>>>
>>>> On Fri, Aug 21, 2009 at 12:13 PM, Ryan Rawson <[email protected]> wrote:
>>>>
>>>>> hey,
>>>>>
>>>>> Yes the hadoop system attempts to assign map tasks to data local, but
>>>>> why would you be worried about this for 5 values? The max value size
>>>>> in hbase is Integer.MAX_VALUE, so it's not like you have much data to
>>>>> shuffle. Once your blobs > ~64mb or so, it might make more sense to
>>>>> use HDFS directly and keep only the metadata in hbase (including
>>>>> things like the location of the data blob).
>>>>>
>>>>> I think people are confused about how optimal map reduces have to be.
>>>>> Keeping all the data super-local on each machine is not always helping
>>>>> you, since you have to read via a socket anyway. Going remote doesn't
>>>>> actually make things that much slower, since on a modern lan ping
>>>>> times are < 0.1ms. If your entire cluster is hanging off a single
>>>>> switch, there is nearly unlimited bandwidth between all nodes
>>>>> (certainly much higher than any single system could push). Only once
>>>>> you go multi-switch does switch-locality (aka rack locality) become
>>>>> important.
>>>>>
>>>>> Remember, hadoop isn't about the instantaneous speed of any job, but
>>>>> about running jobs in a highly scalable manner that works on tens or
>>>>> tens of thousands of nodes. You end up blocking on single machine
>>>>> limits anyway, and the r=3 of HDFS helps you transcend a single
>>>>> machine's read speed for large files. Keeping the data transfer local
>>>>> in this case results in lower performance.
>>>>>
>>>>> If you want max local speed, I suggest looking at CUDA.
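[Editor's note: Ryan's suggestion above, keep large blobs in HDFS and store only a pointer in HBase, can be sketched roughly as below. This is a hedged sketch against the 0.20-era client API, not a definitive implementation: the table name "docs", family "meta", qualifier "blob_path", and the "/blobs/" path layout are all made up for illustration, and error handling is omitted.]

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BlobStore {
    // Write the blob itself into HDFS (replicated, streamed), then record
    // only its path in HBase so the table stays small and scan-friendly.
    public static void store(Configuration conf, String rowKey, byte[] blob)
            throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path blobPath = new Path("/blobs/" + rowKey);  // hypothetical layout
        FSDataOutputStream out = fs.create(blobPath);
        out.write(blob);
        out.close();

        // "docs" / "meta:blob_path" are illustrative names, not HBase defaults.
        HTable table = new HTable(new HBaseConfiguration(conf), "docs");
        Put put = new Put(Bytes.toBytes(rowKey));
        put.add(Bytes.toBytes("meta"), Bytes.toBytes("blob_path"),
                Bytes.toBytes(blobPath.toString()));
        table.put(put);
    }
}
```

A reader would then Get the row, read `meta:blob_path`, and open that file from HDFS directly, so the >64MB payload never flows through a region server.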
>>>>>
>>>>> On Thu, Aug 20, 2009 at 9:09 PM, bharath vissapragada <[email protected]> wrote:
>>>>>
>>>>>> Aamandeep, Gray and Purtell, thanks for your replies. I have found
>>>>>> them very useful.
>>>>>>
>>>>>> You said to increase the number of reduce tasks. Suppose the number
>>>>>> of reduce tasks is more than the number of distinct map output keys;
>>>>>> will some of the reduce processes go to waste? Is that the case?
>>>>>>
>>>>>> Also I have one more doubt. I have 5 values for a corresponding key
>>>>>> on one region and the other 2 values on 2 different region servers.
>>>>>> Does hadoop Map Reduce take care of moving these 2 diff values to the
>>>>>> region with 5 values, instead of moving those 5 values to the other
>>>>>> system, to minimize the dataflow? Is this what is happening inside?
>>>>>>
>>>>>> On Fri, Aug 21, 2009 at 9:03 AM, Andrew Purtell <[email protected]> wrote:
>>>>>>
>>>>>>> The behavior of TableInputFormat is to schedule one mapper for every
>>>>>>> table region.
>>>>>>>
>>>>>>> In addition to what others have said already, if your reducer is
>>>>>>> doing little more than storing data back into HBase (via
>>>>>>> TableOutputFormat), then you can consider writing results back to
>>>>>>> HBase directly from the mapper to avoid incurring the overhead of
>>>>>>> sort/shuffle/merge which happens within the Hadoop job framework as
>>>>>>> map outputs are input into reducers. For that type of use case --
>>>>>>> using the Hadoop mapreduce subsystem as essentially a grid
>>>>>>> scheduler -- something like job.setNumReducers(0) will do the trick.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>    - Andy
>>>>>>>
>>>>>>> ________________________________
>>>>>>> From: john smith <[email protected]>
>>>>>>> To: [email protected]
>>>>>>> Sent: Friday, August 21, 2009 12:42:36 AM
>>>>>>> Subject: Doubt in HBase
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have one small doubt. Kindly answer it even if it sounds silly.
>>>>>>>
>>>>>>> I am using Map Reduce in HBase in distributed mode. I have a table
>>>>>>> which spans across 5 region servers. I am using TableInputFormat to
>>>>>>> read the data from the tables in the map. When I run the program, by
>>>>>>> default how many map tasks are created? Is it one per region server
>>>>>>> or more?
>>>>>>>
>>>>>>> Also, after the map task is over, the reduce task is taking a bit
>>>>>>> more time. Is it due to moving the map output across the
>>>>>>> regionservers, i.e., moving the values of the same key to a
>>>>>>> particular reduce phase to start the reducer? Is there any way I can
>>>>>>> optimize the code (e.g. by storing data of the same reducer nearby)?
>>>>>>>
>>>>>>> Thanks :)
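[Editor's note: Andy's map-only pattern above can be sketched as follows, assuming the 0.20 `org.apache.hadoop.hbase.mapreduce` API, where the driver-side call is `Job.setNumReduceTasks(0)`. This is a hedged sketch: the "results" table and the "f"/"q" family/qualifier are made-up names, and the mapper simply copies the first cell of each input row.]

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;

// Mapper that writes its output straight back to HBase, so the job can run
// with zero reducers and skip sort/shuffle/merge entirely: mapreduce is used
// purely as a grid scheduler, as Andy describes.
public class WriteBackMapper extends TableMapper<ImmutableBytesWritable, Put> {
    private HTable out;

    @Override
    protected void setup(Context ctx) throws IOException {
        out = new HTable(new HBaseConfiguration(), "results"); // hypothetical table
    }

    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx)
            throws IOException {
        Put p = new Put(row.get());
        // Copy the row's first cell value into a made-up family/qualifier.
        p.add(Bytes.toBytes("f"), Bytes.toBytes("q"), value.value());
        out.put(p);  // write directly, bypassing the reduce phase
    }

    @Override
    protected void cleanup(Context ctx) throws IOException {
        out.close();  // flush any buffered puts
    }
}
```

In the driver, after `TableMapReduceUtil.initTableMapperJob(...)`, the key line is `job.setNumReduceTasks(0);` so that map output is never sorted or shuffled.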
