Re: Occasional regionserver crashes following socket errors writing to HDFS

Michael Segel Thu, 10 May 2012 12:51:17 -0700

Eran,

see my response inline...


On May 10, 2012, at 2:17 PM, Eran Kutner wrote:

> Michael, I appreciate the feedback, but I'd have to disagree.
> In my case, for example, I need to look at a complete set of data produced
> by the map phase in order to make a decision and write it to HBase. So sure,
> I could write all the mappers' output to HBase, then have another map-only
> job scan the output of the previous one, do the calculation, then write
> the output to another table. I don't really see why that would be better
> than using a reducer.

You disagree without actually benchmarking the two? 
That's pretty bold. :-) 

Two things. 
First, reducers are expensive.
Second, writing sorted records into HBase is also more expensive than 
writing records in random order. 

Here's a caveat. I don't know what you're attempting to do, so I can only say 
in general, I've found it faster to write 2 mappers and avoid using reducers.
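
To make the shape of the two-mapper approach concrete, here is a toy, in-memory 
sketch. It is not HBase code: plain dicts stand in for the intermediate and 
final tables, the function names are hypothetical, and sum() is only a 
placeholder for whatever per-group decision the real job makes. It also assumes 
all rows for a group are seen by the same second-pass task, which a real job 
split across regions would have to guarantee.

```python
from itertools import groupby


def map_pass_one(records, intermediate):
    """First map-only pass: store each mapper output row in the intermediate table.

    The row key starts with the grouping key, so rows of one group sort together.
    """
    for i, (group, value) in enumerate(records):
        intermediate[(group, i)] = value


def map_pass_two(intermediate, final):
    """Second map-only pass: read rows in key order and write one result per group."""
    rows = sorted(intermediate.items())  # stands in for a key-ordered table scan
    for group, cells in groupby(rows, key=lambda kv: kv[0][0]):
        # Placeholder decision: sum the values seen for this group.
        final[group] = sum(value for _, value in cells)


records = [("a", 1), ("b", 5), ("a", 2), ("b", 1)]
intermediate, final = {}, {}
map_pass_one(records, intermediate)
map_pass_two(intermediate, final)
print(final)  # {'a': 3, 'b': 6}
```

Whether this beats a reducer for a given job is something only a benchmark 
will tell you.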

> As for the other tips, I agree the files are too large, so I increased the
> file size, but I don't really see why that is relevant to the error we're
> talking about. Why would having many regions cause timeouts on HDFS?
> I do have MSLABs configured and GC tuning in place.
> I do run multiple reducers; I suspect that's aggravating the problem, not
> helping it.
> As far as I can tell, dfs.balance.bandwidthPerSec is relevant only for
> balancing done with the balancer, not for the initial replication.
> 
> 
With respect to the number of regions... you'd probably get a better answer 
from St.Ack or JD.

With respect to the bandwidth issue... 
We set it higher, to something like 10% of the available pipe. Not that it's 
going to be used all the time, but the smaller the pipe, the longer it takes to 
copy a file from one node to another.
How much of an impact it has on your performance... Not sure. But it's always 
something to check and think about.  
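
As a rough worked example of the 10% figure: the property takes a value in 
bytes per second, so the arithmetic looks something like the sketch below. The 
1 Gbit/s link speed and the helper name are assumptions for illustration only; 
plug in whatever your network actually provides.

```python
def balance_bandwidth(link_bits_per_sec, percent=10):
    """Return the given percentage of a link's capacity, in bytes per second."""
    bytes_per_sec = link_bits_per_sec // 8  # bits -> bytes
    return bytes_per_sec * percent // 100


GIGABIT = 1_000_000_000  # assumed NIC speed: 1 Gbit/s
value = balance_bandwidth(GIGABIT)
print("dfs.balance.bandwidthPerSec =", value)  # 12500000
```

For comparison, the stock default is, as far as I recall, 1048576 (1 MB/s).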

BTW, I did a quick read on your problem. You didn't say which release/version 
of HBase you were running....
 

> -eran
> 
> 
> 
> On Thu, May 10, 2012 at 9:59 PM, Michael Segel 
> <michael_se...@hotmail.com>wrote:
> 
>> Sigh.
>> 
>> Dave,
>> I really think you need to think more about the problem.
>> 
>> Think about what a reduce does and then think about what happens inside
>> of HBase.
>> 
>> Then think about which runs faster... a job with two mappers writing the
>> intermediate and final results in HBase,
>> or a M/R job that writes its output to HBase.
>> 
>> If you really truly think about the problem, you will start to understand
>> why I say you really don't want to use a reducer when you're working with
>> HBase.
>> 
>> 
>> On May 10, 2012, at 1:41 PM, Dave Revell wrote:
>> 
>>> Some examples of when you'd want a reducer:
>>> http://static.usenix.org/event/osdi04/tech/full_papers/dean/dean.pdf
>>> 
>>> On Thu, May 10, 2012 at 11:30 AM, Michael Segel
>>> <michael_se...@hotmail.com>wrote:
>>> 
>>>> Dave, do you really want to go there?
>>>> 
>>>> OP has a couple of issues and he was going down a rabbit hole.
>>>> (You can choose if that's a reference to The Matrix, Jefferson
>>>> Starship, Alice in Wonderland... or all of the above)
>>>> 
>>>> So to put him on the correct path, I recommended the following, not in
>>>> any order...
>>>> 
>>>> 1) Increase his region size for this table only.
>>>> 2) Look to decreasing the number of regions managed by a RS (which is
>>>> why you increase region size)
>>>> 3) Up the dfs.balance.bandwidthPerSec. (How often does HBase move
>>>> regions and how exactly do they move regions?)
>>>> 4) Look at implementing MSLABS and GC tuning. This cuts down on the
>>>> overhead.
>>>> 5) Refactoring his job....
>>>> 
>>>> Oops.
>>>> Ok I didn't put that in the list.
>>>> But that was the last thing I wrote as a separate statement.
>>>> Clearly you didn't take my advice and think about the problem....
>>>> 
>>>> To prove a point.... you wrote:
>>>> 'Many mapreduce algorithms require a reduce phase (e.g. sorting)'
>>>> 
>>>> Ok. So tell me why you would want to sort your input into HBase and if
>>>> that's really a good thing?
>>>> Oops!... :-)
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On May 10, 2012, at 12:31 PM, Dave Revell wrote:
>>>>> This "you don't need a reducer" conversation is distracting from the
>>>>> real problem and is false.
>>>>> 
>>>>> Many mapreduce algorithms require a reduce phase (e.g. sorting). The
>>>>> fact that the output is written to HBase or somewhere else is irrelevant.
>>>>> 
>>>>> -Dave
>>>>> 
>>>>> On Thu, May 10, 2012 at 6:26 AM, Michael Segel
>>>>> <michael_se...@hotmail.com> wrote:
>>>>> [SNIP]
>>>> 
>>>> 
>> 
>>
