Unless you have weird OS settings, everything should be fine. Like I said in my first email, make sure you don't max out your number of processes (it's a ulimit config).
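For reference, the ulimit check mentioned above can be sketched like this (the DataNode PID line is a commented example; substitute your own PID):

```shell
# Show the max-user-processes limit (nproc) for the current user.
# On Linux, every JVM thread counts against this limit, and each
# xciever in the DataNode is a thread, so with thousands of xcievers
# this is the limit to watch.
ulimit -u

# To see how close a running DataNode is, inspect its live thread count:
#   grep Threads /proc/<datanode_pid>/status
```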
J-D

On Tue, Sep 27, 2011 at 11:41 PM, Robert J Berger <[email protected]> wrote:
> Thanks again for the information. We're implementing it now.
>
> Just one last question (at least for a bit :-)
>
> If we bump up our dfs.datanode.max.xcievers from 4k to 8k, what should we
> watch for in terms of exhausting any system resources?
>
> We have the heap sizes set to:
>
> * DataNode -Xmx2000m
> * TaskTracker -Xmx2000m
> * RegionServer -Xmx4000m
> * m1.xlarge EC2 instances with 14GB of RAM.
>
> I'm thinking about removing the TaskTrackers and using non-RegionServer
> instances for running just TaskTrackers when we need to do Map/Reduce.
>
> Just wondering what I should be monitoring or tweaking, since the DataNode
> could be doubling the number of threads it's running...
>
> Thanks!
> Rob
>
> On Sep 27, 2011, at 12:50 PM, Jean-Daniel Cryans wrote:
>
>> On Tue, Sep 27, 2011 at 12:31 PM, Robert J Berger <[email protected]> wrote:
>>> It's not enough. We're still having errors, and it caused a regionserver to
>>> shut down again. No data loss, but degraded service (yay for robustness!)
>>
>> Yeah, just up those xcievers.
>>
>>> I tend to be "conservative" (was going to say cowardly) towards our HBase
>>> cluster since it's the persistent core of our application. So I'm not going
>>> to worry about growing the hfile size on this system.
>>>
>>> Not really, it's because we are so far behind the release cycle. We're still
>>> on HBase 0.20.3. I'm pretty sure many of our problems now would be relieved
>>> by getting caught up to either CDHx or the latest production Apache release.
>>
>> Online merge won't work with 0.20.3 anyway :)
>>
>>> Plus incorporating the latest best practices in the design of the next
>>> version to avoid these problems: different EC2 instance types, disk system
>>> layout, etc. (I'll be posting some questions about this soon; I'd like to
>>> have a discussion on such best practices for our class of HBase cluster.)
>>
>> Cool.
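As a rough sketch of the monitoring Rob asks about: each xciever is a JVM thread, and besides the heap (-Xmx2000m), every thread carries its own stack (roughly 1MB by default on 64-bit JVMs), so 8k threads could add several GB on top of heap. The commands below are a minimal illustration; for demonstration they use the current shell's PID as a stand-in, and in practice you would substitute the DataNode's PID:

```shell
# Watch a process's live thread count after raising dfs.datanode.max.xcievers.
# PID=$$ is a stand-in so this runs anywhere; on a real node, find the
# DataNode PID first (e.g. with jps or pgrep).
PID=$$
grep Threads /proc/"$PID"/status    # total live threads for the process

# On an actual DataNode you could also count active xciever threads:
#   jstack <datanode_pid> | grep -c DataXceiver
```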
>>> Ok, just to clarify, since I muddied the water by also asking about
>>> hbase.hregion.max.filesize:
>>>
>>> If I increase dfs.datanode.max.xcievers, can I do it one machine at
>>> a time and only have one datanode down at a time?
>>> Or do I need to bring the whole cluster down, update the
>>> dfs.datanode.max.xcievers value, and bring it back up?
>>> If I can do it a machine at a time, do I have to do it on the
>>> namenode/master machine as well?
>>
>> You can roll-restart the DNs; the NN doesn't need to be restarted.
>>
>>> Ok, I'm not going to do that for this cluster... We have way too many
>>> tables and it's too scary :-)
>>
>> You could aim for the ones that grow the most.
>>
>>> I shouldn't have said rolling; I meant the idea of just manually updating
>>> the dfs.datanode.max.xcievers values and restarting one datanode at a time.
>>> We can't use that cool graceful_shutdown option since we're on such an
>>> ancient version of HBase (another reason I'm itching to upgrade).
>>>
>>> But would the HBase rolling restart help? Don't we really need to restart
>>> the HDFS system for the dfs.datanode.max.xcievers change to take effect?
>>
>> Well, if you want to set the max filesize higher by default for new tables,
>> you'll need to restart HBase. If not, then don't.
>>
>>>> For that change to take effect on the new tables, I think only the
>>>> master would need to be bounced.
>>>
>>> I presume you are referring to the hbase.hregion.max.filesize changes
>>> (which I'm not going to do right now) needing just the HBase master to
>>> be bounced?
>>
>> Ya.

> __________________
> Robert J Berger - CTO
> Runa Inc.
> +1 408-838-8896
> http://blog.ibd.com
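The per-node procedure discussed above (bump the xciever limit, restart only that DataNode, leave the NameNode alone) can be sketched as follows. The value 8192 matches the "4k to 8k" bump in the thread, and the daemon paths in the comments assume a plain tarball install, so adjust to your layout:

```shell
# Rolling xciever bump: on EACH datanode in turn, raise the property in
# hdfs-site.xml, then restart only that DataNode before moving on.
# The NameNode does not need a restart for this setting.
cat > /tmp/xcievers-snippet.xml <<'EOF'
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>
EOF

# Then, per node (illustrative paths; adjust to your install):
#   $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
#   $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
# Wait for the DN to re-register with the NN before touching the next node.
```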
