Unless you have weird OS settings, everything should be fine. Like I said in my first email, make sure you don't max out your number of processes (it's a ulimit config).
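For reference, the ulimit check mentioned above can be sketched like this (the DataNode PID line is a commented example; substitute your own PID):

```shell
# Show the max-user-processes limit (nproc) for the current user.
# On Linux, every JVM thread counts against this limit, and each
# xciever in the DataNode is a thread, so with thousands of xcievers
# this is the limit to watch.
ulimit -u

# To see how close a running DataNode is, inspect its live thread count:
#   grep Threads /proc/<datanode_pid>/status
```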
J-D

On Tue, Sep 27, 2011 at 11:41 PM, Robert J Berger <[email protected]> wrote:
> Thanks again for the information. We're implementing it now.
>
> Just one last question (at least for a bit :-)
>
> If we bump up our dfs.datanode.max.xcievers from 4k to 8k, what should we
> watch for in terms of exhausting any system resources?
>
> We have the heap sizes set to:
>
> * DataNode -Xmx2000m
> * TaskTracker -Xmx2000m
> * RegionServer -Xmx4000m
> * m1.xlarge EC2 instances with 14GB of RAM.
>
> I'm thinking about removing the TaskTrackers and using non-RegionServer
> instances for running just TaskTrackers when we need to do Map/Reduce.
>
> Just wondering what I should be monitoring or tweaking, since the DataNode
> could be doubling the number of threads it's running...
>
> Thanks!
> Rob
>
> On Sep 27, 2011, at 12:50 PM, Jean-Daniel Cryans wrote:
>
>> On Tue, Sep 27, 2011 at 12:31 PM, Robert J Berger <[email protected]> wrote:
>>> It's not enough. We're still having errors, and it caused a regionserver to
>>> shut down again. No data loss, but degraded service (yay for robustness!)
>>
>> Yeah, just up those xcievers.
>>
>>> I tend to be "conservative" (was going to say cowardly) towards our HBase
>>> cluster since it's the persistent core of our application. So I'm not going
>>> to worry about growing the hfile size on this system.
>>>
>>> Not really, it's because we are so far behind the release cycle. We're still
>>> on HBase 0.20.3. I'm pretty sure many of our problems now would be relieved
>>> by getting caught up to either CDHx or the latest production Apache release.
>>
>> Online merge won't work with 0.20.3 anyway :)
>>
>>> Plus incorporating the latest best practices in the design of the next
>>> version to avoid these problems: different EC2 instance types, disk system
>>> layout, etc. (I'll be posting some questions about this soon; I'd like to
>>> have a discussion on such best practices for our class of HBase cluster.)
>>
>> Cool.
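As a rough sketch of the monitoring Rob asks about: each xciever is a JVM thread, and besides the heap (-Xmx2000m), every thread carries its own stack (roughly 1MB by default on 64-bit JVMs), so 8k threads could add several GB on top of heap. The commands below are a minimal illustration; for demonstration they use the current shell's PID as a stand-in, and in practice you would substitute the DataNode's PID:

```shell
# Watch a process's live thread count after raising dfs.datanode.max.xcievers.
# PID=$$ is a stand-in so this runs anywhere; on a real node, find the
# DataNode PID first (e.g. with jps or pgrep).
PID=$$
grep Threads /proc/"$PID"/status    # total live threads for the process

# On an actual DataNode you could also count active xciever threads:
#   jstack <datanode_pid> | grep -c DataXceiver
```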
>>> Ok, just to clarify, since I muddied the water by also asking about
>>> hbase.hregion.max.filesize:
>>>
>>> If I increase dfs.datanode.max.xcievers, can I do it one machine at
>>> a time and only have one datanode down at a time?
>>> Or do I need to bring the whole cluster down, update the
>>> dfs.datanode.max.xcievers value, and bring it back up?
>>> If I can do it a machine at a time, do I have to do it on the
>>> namenode/master machine as well?
>>
>> You can roll-restart the DNs; the NN doesn't need to be restarted.
>>
>>> Ok, I'm not going to do that for this cluster... We have way too many
>>> tables and it's too scary :-)
>>
>> You could aim for the ones that grow the most.
>>
>>> I shouldn't have said rolling; I meant the idea of just manually updating
>>> the dfs.datanode.max.xcievers values and restarting one datanode at a time.
>>> We can't use that cool graceful_shutdown option since we're on such an
>>> ancient version of HBase (another reason I'm itching to upgrade).
>>>
>>> But would the HBase rolling restart help? Don't we really need to restart
>>> the HDFS system for the dfs.datanode.max.xcievers change to take effect?
>>
>> Well, if you want to set the max filesize higher by default for new tables,
>> you'll need to restart HBase. If not, then don't.
>>
>>>> For that change to take effect on the new tables, I think only the
>>>> master would need to be bounced.
>>>
>>> I presume you are referring to the hbase.hregion.max.filesize changes
>>> (which I'm not going to do right now) needing just the HBase master to
>>> be bounced?
>>
>> Ya.

> __________________
> Robert J Berger - CTO
> Runa Inc.
> +1 408-838-8896
> http://blog.ibd.com
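The per-node procedure discussed above (bump the xciever limit, restart only that DataNode, leave the NameNode alone) can be sketched as follows. The value 8192 matches the "4k to 8k" bump in the thread, and the daemon paths in the comments assume a plain tarball install, so adjust to your layout:

```shell
# Rolling xciever bump: on EACH datanode in turn, raise the property in
# hdfs-site.xml, then restart only that DataNode before moving on.
# The NameNode does not need a restart for this setting.
cat > /tmp/xcievers-snippet.xml <<'EOF'
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>
EOF

# Then, per node (illustrative paths; adjust to your install):
#   $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
#   $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
# Wait for the DN to re-register with the NN before touching the next node.
```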
