Henry: Please also take a look at the following thread: http://search-hadoop.com/m/51M4jeDMyy1/GC+recommendations+for+large+Region+Server+heaps&subj=RE+GC+recommendations+for+large+Region+Server+heaps
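To see how often the long sweeps and the failures quoted below occur over the week, it may help to scan the GC log mechanically. The following is only a minimal sketch in Python: the function name `scan_gc_log` and the 30-second threshold are assumptions made for illustration, and the pattern is based solely on the log lines quoted in this thread, so other JVM versions or logging flags may need a different expression. The second number in the `a/b secs` pair is treated here as the wall-clock duration, which is the common reading of CMS concurrent-phase entries.

```python
import re

# Matches entries such as "[CMS-concurrent-sweep: 48.834/58.027 secs]".
SWEEP_RE = re.compile(r"\[CMS-concurrent-sweep: (\d+\.\d+)/(\d+\.\d+) secs\]")

# Markers that appear in the quoted log when CMS falls back to a full GC.
FAILURE_MARKERS = ("promotion failed", "concurrent mode failure")


def scan_gc_log(lines, sweep_threshold_secs=30.0):
    """Return (kind, seconds) findings; threshold is an illustrative assumption."""
    findings = []
    for line in lines:
        # Report concurrent sweeps whose second (wall-clock) value is long.
        for match in SWEEP_RE.finditer(line):
            wall = float(match.group(2))
            if wall >= sweep_threshold_secs:
                findings.append(("slow-sweep", wall))
        # Report failure markers; no duration is extracted for these.
        for marker in FAILURE_MARKERS:
            if marker in line:
                findings.append((marker, None))
    return findings


# Sample input copied from the log excerpts in this thread.
sample = [
    "2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep: "
    "48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02 secs]",
    "2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956: [ParNew "
    "(promotion failed): 235959K->235959K(235968K), 0.0836790 secs]",
]
findings = scan_gc_log(sample)
for kind, seconds in findings:
    print(kind, seconds)
```

For a real log, passing an open file object (for example `scan_gc_log(open(path))`, where `path` is wherever the region server writes its GC log) should work the same way, since the function only iterates over lines.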
On Thu, Apr 24, 2014 at 11:17 PM, Mikhail Antonov <[email protected]> wrote:

> Henry,
>
> http://blog.ragozin.info/2011/10/java-cg-hotspots-cms-and-heap.html - that
> may give some insights.
>
> -Mikhail
>
>
> 2014-04-24 23:07 GMT-07:00 Henry Hung <[email protected]>:
>
> > Dear All,
> >
> > My current hbase environment is heavy write cluster with constant 2000+
> > insert rows / second spread to 10 region servers.
> > Each day I also need to do data deletion, and that will add a lot of IO to
> > the cluster.
> >
> > The problem is sometimes after a week, one of the region server will crash
> > because
> > 2014-04-10T10:17:47.200+0800: 1281486.956: [GC 1281486.956: [ParNew
> > (promotion failed): 235959K->235959K(235968K), 0.0836790 secs]1281487.040:
> > [CMS2014-04-10T10:21:14.957+0800: 1281694.712: [CMS-concurrent-sweep:
> > 267.111/279.155 secs] [Times: user=334.79 sys=14.38, real=279.11 secs]
> > (concurrent mode failure): 13961950K->6802914K(16515072K), 209.9436660
> > secs] 14186496K->6802914K(16751040K), [CMS Perm : 42864K->42859K(71816K)],
> > 210.0274680 secs] [Times: user=210.18 sys=0.01, real=209.99 secs]
> >
> > I look into the gc log and usually find some information about CMS
> > concurrent sweep that took a very long time to complete, such as:
> > 2014-04-10T10:15:56.929+0800: 1281376.684: [CMS-concurrent-sweep:
> > 48.834/58.027 secs] [Times: user=101.52 sys=11.82, real=58.02 secs]
> >
> > I do a lot of google-ing and already read the Todd Lipcon avoiding full
> > GC, or other blogs that sometimes tells me how to set jvm flags such as
> > this:
> > -XX:+UseParNewGC
> > -XX:CMSInitiatingOccupancyFraction=70
> > -Xmn256m
> > -Xmx16384m
> > -XX:+DisableExplicitGC
> > -XX:+UseCompressedOops
> > -XX:PermSize=160m
> > -XX:MaxPermSize=160m
> > -XX:GCTimeRatio=19
> > -XX:SoftRefLRUPolicyMSPerMB=0
> > -XX:SurvivorRatio=2
> > -XX:MaxTenuringThreshold=1
> > -XX:+UseFastAccessorMethods
> > -XX:+UseParNewGC
> > -XX:+UseConcMarkSweepGC
> > -XX:+CMSParallelRemarkEnabled
> > -XX:+UseCMSCompactAtFullCollection
> > -XX:CMSFullGCsBeforeCompaction=0
> > -XX:+CMSClassUnloadingEnabled
> > -XX:CMSMaxAbortablePrecleanTime=300
> > -XX:+CMSScavengeBeforeRemark
> >
> > But alas, the problem still exist.
> >
> > I also know that java 1.7 has a new G1GC that probably can be used to fix
> > this problem, but I don't know if hbase 0.96 is ready to use it?
> >
> > I would really appreciate it if someone out there can share one or two
> > things about jvm configuration to achieve a more stable region server.
> >
> > Best regards,
> > Henry
> >
> > ________________________________
> > The privileged confidential information contained in this email is
> > intended for use only by the addressees as indicated by the original sender
> > of this email. If you are not the addressee indicated in this email or are
> > not responsible for delivery of the email to such a person, please kindly
> > reply to the sender indicating this fact and delete all copies of it from
> > your computer and network server immediately. Your cooperation is highly
> > appreciated. It is advised that any unauthorized use of confidential
> > information of Winbond is strictly prohibited; and any information in this
> > email irrelevant to the official business of Winbond shall be deemed as
> > neither given nor endorsed by Winbond.
> >
>
> --
> Thanks,
> Michael Antonov
