big compaction queue size

2011-09-06 Thread Xu-Feng Mao
Hi, We're running a 33-regionserver HBase cluster on top of the cdh3u0 suite. On average, each regionserver hosts 2400 regions. (hbase.hregion.max.filesize is 1.5GB, and our values are up to 4MB per object.) I checked the regionserver log, and it seems like the compaction queue size is

The number of fd and CLOSE_WAIT keep increasing.

2011-08-22 Thread Xu-Feng Mao
Hi, We are running the cdh3u0 hbase/hadoop suite on 28 nodes. Since last Friday, three regionservers have had their counts of open fds and CLOSE_WAIT connections keep increasing. It looks like if the lines like 2011-08-22 18:19:01,815 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region

Re: The number of fd and CLOSE_WAIT keep increasing.

2011-08-22 Thread Xu-Feng Mao
: Bug Author: Bharath Mundlapudi Ref: CDH-3200 Best regards, - Andy Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White) - Original Message - From: Xu-Feng Mao m9s...@gmail.com To: hbase-u...@hadoop.apache.org; user

Re: The number of fd and CLOSE_WAIT keep increasing.

2011-08-22 Thread Xu-Feng Mao
in updating between dot versions cause of compatibility being maintained :) On 23-Aug-2011, at 10:26 AM, Xu-Feng Mao wrote: Thanks Andy! cdh3u1 is based on hbase 0.90.3, which has some nice admin scripts, like graceful_stop.sh. Is it easy to upgrade hbase from cdh3u0 to cdh3u1? I guess we

HMaster jvm crashes and imbalance cluster

2011-07-20 Thread Xu-Feng Mao
Hi, We're running a 25-node regionserver HBase cluster, using cdh3u0. 1. We ran into several JVM crashes on the master today. They look like JVM issues; I have attached the hs_error_pid files to this message. I just want to confirm whether this is really a JVM issue, or maybe some master issue trigger

Upgrade from cdh3u0 to 0.90.3 + HBASE-3872

2011-07-08 Thread Xu-Feng Mao
Hi, Since we've run into the HBASE-3872 issue, we're considering upgrading a production system from cdh3u0 to 0.90.3 with the HBASE-3872 patch. Is it safe to just replace the hbase directory, and restart all the regionservers and master? We have no chance to stop the whole cluster together, can we restarted

Re: Upgrade from cdh3u0 to 0.90.3 + HBASE-3872

2011-07-08 Thread Xu-Feng Mao
in the logs? On Sat, Jul 9, 2011 at 10:54 AM, Ted Yu yuzhih...@gmail.com wrote: Please direct question related to cdh to cdh-dev Patch of HBASE-3872 for 0.90 branch isn't posted yet. Do you have a patch already ? On Fri, Jul 8, 2011 at 7:43 PM, Xu-Feng Mao m9s...@gmail.com wrote: Hi, Since

Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
a inconsistent scenario, how can I recover this problem? Thanks and regards, Mao Xu-Feng -- Forwarded message -- From: Xu-Feng Mao m9s...@gmail.com Date: Wed, Jul 6, 2011 at 7:20 AM Subject: Re: WrongRegionException and inconsistent table found To: Xu-Feng Mao m9s...@gmail.com Cc: user

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
, Xu-Feng Mao m9s...@gmail.com wrote: Hi, It looks like we've lost a region, including its directory on hdfs and its meta record as well. We need some more time to dig through the sea of logs to figure out the root cause. But first of all, we need to recover the meta, so that we can put keys

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
out those 9 regions (we'll go through the .META.)? Thanks and regards, Mao Xu-Feng On Thu, Jul 7, 2011 at 3:21 AM, Stack st...@duboce.net wrote: On Wed, Jul 6, 2011 at 5:37 AM, Xu-Feng Mao m9s...@gmail.com wrote: It looks like we've lost a region, including its directory on hdfs and its meta

Re: Possible solution to 'WrongRegionException and inconsistent table found'

2011-07-06 Thread Xu-Feng Mao
Thanks Stack, I've embedded my replies in italics. On Thu, Jul 7, 2011 at 12:19 PM, Stack st...@duboce.net wrote: On Wed, Jul 6, 2011 at 7:28 PM, Xu-Feng Mao m9s...@gmail.com wrote: Regarding 'is multiply assigned to region servers' I found these messages after running add_table.rb, and assign them

WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
Hi, We're running an HBase cluster with 37 regionservers. Today, we found lots of WrongRegionException errors when putting objects into it. hbase hbck -details reports that Chain of regions in table STable is broken; edges does not contain ztxrGmCwn-6BE32s3cX1TNeHU_I= ERROR: Found

Re: WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
We also checked the master log and found nothing interesting. On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao m9s...@gmail.com wrote: Hi, We're running an HBase cluster with 37 regionservers. Today, we found lots of WrongRegionException errors when putting objects into it. hbase hbck -details

Re: WrongRegionException and inconsistent table found

2011-07-05 Thread Xu-Feng Mao
I forgot to mention the version: we are using cdh3u0. Mao Xu-Feng On 2011-7-6, at 0:59, Xu-Feng Mao m9s...@gmail.com wrote: We also checked the master log and found nothing interesting. On Wed, Jul 6, 2011 at 12:58 AM, Xu-Feng Mao m9s...@gmail.com wrote: Hi, We're running an HBase cluster with 37 regionservers