Re: region servers dying - flush request - YCSB

2011-03-11 Thread Jean-Daniel Cryans
Can we see those GC logs at the time of the pause? (plus lines of context around that time) J-D On Thu, Mar 10, 2011 at 8:58 PM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: That's the weird thing, Region is still alive. Just paused for a while and I don't know what is causing those long

Re: region servers dying - flush request - YCSB

2011-03-10 Thread Stack
That looks like someone trying to connect to the master but they are not doing the handshake properly. Do you have old versions of hbase around the place? Or some other process connecting to the HBase Master? As to being unresponsive for 100 seconds, what was going on on your cluster? Any clues in

Re: region servers dying - flush request - YCSB

2011-03-10 Thread Jean-Daniel Cryans
iletken-test-2 died. J-D 2011/3/10 M.Deniz OKTAR deniz.ok...@gmail.com: Hi, Still working on the issue. This is one of the last trials I am doing before ordering a new cluster. I was going through the Yahoo benchmark again and hbase became unresponsive for a long time (about 100 secs)

Re: region servers dying - flush request - YCSB

2011-03-10 Thread M.Deniz OKTAR
That's the weird thing, Region is still alive. Just paused for a while and I don't know what is causing those long pauses. Checked the garbage collector logs, nothing was taking too long. I'm suspecting hardware. -- Deniz 2011/3/11 Jean-Daniel Cryans jdcry...@apache.org iletken-test-2

Re: region servers dying - flush request - YCSB

2011-03-09 Thread Erdem Agaoglu
I don't know if it's related but I've seen a dead regionserver a little while ago too. But in our case the .out file showed a JVM crash along with an hs_err dump in hbase home (attached below). We were running 0.90.0 at the time and we upgraded to 0.90.1 in hopes of a fix but nothing changed. The

Re: region servers dying - flush request - YCSB

2011-03-09 Thread Erdem Agaoglu
BTW our last line in the .log is completely different. If it's not related, just say so and I'll stop stealing the thread. ---More eviction start-complete--- # 2011-03-01 15:52:38,365 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free

Re: region servers dying - flush request - YCSB

2011-03-09 Thread Jean-Daniel Cryans
This is a JVM error, and there seem to be a lot of them in the recent versions. I personally recommend using u16 or u17. J-D On Wed, Mar 9, 2011 at 1:01 AM, Erdem Agaoglu erdem.agao...@gmail.com wrote: I don't know if it's related but I've seen a dead regionserver a little while ago too. But

Re: region servers dying - flush request - YCSB

2011-03-08 Thread M.Deniz OKTAR
Hi all, Thanks for the support. I've been trying to replicate the problem since this morning. Before doing that, I played with the configuration. I used to have only one user and set all the permissions according to that. Now I've followed the Cloudera manuals and set permissions for hdfs and mapred

Re: region servers dying - flush request - YCSB

2011-03-08 Thread 陈加俊
Had the HTable already been disabled when you pressed Ctrl+C? 2011/3/8, M.Deniz OKTAR deniz.ok...@gmail.com: Something new came up! I tried to truncate the 'usertable' which had ~12M entries. Shell stayed at disabling table for a long time. The processes were there but there were no requests. So I quit the state by

region servers dying - flush request - YCSB

2011-03-07 Thread M.Deniz OKTAR
Hi everyone, We have been having this problem for a while and would really appreciate any suggestions. We have a 5 node cluster, 4 of them being region servers. I am running a custom workload with YCSB and when the data is loading (heavy insert) at least one of the region servers is dying after about

Re: region servers dying - flush request - YCSB

2011-03-07 Thread Stack
On Mon, Mar 7, 2011 at 5:43 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: We have a 5 node cluster, 4 of them being region servers. I am running a custom workload with YCSB and when the data is loading (heavy insert) at least one of the region servers is dying after about 60 operations.

Re: region servers dying - flush request - YCSB

2011-03-07 Thread M.Deniz OKTAR
Hi, Thanks for the effort, answers below: On Mon, Mar 7, 2011 at 6:08 PM, Stack st...@duboce.net wrote: On Mon, Mar 7, 2011 at 5:43 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: We have a 5 node cluster, 4 of them being region servers. I am running a custom workload with YCSB and when

Re: region servers dying - flush request - YCSB

2011-03-07 Thread M.Deniz OKTAR
I don't know if it's normal but I see a lot of '0's in the test results when it tends to fail, such as: 1196 sec: 7394901 operations; 0 current ops/sec; -- deniz On Mon, Mar 7, 2011 at 6:46 PM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: Hi, Thanks for the effort, answers below: On Mon,
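
The zero-throughput status lines mentioned above can be picked out of a saved YCSB client log mechanically. What follows is only a minimal sketch, assuming every status line has the shape of the one quoted in the message ("<seconds> sec: <count> operations; <rate> current ops/sec;"). The names STATUS_RE and find_stalls are hypothetical, and the first sample line is made up for illustration; only the second comes from the thread.

```python
import re

# Assumed shape of a YCSB status line, modelled on the one quoted in the thread:
#   "1196 sec: 7394901 operations; 0 current ops/sec;"
STATUS_RE = re.compile(
    r"(\d+)\s+sec:\s+(\d+)\s+operations;\s+([\d.]+)\s+current ops/sec"
)


def find_stalls(lines):
    """Return (elapsed_seconds, total_operations) for each status line with zero throughput."""
    stalls = []
    for line in lines:
        match = STATUS_RE.search(line)
        if match and float(match.group(3)) == 0.0:
            stalls.append((int(match.group(1)), int(match.group(2))))
    return stalls


sample = [
    "1186 sec: 7394901 operations; 812.4 current ops/sec;",  # made-up line, illustration only
    "1196 sec: 7394901 operations; 0 current ops/sec;",      # line quoted in the thread
]
print(find_stalls(sample))
```

The elapsed-seconds values it returns give a rough window in which to look for matching entries in the region server and GC logs.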

Re: region servers dying - flush request - YCSB

2011-03-07 Thread Stack
I'm stumped. I have nothing to go on when there are no death throes or complaints. This hardware for sure is healthy? Other stuff runs w/o issue? St.Ack On Mon, Mar 7, 2011 at 8:48 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: I don't know if it's normal but I see a lot of '0's in the test results when

Re: region servers dying - flush request - YCSB

2011-03-07 Thread M.Deniz OKTAR
I ran every kind of benchmark I could find on those machines and they seemed to work fine. Did memory/disk tests too. The master node or other nodes provide some information and exceptions saying that they can't reach the dead node. Btw sometimes the process does not die but loses the

Re: region servers dying - flush request - YCSB

2011-03-07 Thread Jean-Daniel Cryans
Along with a bigger portion of the log, it might be good to check if there's anything in the .out file that looks like a JVM error. J-D On Mon, Mar 7, 2011 at 9:22 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: I ran every kind of benchmark I could find on those machines and they seemed to
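
Checking a region server .out file for something that looks like a JVM error can be done with a small scan. This is a rough sketch, not an official HBase or JVM tool: the two patterns are assumptions about what a HotSpot crash usually leaves behind (the "error has been detected by ... Java Runtime Environment" banner and a reference to an hs_err_pid file), and the name jvm_error_lines is hypothetical.

```python
import re

# Assumed markers of a HotSpot crash; adjust to whatever your JVM actually prints.
JVM_ERROR_RE = re.compile(
    r"error has been detected by (?:the )?Java Runtime Environment"
    r"|hs_err_pid\d+\.log"
)


def jvm_error_lines(lines):
    """Return (line_number, text) pairs for lines that look like JVM crash output."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if JVM_ERROR_RE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

An open file object can be passed directly, for example `with open(path) as f: print(jvm_error_lines(f))`, where `path` is the location of the .out file on your own installation.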