Can we see those GC logs at the time of the pause? (plus lines of
context around that time)
J-D
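(In case GC logging wasn't already enabled: a sketch of the HotSpot flags commonly added to conf/hbase-env.sh to capture such logs. The log path here is an assumption; adjust it to your install.)

```shell
# Hypothetical hbase-env.sh fragment; the -Xloggc path is an assumption.
# PrintGCApplicationStoppedTime records every stop-the-world pause,
# which is what you want when hunting 100-second stalls.
export HBASE_OPTS="$HBASE_OPTS -verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -Xloggc:/var/log/hbase/gc-hbase.log"
```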
On Thu, Mar 10, 2011 at 8:58 PM, M.Deniz OKTAR deniz.ok...@gmail.com wrote:
That looks like someone trying to connect to the master but not doing the
handshake properly. Do you have old versions of HBase around the place? Or
some other process connecting to the HBase Master?
As to being unresponsive for 100 seconds, what was going on on your cluster?
Any clues in the logs around the time iletken-test-2 died?
J-D
2011/3/10 M.Deniz OKTAR deniz.ok...@gmail.com:
Hi,
Still working on the issue. This is one of the last trials I am doing before
ordering a new cluster.
I was going through the Yahoo benchmark again and HBase became non-responsive
for a long time (about 100 secs).
That's the weird thing:
The regionserver is still alive. It just paused for a while and I don't know
what the causes of those long pauses are. I checked the garbage collector
logs; nothing was taking too long.
I'm suspecting hardware.
--
Deniz
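(As a quick way to double-check those GC logs, here's a minimal sketch that flags pauses over a threshold. It assumes lines in the -XX:+PrintGCApplicationStoppedTime format; the regex is tied to that format, so adjust it if your JVM prints something different.)

```python
import re

# Matches the tail of lines like:
# "Total time for which application threads were stopped: 1.2345 seconds"
PAUSE_RE = re.compile(r"stopped: ([0-9.]+) seconds")

def long_pauses(lines, threshold_secs=1.0):
    """Return pause durations (in seconds) longer than threshold_secs."""
    pauses = []
    for line in lines:
        match = PAUSE_RE.search(line)
        if match:
            seconds = float(match.group(1))
            if seconds > threshold_secs:
                pauses.append(seconds)
    return pauses

sample = [
    "Total time for which application threads were stopped: 0.0023 seconds",
    "Total time for which application threads were stopped: 97.5130 seconds",
]
print(long_pauses(sample))  # only the ~97.5s pause crosses the 1s threshold
```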
2011/3/11 Jean-Daniel Cryans jdcry...@apache.org
I don't know if it's related, but I've seen a dead regionserver a little
while ago too. In our case the .out file showed a JVM crash along with an
hs_err dump in the HBase home (attached below). We were running 0.90.0 at the
time and upgraded to 0.90.1 in hopes of a fix, but nothing changed.
BTW, the last line in our .log is completely different. If it's not related,
just say so and I'll stop stealing the thread.
---More eviction start-complete---
2011-03-01 15:52:38,365 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free
This is a JVM error, and there seem to be a lot of them in recent
versions. I personally recommend using u16 or u17.
J-D
On Wed, Mar 9, 2011 at 1:01 AM, Erdem Agaoglu erdem.agao...@gmail.com wrote:
Hi all,
Thanks for the support. I've been trying to replicate the problem since this
morning. Before doing that, I played with the configuration. I used to have
only one user and set all the permissions accordingly. Now I've
followed the Cloudera manuals and set permissions for hdfs and mapred
Had the HTable been disabled when you hit Ctrl+C?
2011/3/8, M.Deniz OKTAR deniz.ok...@gmail.com:
Something new came up!
I tried to truncate the 'usertable', which had ~12M entries.
The shell stayed at disabling the table for a long time. The process was there
but there were no requests, so I quit by hitting Ctrl+C.
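(For what it's worth, truncate in the HBase shell is disable + drop + create under the hood, so the slow part here is likely the disable. A sketch of the equivalent manual steps; the column family name below is a placeholder, not from the thread:)

```
# HBase shell session (not bash); 'family' is a placeholder column family.
disable 'usertable'
drop 'usertable'
create 'usertable', 'family'
```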
Hi everyone,
We have been having this problem for a while and would really appreciate any
suggestions.
We have a 5 node cluster, 4 of them being region servers. I am running a
custom workload with YCSB, and while the data is loading (heavy insert) at
least one of the region servers dies after about 60 operations.
On Mon, Mar 7, 2011 at 5:43 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote:
We have a 5 node cluster, 4 of them being region servers. I am running a
custom workload with YCSB, and while the data is loading (heavy insert) at
least one of the region servers dies after about 60 operations.
Hi,
Thanks for the effort, answers below:
On Mon, Mar 7, 2011 at 6:08 PM, Stack st...@duboce.net wrote:
On Mon, Mar 7, 2011 at 5:43 AM, M.Deniz OKTAR deniz.ok...@gmail.com
wrote:
We have a 5 node cluster, 4 of them being region servers. I am running a
custom workload with YCSB and when
I don't know if it's normal, but I see a lot of '0's in the test results when
it tends to fail, such as:
1196 sec: 7394901 operations; 0 current ops/sec;
--
deniz
On Mon, Mar 7, 2011 at 6:46 PM, M.Deniz OKTAR deniz.ok...@gmail.com wrote:
Hi,
Thanks for the effort, answers below:
I'm stumped. I have nothing to go on with no death throes or
complaints. Is this hardware for sure healthy? Does other stuff run without
issue?
St.Ack
On Mon, Mar 7, 2011 at 8:48 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote:
I don't know if it's normal, but I see a lot of '0's in the test results when
I ran every kind of benchmark I could find on those machines and they seemed
to work fine. Did memory/disk tests too.
The master node and the other nodes log some information and exceptions about
not being able to reach the dead node.
Btw, sometimes the process does not die but loses the
Along with a bigger portion of the log, it might be good to check if
there's anything in the .out file that looks like a JVM error.
J-D
On Mon, Mar 7, 2011 at 9:22 AM, M.Deniz OKTAR deniz.ok...@gmail.com wrote: