Hi,
We are running a production cluster (10 machines):
1) HBase version 0.90.2
2) 2 tables
3) we create ~15 regions per day (region size 250MB)
I want to ask about major compaction best practices:
1) Should we run it automatically or manually?
2) How often should it run?
3) Where?
stack-3 wrote:
On Fri, May 13, 2011 at 7:44 AM, Stan Barton bartx...@gmail.com wrote:
stack-3 wrote:
On Thu, Apr 28, 2011 at 6:54 AM, Stan Barton bartx...@gmail.com wrote:
Are you swapping, Stan? You are close to the edge with your RAM
allocations. What do you have swappiness set to?
For starters, take a look at this...
http://hbase.apache.org/book.html#perf.configurations
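For reference, swappiness can be checked and lowered like this (a common HBase tuning; 0 is the usual recommendation, though where your distribution keeps persistent sysctl settings may differ):

```
# Check the current value (the kernel default is often 60)
cat /proc/sys/vm/swappiness

# Lower it for the running system (requires root)
sysctl -w vm.swappiness=0

# Persist across reboots by adding to /etc/sysctl.conf:
#   vm.swappiness = 0
```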
-Original Message-
From: Oleg Ruchovets [mailto:oruchov...@gmail.com]
Sent: Monday, May 16, 2011 6:42 AM
To: user@hbase.apache.org
Subject: major compaction best practice
Hi ,
We are running
OK, I must be doing something wrong. This will be the death of me if I
don't pass my scalability testing on Wednesday for my project to get
approved.
Running on version 0.90.1-cdh3u0 using the pseudo-distributed mode
for Hadoop and Hbase. ZK mode is standalone.
How can I tell if Hbase is
From hbase-default.xml:
If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which we will start/stop ZooKeeper on.
Normally I would let the client use the same hbase-site.xml as the server uses.
After increasing maxClientCnxns, do you observe the same problem ?
Cheers
On
It was not set in hbase-env.sh.
The errors now seem to be gone.
Thanks for your prompt attention after my cry for help.
On Mon, May 16, 2011 at 9:01 AM, Ted Yu yuzhih...@gmail.com wrote:
From hbase-default.xml:
If HBASE_MANAGES_ZK is set in hbase-env.sh
this is the list of servers which
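For anyone hitting the same thing: with `export HBASE_MANAGES_ZK=true` in hbase-env.sh, HBase starts/stops ZooKeeper itself, and ZK settings can be carried in hbase-site.xml. A sketch of raising the connection limit discussed above (property name is from the 0.90-era docs; check your version, and the value here is illustrative):

```xml
<!-- hbase-site.xml: raise the per-client ZooKeeper connection limit.
     hbase.zookeeper.property.* entries are pushed into zoo.cfg. -->
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>300</value>
</property>
```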
Hi everyone,
I've just installed hadoop and hbase from cloudera (3) but when I try to go
to http://localhost:60010 it just sits there continually loading.
I can get to the regionserver fine - http://localhost:60030... Looking at
the master hbase server logs I can see the following log output
It's currently bad in general.
-Original Message-
From: Lars Egarots [mailto:lars.egar...@yahoo.com]
Sent: Monday, May 16, 2011 12:36 PM
To: user@hbase.apache.org
Subject: number of column families
The user documentation, in the Apache HBase book, states: HBase currently does
not do
On Sun, May 15, 2011 at 6:10 AM, Nightie Wolfi nightwolf...@gmail.com wrote:
org.apache.hadoop.hbase.Chore.run(Chore.java:66)
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at
On Mon, May 16, 2011 at 10:33 AM, Himanish Kushary himan...@gmail.com wrote:
Hi,
We are in the process of moving a small Hbase/Hadoop cluster from our
development to production environment. Our development environment was a few
Intel desktops (8-core CPU / 8 GB RAM / 7200 rpm disks) running
See the rest of my email.
St.Ack
On Mon, May 16, 2011 at 8:18 AM, Robert Gonzalez
robert.gonza...@maxpointinteractive.com wrote:
0.90.0
-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: Friday, May 13, 2011 2:21 PM
To:
2011/5/16 Frédéric Fondement frederic.fondem...@uha.fr:
Hi all,
Simple question: is it correct practice to store data in a column family
qualifier?
Others use the qualifier to carry data. There is no rule against it.
St.Ack
On Mon, May 16, 2011 at 3:42 AM, Oleg Ruchovets oruchov...@gmail.com wrote:
I want to ask about major compaction best practices:
1) Should we run it automatically or manually?
Major compaction runs once a day by default. It has a tendency
to start just when you do not want it to
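To make the timing predictable, a common pattern is to disable the periodic major compaction and trigger it yourself during off-peak hours. A sketch (values illustrative; property name per the 0.90-era default of 86400000 ms, i.e. once a day):

```xml
<!-- hbase-site.xml: 0 disables time-based major compactions -->
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value>
</property>
```

Then from cron, something like `echo "major_compact 'mytable'" | hbase shell` nightly per table ('mytable' is a placeholder).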
Is there any reason why running an HDFS balancer on the filesystem
used for HBase would be considered bad practice? Doesn't seem so to me
at face value but I wanted to be sure it seemed sane before enabling
it.
Thanks,
-erik
It would move blocks that are used by the local region servers,
messing up your block locality. That's the first reason I can think of.
J-D
On Mon, May 16, 2011 at 11:14 AM, Erik Onnen eon...@gmail.com wrote:
Is there any reason why running an HDFS balancer on the filesystem
used for HBase would
Should be fine. Don't run it at a high rate or the network traffic
will drag on your hbase serving.
St.Ack
On Mon, May 16, 2011 at 11:14 AM, Erik Onnen eon...@gmail.com wrote:
Is there any reason why running an HDFS balancer on the filesystem
used for HBase would be considered bad practice?
We're only at about 40% of network capacity during peak load, so I don't
think we'll cause network issues. Disk I/O may be another story, but
network will be fine, I suspect.
On Mon, May 16, 2011 at 11:16 AM, Stack st...@duboce.net wrote:
Should be fine. Don't run it at a high rate or the network
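The rate mentioned above is governed by an HDFS setting; a sketch (0.20/CDH3-era property name, applied per datanode, value in bytes/sec):

```xml
<!-- hdfs-site.xml: cap balancer traffic at ~1 MB/s per datanode
     (the old default; raise cautiously) -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
</property>
```

The balancer itself is typically started with `hadoop balancer -threshold 5`, where the threshold is the allowed deviation (in percent) from average disk utilization.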
You are giving us the mile high overview of the problem, pointing to a
specific culprit could be very time consuming. Instead, can you run
some system tests and make sure things work the way they should? Are
the disks strangely slow? Any switches acting up?
Regarding your CPUs, counting is mostly
Hi,
I have two questions:
1. Does HBase know how to handle block moves?
e.g. can HBase recognize that a local block was deleted from a machine and
move that region to a machine holding that block?
2. What happens if the region server hosting .META. fails? Does HBase keep a
duplicate region for that?
Hi,
I have two questions:
1. Does HBase know how to handle block moves?
e.g. can HBase recognize that a local block was deleted from a machine and
move that region to a machine holding that block?
No, that is transparent to HBase; HDFS handles block placement.
2. What happens if the region server hosting .META. fails? Does HBase keep
On Mon, May 16, 2011 at 4:55 AM, Stan Barton bartx...@gmail.com wrote:
Sorry. How do you enable overcommitment of memory? Or do you mean to
say that your processes add up to more than the RAM you have?
Memory overcommitment is needed because, in order to let Java still
allocate the
If you have a high insert rate then maybe log rolling (which blocks
inserts a little) causes calls to queue up enough (occupying
heap) that you enter a GC loop of death? Can you enable RPC logging
and see if you can confirm that?
Thx,
J-D
On Sun, May 15, 2011 at 5:37 PM, Jack Levin
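For anyone wanting to follow that suggestion, RPC logging is turned on through log4j. A sketch (the exact logger name is an assumption based on the 0.90-era package layout; verify against your version):

```
# log4j.properties on the region server:
log4j.logger.org.apache.hadoop.hbase.ipc=DEBUG
```

This is verbose, so it is best enabled only while reproducing the problem.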
Hey Dmitriy,
Awesome you could figure it out. I wonder if there's something that
could be done in HBase to help debugging such problems... Suggestions?
Also, just to make sure, this thread was started by Sean and it seems
you stepped up for him... you are working together right? At least
that's
On Sun, May 15, 2011 at 5:37 PM, Jack Levin magn...@gmail.com wrote:
I've added occupancy:
export HBASE_OPTS="$HBASE_OPTS -verbose:gc \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -XX:+HeapDumpOnOutOfMemoryError \
  -Xloggc:$HBASE_HOME/logs/gc-hbase.log"
Does
Would you be able to patch in
https://issues.apache.org/jira/browse/HBASE-3695 and see what hbck
tells you now? Else you could try using the 0.90.3 rc0 which has it
too: http://people.apache.org/~stack/hbase-0.90.3-candidate-0/
J-D
On Sun, May 15, 2011 at 9:11 AM, Andy Sautins
Thanks for the reply. We ran the TestDFSIO benchmark on both the development
and production clusters and found production to be better. The statistics are
shown below.
But once we bring HBase into the picture things get reversed :-(
The count operation, map-reduces etc. perform worse on the
It doesn't look like you are doing anything wrong; I also looked at
the unit tests and they seem to cover the basic usage of
ColumnPaginationFilter. Can you try removing the addFamily and
setMaxVersions calls to see if that has any effect?
Thx,
J-D
On Fri, May 13, 2011 at 6:27 PM, Matthew Ward
Ok I see... so the only thing that changed is the HW, right? No
upgrade to a new version? Also, could it be possible that you changed
some configs (or missed them)? BTW, count has a parameter for
scanner caching; in the shell you would write: count 'myTable', CACHE => 1000
and it should stream through your
When we change versions from 3 to 1 in the hbase table schema, things
appear to work right.
-Jack
On Mon, May 16, 2011 at 12:14 PM, Jean-Daniel Cryans
jdcry...@apache.org wrote:
It doesn't look like you are doing anything wrong; I also looked at
the unit tests and they seem to cover the basic usage
We had issues moving to a 32-core AMD box too. The issue revolved
around the datanode getting slow after about 12 hours. What you
need to do is check the fsreadlatency_ave_time graph; if it appears spiky
then you have a problem with IO. Next, get a graph of Runnable
Threads; they should be
Dima and I work together. He's got a good amount of opensource
experience on me and I got pulled away to work on something
else (MS-SQL issues, no less). He gets all the fun. :). Seriously,
the issue wouldn't have been solved without him stepping up. thx
Dima!.
sean
On Mon, May 16, 2011 at
Yes, it is only the HW that was changed. All the configurations are kept at
the defaults from the Cloudera installer.
The regionserver logs seem OK.
On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:
Ok I see... so the only thing that changed is the HW right? No
What is the clock rate of your CPUs (desktop vs blade)?
-Jack
On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary himan...@gmail.com wrote:
Yes, it is only the HW that was changed. All the configurations are kept at
the defaults from the Cloudera installer.
The regionserver logs seem OK.
On
How do you tell? These are the log entries from when we had 100% CPU:
2011-05-14T15:48:58.240-0700: 5128.407: [GC 5128.407: [ParNew:
17723K->780K(19136K), 0.0199350 secs] 4309804K->4292973K(5777060K),
0.0200660 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]
2011-05-14T15:48:58.349-0700: 5128.515: [GC
*PRODUCTION SERVER CPU INFO*
processor : 0
vendor_id : AuthenticAMD
cpu family : 16
model : 9
model name : AMD Opteron(tm) Processor 6174
stepping : 1
cpu MHz : 2200.022
cache size : 512 KB
physical id : 1
siblings : 12
core id : 0
cpu cores : 12
apicid : 16
fpu : yes
fpu_exception : yes
cpuid
Thanks J-D
Using hbase-0.20.6, 49 node cluster
The map-reduce job involves a full table scan... (region size 4 GB)
The job runs great for 1 week..
Starts failing after 1 week of data accumulation (about 3000 regions)
About 400 regions get created per day...
Can you suggest any tunables at the
I think this will resolve my issue, here is the output:
14 2011-05-16T15:58
13 2011-05-16T15:59
12 2011-05-16T16:00
14 2011-05-16T16:01
14 2011-05-16T16:02
13 2011-05-16T16:03
11 2011-05-16T16:04
12 2011-05-16T16:05
11 2011-05-16T16:06
16:06:55
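A per-minute tally like the one above can be produced straight from the timestamped GC log; a minimal sketch (the timestamps below are invented for illustration):

```python
from collections import Counter

# Hypothetical -Xloggc timestamps; the real ones come from lines like
# "2011-05-16T15:58:03.123-0700: ... [GC ...]"
timestamps = [
    "2011-05-16T15:58:03", "2011-05-16T15:58:11", "2011-05-16T15:59:40",
    "2011-05-16T15:59:55", "2011-05-16T16:00:02",
]

# Truncate each timestamp to the minute (YYYY-MM-DDTHH:MM is 16 chars)
per_minute = Counter(ts[:16] for ts in timestamps)
for minute, count in sorted(per_minute.items()):
    print(count, minute)
```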
So, the change is that you started using CMS? You were using the
default GC previously?
ParNew is much bigger now.
St.Ack
On Mon, May 16, 2011 at 4:11 PM, Jack Levin magn...@gmail.com wrote:
I think this will resolve my issue, here is the output:
14 2011-05-16T15:58
13
Those are the lines I added:
-XX:+CMSIncrementalMode \
-XX:+CMSIncrementalPacing \
-XX:-TraceClassUnloading
-Jack
(used CMS before)
On Mon, May 16, 2011 at 4:19 PM, Stack st...@duboce.net wrote:
So, the change is that you started using CMS? You were using the
default
This is interesting, because our conventional wisdom is that those settings
should increase the chance of stop-the-world GC and should be avoided.
- Andy
(who always gets nervous when we start talking about GC black magic)
From: Jack Levin magn...@gmail.com
Subject: Re: GC and High CPU
To:
I think in our case we have a deadlock where garbage isn't cleaned in large
enough chunks; being stuck at high CPU is as good as being dead
Jack
On May 16, 2011 4:41 PM, Andrew Purtell apurt...@apache.org wrote:
This is interesting because our conventional wisdom is those settings
should increase
I don't understand which of the settings below made the difference, though
the difference is plain from the GC logs you show.
See below:
On Mon, May 16, 2011 at 5:06 PM, Jack Levin magn...@gmail.com wrote:
Those are the lines I added:
-XX:+CMSIncrementalMode \
From the doc., it says about
This is the way I read it: low processor counts + high-CPU tasks = high
load. So incremental mode takes GC down a number of notches when it comes to
competing with the app threads for CPU. That being the case, the deadlock is
less likely. It would be useful to add code to the RS that will start
blocking