We have to perform maintenance on one of our HDFS DataNode/HBase Regionserver
machines for a few hours. What are the right steps to take before doing the
maintenance in order to ensure limited impact to the cluster and (thrift)
clients of the cluster, both for HDFS and HBase?
After the maintenance, simply start the RS/DN back up and it will be added back to the cluster. The load balancer will then assign some regions back to it. You will lose some data locality for the regions which were moved.
JM
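The "before" half of the procedure was cut off above; a minimal command sketch of the usual approach, assuming the stock HBase 0.94-era scripts (graceful_stop.sh unloads regions before stopping the RegionServer; hostnames and paths below are placeholders, not from the thread):

```shell
# Sketch only -- hostnames and paths are placeholders.
# 1. Unload regions from the target RegionServer, then stop it:
$HBASE_HOME/bin/graceful_stop.sh rs-node1.example.com

# 2. Stop the DataNode on the same machine (HDFS serves reads from replicas):
$HADOOP_HOME/bin/hadoop-daemon.sh stop datanode

# 3. Perform the maintenance, then start both daemons back up:
$HADOOP_HOME/bin/hadoop-daemon.sh start datanode
$HBASE_HOME/bin/hbase-daemon.sh start regionserver
```

Thrift clients pointed at the machine under maintenance would also need to be directed at one of the other Thrift servers for the duration.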
2013/4/25 Dan Crosta d...@magnetic.com:
We have to perform
mostly serve reads from HBase ?
Cheers
On Sun, Mar 17, 2013 at 1:56 PM, Dan Crosta d...@magnetic.com wrote:
Ah, thanks Ted -- I was wondering what that setting was for.
We are using CDH 4.2.0, which is HBase 0.94.2 (give or take a few
backports from 0.94.3).
Is there any harm
We occasionally get scanner timeout errors such as "66698ms passed since the
last invocation, timeout is currently set to 6" when iterating a scanner
through the Thrift API. Is there any reason not to raise the timeout to
something larger than the default 60s? Put another way, what resources
wrote:
Which HBase version are you using ?
In 0.94 and prior, the config param is hbase.regionserver.lease.period
In 0.95, it is different. See release notes of HBASE-6170
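For 0.94, that is an hbase-site.xml change on the RegionServers; a sketch, where the 120000 below is an illustrative value and not a recommendation from this thread:

```xml
<!-- hbase-site.xml on each RegionServer; value in milliseconds.
     The default is 60000 (60s); 120000 here is illustrative only. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value>
</property>
```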
On Sun, Mar 17, 2013 at 11:46 AM, Dan Crosta d...@magnetic.com wrote:
We occasionally get scanner timeout errors
Fairly small -- row keys 32-48 bytes, column keys about the same, and values
50-100 bytes (with a few outliers that probably go up to 1k).
On Mar 3, 2013, at 6:08 AM, Varun Sharma wrote:
What is the size of your writes ?
On Sat, Mar 2, 2013 at 2:29 PM, Dan Crosta d...@magnetic.com wrote:
On Mar 1, 2013, at 10:42 PM, lars hofhansl wrote:
What performance profile do you expect?
That's a good question. Our configuration is actually already exceeding our
minimum and desired performance thresholds, so I'm not too worried about it. My
concern is more that I develop an understanding
On Mar 1, 2013, at 10:53 PM, Ted Yu wrote:
bq. there is a also a parameter for controlling the queue side.
I guess Varun meant 'queue size'.
On http://hbase.apache.org/book.html, if you search for
'hbase.thrift.minWorkerThreads', you would see 3 parameters. The last one,
hbase.thrift.maxQueuedRequests, controls the queue size.
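A hedged sketch of how those Thrift server knobs appear in hbase-site.xml (the values shown are the documented 0.94-era defaults, not recommendations from this thread):

```xml
<!-- hbase-site.xml on the Thrift gateway hosts; values are defaults,
     shown only to illustrate where the settings live. -->
<property>
  <name>hbase.thrift.minWorkerThreads</name>
  <value>16</value>
</property>
<property>
  <name>hbase.thrift.maxWorkerThreads</name>
  <value>1000</value>
</property>
<property>
  <name>hbase.thrift.maxQueuedRequests</name>
  <value>1000</value>
</property>
```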
On Mar 2, 2013, at 12:38 PM, lars hofhansl wrote:
That's only true from the HDFS perspective, right? Any given region is
owned by 1 of the 6 regionservers at any given time, and writes are
buffered to memory before being persisted to HDFS, right?
Only if you disabled the WAL, otherwise
We grouped all mutations for the same rowkey into the same Put object;
thus when sending a List of Put we made sure each Put has a unique rowkey.
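That batching rule can be sketched in plain Python (the triple format and helper name here are assumptions for illustration; a real Thrift client would build Mutation/BatchMutation objects instead):

```python
def group_by_rowkey(mutations):
    """Group (rowkey, column, value) triples so each rowkey appears once.

    Returns a list of (rowkey, [(column, value), ...]) pairs, preserving
    first-seen rowkey order, so one batch carries at most one entry per
    row -- the invariant described above. Relies on dict insertion order
    (Python 3.7+).
    """
    grouped = {}
    for rowkey, column, value in mutations:
        grouped.setdefault(rowkey, []).append((column, value))
    return list(grouped.items())
```

For example, three triples touching two rows collapse into two batch entries, with both mutations for the repeated row merged under its single key.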
On Saturday, March 2, 2013, Dan Crosta wrote:
On Mar 2, 2013, at 12:38 PM, lars hofhansl wrote:
That's only true from the HDFS perspective, right? Any given region is
owned by 1 of the 6
We are using a 6-node HBase cluster with a Thrift Server on each of the
RegionServer nodes, and trying to evaluate maximum write throughput for our use
case (which involves many processes sending mutateRowsTs commands). Somewhere
between about 30 and 40 processes writing into the system we
On Mar 1, 2013, at 9:13 AM, Asaf Mesika wrote:
Maybe you've hit the limit on the number of RPC threads handling your write
requests per RegionServer, thus your requests are queued?
Thanks -- I actually just stumbled across that setting in CDH manager, and am
experimenting with raising it.
The unit of distribution in HBase is the region; make sure you have more
than one. This is well documented in the manual:
http://hbase.apache.org/book/perf.writing.html
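One common way to get more than one region from the start is to pre-split the table. A minimal sketch of generating split points, assuming rowkeys are uniformly distributed hex strings (the function name and the hex-key assumption are mine, not from the thread):

```python
def hex_split_keys(num_regions):
    """Return num_regions - 1 evenly spaced 4-hex-digit split points.

    Suitable as split keys when creating a pre-split table whose rowkeys
    are uniformly distributed hex strings (an assumption for this sketch).
    """
    return ["%04x" % (i * 0x10000 // num_regions)
            for i in range(1, num_regions)]
```

For example, `hex_split_keys(4)` yields `['4000', '8000', 'c000']`, which could be passed as the SPLITS argument to the shell's create command (or HBaseAdmin.createTable) so writes spread across four regions immediately.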
J-D
On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta d...@magnetic.com wrote:
We are using a 6-node HBase cluster with a Thrift Server on each