what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread tgh
Hi, I am trying to use HBase 0.90 to store 100 billion messages. I have set up HBase and use the API to store messages into it, and it seems OK. But my colleagues tell me that, for HBase, the max size for one region is 4GB, and for one server the max number of regions is 100, then for one

Re: what to do if I have changed the ip and hostname of an existed hbase cluster?

2012-12-17 Thread Azuryy Yu
Hi, I solved this problem last week: stop-hbase - change the zkdata property directory (same as deleting the ZK data) - change IP and hostname - start-hbase. All regions will be reassigned. If you have lots of regions, it's very, very slow. I think there is no answer for this slow process until 0.96. On Mon,

Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread Nicolas Liochon
This should help: http://hbase.apache.org/book/important_configurations.html#bigger.regions On Mon, Dec 17, 2012 at 9:11 AM, tgh guanhua.t...@ia.ac.cn wrote: Or what about the max size for one region and what about the max size of region for one server?

Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread tgh
Thank you for your reply. I visited the webpage and it is helpful. Following it, I can use 500 regions on ONE server, is that right? And if I use 500 regions on ONE server, with each region at 40GB, one server will store 20TB. Is that OK? Thank you - Tian Guanhua

Re: Problems with HBase JMX beans

2012-12-17 Thread Nicolas Liochon
HBASE-5718 seems to say it's reproducible only on OpenJDK. HBase requires the JDK from Oracle (see http://hbase.apache.org/book.html#basic.prerequisites). Issues that occur on other JDKs are not rejected, but usually receive a lower priority. If someone provides a patch, it will be integrated.

Re: Problems with HBase JMX beans

2012-12-17 Thread Ivan Ryndin
Hi Nicolas, thank you for the answer! Yep, you are right, this is the problem with OpenJDK. System is: Centos 64bit, java-1.6.0-openjdk-1.6.0.0.x86_64, HBase 0.94.3, java version 1.6.0_24 Ok, I agree that the problem is not so critical. Thank you again! -- Best regards, Ivan P. Ryndin

Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread tgh
Thank you for your reply, but I am writing because I want to make sure: if the number of regions on ONE server exceeds 300 or 500, will HBase fail or something? Or what is the max number of regions for ONE server? I use HBase 0.90. Could you help me? Thank you, Tian Guanhua

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread Nicolas Liochon
I think it's safer to use a newer version (0.94): there are a lot of things around performance and volumes in 0.92 and 0.94. Also, there are many more bug-fix releases on 0.94. As for the number of regions, there is no maximum written in stone. Having too many regions will essentially impact

Re: MR missing lines

2012-12-17 Thread Jean-Marc Spaggiari
The job ran this morning, and of course, this time, all the rows got processed ;) So I will give it a few more tries and will keep you posted if I'm able to reproduce that again. Thanks, JM 2012/12/16, Jean-Marc Spaggiari jean-m...@spaggiari.org: Thanks for the suggestions. I already have

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread Doug Meil
Hi there, When sizing your data, don't forget to read this... http://hbase.apache.org/book.html#schema.creation and http://hbase.apache.org/book.html#regions.arch 9.7.5.4. KeyValue You need to understand how HBase stores data internally on initial design to avoid problems down the line. Keep

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread Bryan Beaudreault
0.90.x supports up to 4GB region sizes max, not 40. You would need to upgrade to 0.92.x at least to go higher than that. Sent from iPhone. On Dec 17, 2012, at 9:31 AM, Doug Meil doug.m...@explorysmedical.com wrote: Hi there, When sizing your data, don't forget to read this...

Re: HBaseClient.call() hang

2012-12-17 Thread Bryan Keller
It seems there was a cascading effect. The regionservers were busy with scanning a table, which resulted in some long GC's. The GC's were long enough to trigger the Zookeeper timeout on at least one regionserver, which resulted in the regionserver shutting itself down. This then caused the

Re: what to do if I have changed the ip and hostname of an existed hbase cluster?

2012-12-17 Thread 周梦想
Thank you Azuryy. I have encountered a problem with HBase, but I don't know whether it was caused by the IP change or not, because we started the system for about 2 hours, then the second name server exited and the whole cluster couldn't be brought up... The HBase data is corrupt. It's a long, miserable story...

Re: merge local hbase data with production hbase

2012-12-17 Thread 周梦想
Hi Tousif, maybe you can first export the data to HDFS, and then import the data into your production HBase table. Or write some code to do this, or just use Hive to complete this! Best regards! Andy 2012/12/17 Tousif tousif.pa...@gmail.com Hi, Can anyone help me identify a tool or best method to

RE: Coprocessors and Zookeeper sessions

2012-12-17 Thread Aaron Tokhy
You didn't say which HBase version. I'm assuming 0.94. HBase 0.94.1 and higher. Looks very simple from the region server side, thanks for the tip. As for the client invoking the endpoint based coprocessor, I would like to initially set these values so that the region servers would decrement

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread Andrew Purtell
Don't use HBase 0.90. Our current release is 0.94. You will find the community is able to help you much more satisfactorily if you start with the current release. On Mon, Dec 17, 2012 at 2:26 AM, tgh guanhua.t...@ia.ac.cn wrote: Thank you for your reply, but I am writing because I want to make sure,

Re: Re: Re: what is the max size for one region and what is the max size of region for one server

2012-12-17 Thread lars hofhansl
Here's some back of the envelope math: Say you have six 1T drives per machine. That gives you about 2T of usable space (considering HDFS 3-way replication). A reasonable max size for regions is 20GB. That's 100 regions for 2T. If you set the flushsize to 128MB, you'd need ~13GB RAM in the worst
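
[Editor's sketch] The arithmetic in the message above can be reproduced with a few lines of Python. This is only an illustration under stated assumptions, none of which come from the thread: 1T is taken as 1000GB, each region has a single memstore, and the "worst" case is read as every memstore being full at once. The function names are hypothetical.

```python
# Back-of-the-envelope region sizing, using the figures quoted in the message.
# Assumptions (not from the thread): 1 TB = 1000 GB, one memstore per region,
# and "worst case" means all memstores are full at the same time.

def usable_tb(drives: int, drive_tb: float, replication: int = 3) -> float:
    """Usable capacity per machine after HDFS replication."""
    return drives * drive_tb / replication

def region_count(capacity_tb: float, region_gb: float) -> int:
    """How many regions of the given max size fit in the usable capacity."""
    return int(capacity_tb * 1000 // region_gb)

def worst_case_memstore_gb(regions: int, flush_mb: float) -> float:
    """RAM needed if every region's memstore reaches the flush size."""
    return regions * flush_mb / 1024

usable = usable_tb(drives=6, drive_tb=1.0)
regions = region_count(usable, region_gb=20)
ram_gb = worst_case_memstore_gb(regions, flush_mb=128)

print(f"usable space: {usable} TB")
print(f"regions:      {regions}")
print(f"memstore RAM: {ram_gb} GB")
```

With the quoted inputs this gives 2T usable, 100 regions, and 12.5GB of memstore, which rounds to the ~13GB mentioned in the message.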

Re: Wrong input split locations after enabling reverse DNS

2012-12-17 Thread Stack
On Sun, Dec 16, 2012 at 1:16 AM, Robert Dyer psyb...@gmail.com wrote: I recently enabled reverse DNS on my test cluster. Now when I run a MR job, the HBase input split locations are all adding a period to the end. For example: /default-rack/foo-1. /default-rack/foo-2. Yet the machine

Roll of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
Hi All, I am trying to figure out the exact role of hbase.tmp.dir in HBase but I could not find any detailed reference on the HBase wiki or in the mailing list archives. Can anybody tell me for which purpose hbase.tmp.dir is used? Is it a comma separated value that can take multiple directories? Any

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread Nick Dimiduk
This directory is used by the RegionServers during compactions to store intermediate data. See: $ git grep 'hbase.tmp.dir' hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java: private final static String CONF_TMP_DIR = "hbase.tmp.dir";

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread Stack
In the refguide we repeat the content of hbase-default.xml: http://hbase.apache.org/book.html#hbase.tmp.dir What Nick said, plus it's used to keep all data when doing standalone HBase. We should amend the doc to say that you cannot do a comma-delimited list? St.Ack On Mon, Dec 17, 2012 at 3:19 PM, anil gupta

Re: HBaseClient.call() hang

2012-12-17 Thread Azuryy Yu
Don't increase the RS timeout to avoid this issue. What is your block size? And can you paste your JVM options here? I also hit a long GC problem, but I tuned the JVM options and it works very well now. On Tue, Dec 18, 2012 at 1:18 AM, Bryan Keller brya...@gmail.com wrote: It seems there was a

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
Hi Stack and Nick, Thanks for the reply. @Stack: I went through the following link http://hbase.apache.org/book.html#hbase.tmp.dir before posting my query. hbase.tmp.dir Temporary directory on the local filesystem. Change this setting to point to a location more permanent than '/tmp' (The '/tmp'

Re: Wrong input split locations after enabling reverse DNS

2012-12-17 Thread Robert Dyer
That's what I thought too. Except I am running 0.94.2 and this fix was released in 0.90.4. On Mon, Dec 17, 2012 at 5:11 PM, Stack st...@duboce.net wrote: On Sun, Dec 16, 2012 at 1:16 AM, Robert Dyer psyb...@gmail.com wrote: I recently enabled reverse DNS on my test cluster. Now when I run

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread Nick Dimiduk
On Mon, Dec 17, 2012 at 5:20 PM, anil gupta anilgupt...@gmail.com wrote: @Nick: I am using HBase 0.92.1, CompactionTool.java is part of HBase 0.96 as per https://issues.apache.org/jira/browse/HBASE-7253. Fair enough; I grepped against trunk. I have 10 disks on my slave node that will

Re: Wrong input split locations after enabling reverse DNS

2012-12-17 Thread Jean-Daniel Cryans
Maybe TableInputFormatBase.getSplits is missing something similar to HBASE-4109? J-D On Mon, Dec 17, 2012 at 5:26 PM, Robert Dyer psyb...@gmail.com wrote: That's what I thought too. Except I am running 0.94.2 and this fix was released in 0.90.4. On Mon, Dec 17, 2012 at 5:11 PM, Stack

Re: Wrong input split locations after enabling reverse DNS

2012-12-17 Thread Robert Dyer
Seems plausible. A simple grep reveals this: mapreduce/TableInputFormatBase.java: hostName = DNS.reverseDns(ipAddress, this.nameServer); which is not doing the filtering that HBASE-4109 does. Would this typically be filed as a new issue or brought up in comments on the closed issue? On
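
[Editor's sketch] The split locations reported earlier in this thread end in a period (foo-1.), which is how reverse DNS typically returns fully qualified names. Assuming the missing filtering amounts to dropping that trailing period, the idea looks roughly like the Python below. This is an illustration only, not the HBASE-4109 patch and not HBase code; the function name is hypothetical.

```python
# Illustration of trailing-period filtering on a reverse DNS result.
# Assumption: the filtering in question simply drops one trailing ".".

def strip_trailing_dot(hostname: str) -> str:
    """Return the hostname without a single trailing period, if present."""
    return hostname[:-1] if hostname.endswith(".") else hostname

for name in ("foo-1.", "foo-2.", "foo-3"):
    print(repr(name), "->", repr(strip_trailing_dot(name)))
```

Names without a trailing period pass through unchanged, so applying the filter unconditionally is harmless.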

Re: Wrong input split locations after enabling reverse DNS

2012-12-17 Thread Jean-Daniel Cryans
New issue, the other one is too old. Thx! J-D On Mon, Dec 17, 2012 at 6:39 PM, Robert Dyer rd...@iastate.edu wrote: Seems plausible. A simple grep reveals this: mapreduce/TableInputFormatBase.java: hostName = DNS.reverseDns(ipAddress, this.nameServer); which is not doing the

Re: Roll of hbase.tmp.dir in HBase

2012-12-17 Thread Harsh J
A distributed mode of HBase does not make use of the hbase.tmp.dir in any way. It simply leverages the DataNode's ability to scale over multiple disks and leaves the dirty work to it. Makes sense to be parallelized for beefier standalone instances, but I wonder who uses those and how it may even

Re: HBase 0.94 security configurations

2012-12-17 Thread Bob Futrelle
Thanks for your quick reply. On Mon, Dec 17, 2012 at 11:25 PM, Jimmy Xiang jxi...@cloudera.com wrote: Have you tried IPv4? I can disable IPv6 in Mountain Lion, but all my communication is *within* my own machine, so I don't understand why I'd be messing with IP, since nothing I'm doing is

Re: Role of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
Hi All, Thanks a lot for your helpful inputs. On Mon, Dec 17, 2012 at 8:04 PM, Harsh J ha...@cloudera.com wrote: You're correct - I spoke with only user-data in mind. On Tue, Dec 18, 2012 at 8:52 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: IIRC ZK's data will still go there if HBase

Re: Role of hbase.tmp.dir in HBase

2012-12-17 Thread anil gupta
FYI, I corrected the typo of Roll to Role. Sorry. On Mon, Dec 17, 2012 at 11:18 PM, anil gupta anilgupt...@gmail.com wrote: Hi All, Thanks a lot for your helpful inputs. On Mon, Dec 17, 2012 at 8:04 PM, Harsh J ha...@cloudera.com wrote: You're correct - I spoke with only user-data in