Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Harold Lim
Hi Ted, I iostat my region server and it seems that there is an imbalance in the read requests of the disks.
Device:   tps      MB_read/s   MB_wrtn/s   MB_read   MB_wrtn
xvdap1    0.40     0.00        0.00        0         0
xvdb      429.00   11.65

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Harold Lim
Hi Ted, For some reason, when I try the forked version of YCSB, I can't seem to launch more than 10 threads. I start getting the following errors: 11/06/02 02:17:35 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=xx:2181 sessionTimeout=18 watcher=hconnection

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Ted Dunning
Zookeeper has an internal limit on number of connections. Which version of hbase are you running? On Wed, Jun 1, 2011 at 11:20 PM, Harold Lim rold...@yahoo.com wrote: Hi Ted, For some reason, when I try the forked version of YCSB, I can't seem to launch more than 10 threads. I start getting

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Harold Lim
I'm running HBase 0.90.2. -Harold --- On Thu, 6/2/11, Ted Dunning tdunn...@maprtech.com wrote: From: Ted Dunning tdunn...@maprtech.com Subject: Re: How to improve HBase throughput with YCSB? To: user@hbase.apache.org Date: Thursday, June 2, 2011, 2:34 AM Zookeeper has an internal limit on

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Ted Dunning
Yeah.. there is a bug on that. I am spacing the number right now. And I have to run. On Wed, Jun 1, 2011 at 11:42 PM, Harold Lim rold...@yahoo.com wrote: I'm running HBase 0.90.2. -Harold --- On Thu, 6/2/11, Ted Dunning tdunn...@maprtech.com wrote: From: Ted Dunning

Re: Problem starting up HBase in pseudo distributed mode

2011-06-02 Thread Hari Sreekumar
This got fixed when I removed the 127.0.1.1 entry in /etc/hosts and moved the entries corresponding to it to 127.0.0.1. (http://permalink.gmane.org/gmane.comp.java.hadoop.hbase.user/18878) Hari On Tue, May 31, 2011 at 12:43 AM, Sean Bigdatafun sean.bigdata...@gmail.com wrote: Hi Hari, I am

Problem in adding region server

2011-06-02 Thread praveenesh kumar
hello guys..!! I am a newbie in Hbase. I am trying to setup Hbase cluster on top of my hadoop cluster. My hadoop version is 0.20.2 and I am trying to use Hbase version 0.20.6 Are they compatible with each other ? I am able to run Hbase on my single node.. But whenever I am trying to add my

Re: mslab enabled jvm crash

2011-06-02 Thread Wayne
I have finally been able to spend enough time to digest/test all recommendations and get this under control. I wanted to thank Stack, Jack Levin, and Ted Dunning for their input. Basically our memory was being pushed to the limit and the JVM does not like/can not handle this. We are successfully

Re: mslab enabled jvm crash

2011-06-02 Thread Jeff Whiting
Is there any information from this thread that we should make sure gets into the hbase book? It seems like Wayne went through a lot of work to get good performance and it would be nice if all the information he gleaned from the community were recorded somewhere. If it doesn't make sense to put

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Stack
It looks like you are managing zk yourself? Default is that zk only allows 10 connections. Up it to 1000 for now. The setting is maxClientCnxns. St.Ack On Wed, Jun 1, 2011 at 11:42 PM, Harold Lim rold...@yahoo.com wrote: I'm running HBase 0.90.2. -Harold --- On Thu, 6/2/11, Ted Dunning

RE: Problem in adding region server

2011-06-02 Thread Buttler, David
Don't use HBase 0.20.6. Use the current release (0.90.3) Did you configure hbase for distributed operation? What command are you using to start the region server? Are you manually starting region servers individually, or are you running the start-hbase.sh command from your hbase master

Re: Problem in adding region server

2011-06-02 Thread praveenesh kumar
So you mean to say that hadoop 0.20.6 is older version ??? Does it not work well with hadoop 0.20.2 ?? On Thu, Jun 2, 2011 at 9:04 PM, Buttler, David buttl...@llnl.gov wrote: Don't use HBase 0.20.6. Use the current release (0.90.3) Did you configure hbase for distributed operation? What

Re: Problem in adding region server

2011-06-02 Thread Stack
No. Read the manual: http://hbase.apache.org/book/notsoquick.html St.Ack On Thu, Jun 2, 2011 at 8:38 AM, praveenesh kumar praveen...@gmail.com wrote: So you mean to say that hadoop 0.20.6 is older version ??? Does it not work well with hadoop 0.20.2 ?? On Thu, Jun 2, 2011 at 9:04 PM,

Re: mslab enabled jvm crash

2011-06-02 Thread Erik Onnen
I'd be particularly interested how you guys came to the conclusion for increasing block size and how you arrived at the size you chose. For example, what metrics were you looking at that indicated the block size was too small and what tests did you run to arrive at 256k for the correct size?

Re: mslab enabled jvm crash

2011-06-02 Thread Stack
Thanks for writing back to the list Wayne. Hopefully this message hits you before the next CMF does. Would you mind pasting your final JVM args and any other configs you think one of us could use writing up your war story for the 'book' as per Jeff Whiting's suggestion? Good stuff, St.Ack On

Re: mslab enabled jvm crash

2011-06-02 Thread Wayne
Our storefileindex was pushing 3g. We used the hfile tool to see that we had very large keys (50-70 bytes) and small values (5-7 bytes). Jack pointed me to a great Jira about this: https://issues.apache.org/jira/browse/HBASE-3551 . We HAD to increase from the default and we picked 256k to reduce
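
A rough illustration of why a larger block size shrinks the store file index: the HFile block index holds roughly one entry per data block (the block's first key plus a little bookkeeping), so the entry count falls in proportion to the block size. The sketch below is back-of-the-envelope arithmetic only. The 1 TiB data volume and the 12 bytes of per-entry overhead are assumptions picked for illustration, not figures from this thread; the 60-byte key is the midpoint of the 50-70 byte keys mentioned above, and 64 KB is HBase's default block size.

```python
def index_bytes(data_bytes: int, block_bytes: int, key_bytes: int,
                entry_overhead: int = 12) -> int:
    """Estimate block-index size: one (key + overhead) entry per data block.

    entry_overhead is an assumed allowance for the offset/size fields.
    """
    blocks = -(-data_bytes // block_bytes)  # ceiling division
    return blocks * (key_bytes + entry_overhead)


DATA = 1 << 40               # hypothetical: 1 TiB of store file data
KEY = 60                     # midpoint of the 50-70 byte keys in the thread
DEFAULT_BLOCK = 64 * 1024    # HBase default HFile block size
LARGER_BLOCK = 256 * 1024    # the value chosen in the thread

small_blocks_index = index_bytes(DATA, DEFAULT_BLOCK, KEY)
large_blocks_index = index_bytes(DATA, LARGER_BLOCK, KEY)

print(f"64k blocks:  {small_blocks_index / 2**20:.0f} MiB of index")
print(f"256k blocks: {large_blocks_index / 2**20:.0f} MiB of index")
print(f"reduction:   {small_blocks_index / large_blocks_index:.1f}x")
```

The trade-off is that each random read has to load a larger block, so a bigger block size suits workloads where index memory is the tighter constraint.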

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
I'm getting a lot of this on the slave that is doing the latest adds: 2011-06-02 00:33:05,231 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region urlhashv4,7F8537883DDF5230B10AA2CB13182505,1306992752074.71ab4c4527ce76d777f78943a86009d2. has too many store files; delaying flush up

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
Also, notice the output of my copy, where it's now stuck on the final line. The first column is the number of rows, the second column is the key value: total:90460 7FECD7A2D11FFD850FDC7CA899CA3138 total:90470 7FF0787C8EC28FF760BF0E38BB1F95C8 total:90480 7FF418DDFCB134EFA7F1304762EA4A20

Failure to Launch: hbase-0.90.3 with hadoop-0.20.203.0

2011-06-02 Thread Ratner, Alan S (IS)
I just installed HBase-0.90.3 with Hadoop-0.20.203.0 and I get some sort of zookeeper hiccup followed by an End Of File error. Any suggestions on what I am doing wrong? Alan MASTER LOG FILE Thu Jun 2 10:29:25 EDT 2011 Starting master on hadoop1 ulimit -n 1024 2011-06-02 10:29:26,218 INFO

Re: Failure to Launch: hbase-0.90.3 with hadoop-0.20.203.0

2011-06-02 Thread Jean-Daniel Cryans
The zk stuff is ok, it's just that hadoop1 doesn't have a zk server but hadoop2 does (so review your configuration). You need to replace the hadoop jar since right now you have /hadoop-core-0.20-append-r1056497.jar Like the doc says http://hbase.apache.org/book.html#hadoop It is critical that

Re: How to improve HBase throughput with YCSB?

2011-06-02 Thread Harold Lim
Hi St.Ack, In my setup, Zk is being managed by HBase. I'll try increasing maxClientCnxns. Thanks, Harold --- On Thu, 6/2/11, Stack st...@duboce.net wrote: From: Stack st...@duboce.net Subject: Re: How to improve HBase throughput with YCSB? To: user@hbase.apache.org Date: Thursday, June

Re: mslab enabled jvm crash

2011-06-02 Thread Wayne
JVM w/ 10g Heap settings below. Once we are bored with stability we will try to up the 65 to 70 which seems to be standard. -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=65 -XX:+CMSParallelRemarkEnabled -XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:MaxNewSize=128m
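
To make the 65 versus 70 trade-off concrete, the arithmetic can be sketched as follows. It assumes the "10g Heap" means a 10240 MB maximum heap and treats the old generation as the heap minus the 128 MB new generation from the flags above; real JVM sizing rounds these numbers, so the results are approximate. CMSInitiatingOccupancyFraction is the percentage of old-generation occupancy at which a concurrent collection starts.

```python
def cms_trigger_mb(heap_mb: float, new_gen_mb: float, fraction_pct: float) -> float:
    """Old-generation occupancy (MB) at which CMS starts a concurrent cycle."""
    old_gen_mb = heap_mb - new_gen_mb  # assumption: old gen = heap - new gen
    return old_gen_mb * fraction_pct / 100


HEAP_MB = 10 * 1024   # assumed -Xmx10g
NEW_GEN_MB = 128      # from -XX:NewSize / -XX:MaxNewSize above
OLD_GEN_MB = HEAP_MB - NEW_GEN_MB

for pct in (65, 70):
    trigger = cms_trigger_mb(HEAP_MB, NEW_GEN_MB, pct)
    headroom = OLD_GEN_MB - trigger
    print(f"fraction={pct}: CMS starts near {trigger:.1f} MB, "
          f"leaving about {headroom:.1f} MB of headroom")
```

The headroom is what the concurrent collector has to work with before the old generation fills and a concurrent mode failure forces a stop-the-world collection, which is why a lower fraction is the more conservative setting.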

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
Ok, I think I know why it gets stuck there. That's where the hole is in the original table. I skipped past the hole and it is off and running again. Knock on wood! Robert -Original Message- From: Robert Gonzalez [mailto:robert.gonza...@maxpointinteractive.com] Sent: Thursday, June

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
Here's another clue: the process is taking up lots of cpu time, like it's in some kind of loop, but the output indicates that it's stuck on the same section. Robert -Original Message- From: Robert Gonzalez [mailto:robert.gonza...@maxpointinteractive.com] Sent: Thursday, June 02, 2011

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
And more info. The copy dies on a regionserver failure. Here is the exception when it dies: 2011-06-02 13:29:07,546 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=c1-s06.atxd.maxpointinteractive.com,60020,1306860799744, load=(requests=0,

Re: wrong region exception

2011-06-02 Thread Stack
So, cluster is OK after the below crash? Regions come up fine on new servers and .META. is fine? Below is interesting in that we failed a split because we could not write an edit to the .META. (how many handlers are you running with? And what is going on on the .META. server at around this time?

RE: wrong region exception

2011-06-02 Thread Robert Gonzalez
First a clarification: everything is happening within the context of a single cluster of 55 machines, there is no inter-cluster copying. I restarted c1-s06, the regionserver that died, and by the way, all new data seems to be going to this server first. Is there a reason for this? This is

Re: wrong region exception

2011-06-02 Thread Stack
It can't get to c1-s19. It times out trying to connect. Can you figure out what's up w/ that? On always going to the same server, is this a case of http://hbase.apache.org/book.html#timeseries? Or perhaps, regions split and go elsewhere but distcp is writing from the src in order? St.Ack On Thu,

RE: How to efficiently join HBase tables?

2011-06-02 Thread Michael Segel
Not to beat a dead horse, but I thought a bit more about the problem. If you want to do this all in HBase using a M/R job... Let's define the following sample query: SELECT * FROM A, B WHERE A.a = B.a AND A.b = B.b AND A.c = xxx AND A.d = yyy AND B.e = zzz. So our join
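
The message is cut off here, so the following is not necessarily the approach it goes on to describe. It is only a minimal in-memory sketch of what the sample query computes: filter each table on its own predicates, then match rows on the composite key (a, b). The function name and the sample rows are hypothetical. In a M/R job the same (a, b) tuple would typically serve as the shuffle key so that matching rows meet in one reducer.

```python
from collections import defaultdict


def join_a_b(rows_a, rows_b, c, d, e):
    """Equi-join A and B on (a, b) after applying the per-table filters."""
    # Build side: index the filtered rows of B by the join key (a, b).
    b_index = defaultdict(list)
    for row in rows_b:
        if row["e"] == e:
            b_index[(row["a"], row["b"])].append(row)

    # Probe side: filter A, then look up matching B rows by the same key.
    joined = []
    for row in rows_a:
        if row["c"] == c and row["d"] == d:
            for match in b_index.get((row["a"], row["b"]), []):
                joined.append((row, match))
    return joined


# Hypothetical sample rows, for illustration only.
rows_a = [
    {"a": 1, "b": 1, "c": "xxx", "d": "yyy"},
    {"a": 1, "b": 2, "c": "xxx", "d": "yyy"},
    {"a": 2, "b": 1, "c": "other", "d": "yyy"},
]
rows_b = [
    {"a": 1, "b": 1, "e": "zzz"},
    {"a": 1, "b": 2, "e": "nope"},
    {"a": 2, "b": 1, "e": "zzz"},
]

result = join_a_b(rows_a, rows_b, c="xxx", d="yyy", e="zzz")
for a_row, b_row in result:
    print(a_row, b_row)
```

Filtering before the join keeps the indexed side small, which matters more in a distributed job where every surviving row is shuffled across the network.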

Timeouts on gets and puts

2011-06-02 Thread Douglas Campbell
Is there an easy way to control this? Some magic setting in the HBaseConfiguration passed to HTable? The problem we're facing is that for each put and get we block for however long it takes to do the put. The norm put/get is fast or at least impressive to me and yet some puts take 10sec +

RE: Timeouts on gets and puts

2011-06-02 Thread Doug Meil
1) You probably want to upgrade to a more recent version of HBase 2) You probably want to read this: http://hbase.apache.org/book.html#performance Not knowing anything about your cluster size or table design, Put delays can be exacerbated by a number of things. 3) As for the time limit, I

Question from HBase book: HBase currently does not do well with anything about two or three column families

2011-06-02 Thread Leif Wickland
I was reading through the HBase book and came across the following in 6.2. On the number of column families (http://hbase.apache.org/book.html#number.of.cfs): HBase currently does not do well with anything about two or three column families so keep the number of column families in your

Re: Question from HBase book: HBase currently does not do well with anything about two or three column families

2011-06-02 Thread Stack
On Thu, Jun 2, 2011 at 2:40 PM, Leif Wickland leifwickl...@gmail.com wrote: Do you think I should look for ways to reduce the number of CFs? If you can, yes (The book is current -- the work on making hbase do better with more CFs is yet to be done). Good luck, St.Ack

Re: Question from HBase book: HBase currently does not do well with anything about two or three column families

2011-06-02 Thread Vidhyashankar Venkataraman
Is there a JIRA for issuing flushes and compactions on a per column family basis? On 6/2/11 2:48 PM, Stack st...@duboce.net wrote: On Thu, Jun 2, 2011 at 2:40 PM, Leif Wickland leifwickl...@gmail.com wrote: Do you think I should look for ways to reduce the number of CFs? If you can, yes

Re: Question from HBase book: HBase currently does not do well with anything about two or three column families

2011-06-02 Thread Jean-Daniel Cryans
https://issues.apache.org/jira/browse/HBASE-3149 for flushes, not sure about compactions. J-D On Thu, Jun 2, 2011 at 2:57 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: Is there a JIRA for issuing flushes and compactions on a per column family basis? On 6/2/11 2:48 PM, Stack

RE: Question from HBase book: HBase currently does not do well with anything about two or three column families

2011-06-02 Thread Doug Meil
Re: Is that still considered current? Do folks on the list generally agree with that guideline? Yes and yes. HBase runs better with fewer CFs. -Original Message- From: Leif Wickland [mailto:leifwickl...@gmail.com] Sent: Thursday, June 02, 2011 5:41 PM To: user@hbase.apache.org

follow up question on row key schema design

2011-06-02 Thread Sam Seigal
Hi, I am not able to find information regarding the algorithm that decides which region a particular row belongs to in an HBase cluster. Does the algorithm take into account the number of physical nodes ? Where can I find more details about it ? I went through the HBase book and the OpenTSDB
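
For context on the question, a minimal sketch of the general idea: HBase keeps rows sorted by row key, and each region covers a contiguous key range from its start key (inclusive) to the next region's start key (exclusive). Finding the region for a row is therefore a range lookup on the key alone, and the number of physical nodes does not enter into it. The code models that with a sorted list of start keys; the split points and the function name are hypothetical, and a real client resolves the region through the .META. table rather than a local list.

```python
import bisect


def find_region(start_keys, row_key):
    """Return the index of the region whose [start, next start) range holds row_key.

    start_keys must be sorted, and the first region starts at the empty key b"".
    """
    return bisect.bisect_right(start_keys, row_key) - 1


START_KEYS = [b"", b"g", b"n", b"t"]  # hypothetical split points, four regions

for key in (b"apple", b"g", b"monkey", b"zebra"):
    print(key, "-> region", find_region(START_KEYS, key))
```

Python compares bytes objects lexicographically by unsigned byte value, which is the same ordering HBase applies to row keys. Which server hosts a given region is a separate assignment decision made by the master.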

problem about hbase and hadoop version

2011-06-02 Thread 吕鹏
Hi, I want to build an HBase cluster in the production environment. Which version of HBase and Hadoop is recommended? The Apache one or CDH3? Is CDH3 an open source version? Some articles say that if the two versions do not match, HBase will lose data! Thanks, Peng

HBase question

2011-06-02 Thread King JKing
Dear all, I want to design a follower schema like Twitter's. I have 2 designs:
Design 1: userId { // rowkey
  followerId: time,
}
Design 2: [userId][followerId] { // rowkey
  time: time
}
I have 2 questions: 1. Does HBase support scan data of rowkey by column? 2. Which design is better? I think that
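
A minimal in-memory sketch of how Design 2 could be read back, assuming a composite row key made of two fixed-width ids. The 8-byte big-endian encoding, the function names, and the sample data are all assumptions for illustration; the message does not say how the ids are encoded. Because rows are stored sorted by key, all followers of one user sit next to each other and can be fetched with a prefix scan, modeled here with a sorted list and bisect instead of the HBase client API.

```python
import bisect


def make_key(user_id: int, follower_id: int) -> bytes:
    """Composite row key: 8-byte big-endian user id, then 8-byte follower id."""
    return user_id.to_bytes(8, "big") + follower_id.to_bytes(8, "big")


def followers_of(sorted_keys, user_id: int):
    """Prefix scan: collect follower ids from keys starting with the user prefix."""
    prefix = user_id.to_bytes(8, "big")
    start = bisect.bisect_left(sorted_keys, prefix)
    followers = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so the prefix range has ended
        followers.append(int.from_bytes(key[8:], "big"))
    return followers


# Hypothetical (user, follower) pairs.
edges = [(1, 20), (2, 7), (1, 5), (3, 1), (1, 300)]
table = sorted(make_key(user, follower) for user, follower in edges)

print(followers_of(table, 1))
print(followers_of(table, 4))
```

Fixed-width ids keep one user's prefix from matching the start of another user's key. With Design 1 the same data would live in a single wide row per user, which keeps a user's followers atomic but lets the row grow without bound for users with very many followers.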

REST API doesn't support checkAndPut

2011-06-02 Thread Henri Chenosky
It seems that HTable atomic operations (e.g., checkAndPut and checkAndDelete) are not supported by the current REST implementation. The RemoteHTable class throws an IOException when these methods are called. Is there any reason why these operations are unsupported in REST? Are there any plans to support

Hbase Web UI Interface on hbase 0.90.3 ?

2011-06-02 Thread praveenesh kumar
Hello guys. I just have installed hbase on my hadoop cluster. HMaster,HRegionServer,HQuorum Peer all are working fine.. as I can see these processes running through JPS. Is there any way to know which regionservers are running right and not ? I mean is there some kind of hbase web UI or anyway

Re: Hbase Web UI Interface on hbase 0.90.3 ?

2011-06-02 Thread Stack
See http://hbase.apache.org/book/book.html Let us know if there are holes in the setup it so we can plug them. St.Ack On Thu, Jun 2, 2011 at 10:12 PM, praveenesh kumar praveen...@gmail.com wrote: Hello guys. I just have installed hbase on my hadoop cluster. HMaster,HRegionServer,HQuorum

Re: HBase question

2011-06-02 Thread Jean-Daniel Cryans
I have 2 question: 1. Does HBase support scan data of rowkey by column? You mean secondary indexes? No: http://hbase.apache.org/book.html#secondary.indices 2. Which design is better? I think that design 2 is better when user have large amount of follower. I cover a bunch of designs in this

Re: problem about hbase and hadoop version

2011-06-02 Thread Jean-Daniel Cryans
This is discussed in the book: http://hbase.apache.org/book.html#hadoop J-D On Thu, Jun 2, 2011 at 6:09 PM, 吕鹏 lvpengd...@gmail.com wrote: Hi I want to build a hbase cluster in the production environment. Which version of hbase and hadoop is recommended? The apache or the cdh3? Is cdh3 a

Re: Hbase Web UI Interface on hbase 0.90.3 ?

2011-06-02 Thread lohit
2011/6/2 praveenesh kumar praveen...@gmail.com Hello guys. I just have installed hbase on my hadoop cluster. HMaster,HRegionServer,HQuorum Peer all are working fine.. as I can see these processes running through JPS. Is there any way to know which regionservers are running right and not ?