Re: Error: It took too long to wait for the table

2010-11-23 Thread Hari Sreekumar
Update: I see that it usually works when I wait ~5 mins and retry 2-3 times. On Wed, Nov 24, 2010 at 10:08 AM, Hari Sreekumar wrote: > Hi, > > What is the cause of this exception? Is there a timeout value that can > be modified to avoid this error? Does this error mean any problem with my > se

Error: It took too long to wait for the table

2010-11-23 Thread Hari Sreekumar
Hi, What is the cause of this exception? Is there a timeout value that can be modified to avoid this error? Does this error mean any problem with my setup, or is it normal to get these errors? In that case, how can I drop this table without messing up the cluster? My table has ~40 columns and

Re: Xceiver problem

2010-11-23 Thread Lucas Nazário dos Santos
Thanks everybody for all replies. I tweaked the parameters as suggested and the error is gone. Lucas On Thu, Nov 18, 2010 at 2:32 PM, Andrew Purtell wrote: > I can get about 1000 regions per node operating comfortably on a 5 node > c1.xlarge EC2 cluster using: > > Somewhere out of /etc/rc.loca
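[Editor's note: the parameter list is cut off above. One setting widely recommended for xceiver errors at the time, not necessarily the exact one used here, was the DataNode transceiver limit in hdfs-site.xml; a minimal sketch (the property name really is spelled "xcievers"):

    <!-- hdfs-site.xml on each DataNode; requires a DataNode restart. -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <value>4096</value> <!-- the old default of 256 is far too low for HBase -->
    </property>

The truncated /etc/rc.local part likely covered OS-level limits, but those details are not recoverable from the quote.]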

Re: question about meta data query intensity

2010-11-23 Thread Jean-Daniel Cryans
0.89 is ok, 0.90 is still going through the RCs process. I was asking because it's a lot different in the new master. With 10 minutes some things will happen more slowly... like cleaning the split parents. Also after a region server dies, it will take some time until all the regions are assigned d

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
on 0.89 still... On Tue, Nov 23, 2010 at 4:28 PM, Jack Levin wrote: > if I set it higher, say to 10 minutes, will there be any potential ill effects? > > -Jack > > On Tue, Nov 23, 2010 at 4:24 PM, Jean-Daniel Cryans > wrote: >> Jack, you didn't upgrade to 0.90 yet right? Then there's a master >

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
if I set it higher, say to 10 minutes, will there be any potential ill effects? -Jack On Tue, Nov 23, 2010 at 4:24 PM, Jean-Daniel Cryans wrote: > Jack, you didn't upgrade to 0.90 yet right? Then there's a master > background thread that scans .META. every minute... but with that > amount of row

Re: question about meta data query intensity

2010-11-23 Thread Jean-Daniel Cryans
Jack, you didn't upgrade to 0.90 yet, right? Then there's a master background thread that scans .META. every minute... but with that amount of rows it's probably best to set that much higher. The config's name is hbase.master.meta.thread.rescanfrequency. You should also take a look at your master lo
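[Editor's note: for reference, a minimal hbase-site.xml sketch of the setting being discussed, assuming the 10-minute value from this thread; the value is in milliseconds, and the every-minute default mentioned above corresponds to 60000:

    <!-- hbase-site.xml on the master -->
    <property>
      <name>hbase.master.meta.thread.rescanfrequency</name>
      <value>600000</value> <!-- 10 minutes; default 60000 (1 minute) -->
    </property>
]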

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
It's taking some queries, but at ~3% of the rate we expect to give it later. -Jack On Tue, Nov 23, 2010 at 4:19 PM, Stack wrote: > To be clear, the cluster is not taking queries and .META. is still > being hit at a rate of 6k/second? > St.Ack > > On Tue, Nov 23, 2010 at 4:15 PM, Jack Levin wrote:

Re: question about meta data query intensity

2010-11-23 Thread Stack
To be clear, the cluster is not taking queries and .META. is still being hit at a rate of 6k/second? St.Ack On Tue, Nov 23, 2010 at 4:15 PM, Jack Levin wrote: > its requests=6204 ... but we have not been loading the cluster with > queries at all. I see that CPU is about 35% used vs other boxes at > us

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
Its requests=6204 ... but we have not been loading the cluster with queries at all. I see that CPU is about 35% used vs. other boxes at user CPU of 10% or so... So it's really the CPU load that worries me more than the IO. -Jack On Tue, Nov 23, 2010 at 1:55 PM, Stack wrote: > On Tue, Nov 23, 2010 at 11:06 AM,

Re: DemoClient.cpp modifications and question about thrift client / HBase 0.89.20100924

2010-11-23 Thread Stack
On Sun, Nov 21, 2010 at 12:04 AM, Saptarshi Guha wrote: > 0) The DemoClient.cpp (versions, see [a]) does not compile on FC8 with > thrift 0.5. > I modified DemoClient.cpp to make it work. I've attached a diff. (Compiles > on OS > X Snow Leopard and FC 8 (EC2 instance)) > [a] > HBase Change Log > R

Re: HBase not scaling well

2010-11-23 Thread Stack
Thanks for updating the list with your findings Hari. A few of us were baffled there for a while. St.Ack On Tue, Nov 23, 2010 at 9:23 AM, Hari Shankar wrote: > Hi All, > >         I think there was some problem with our earlier setup. When I > tested the 3-node setup, the three machines had diff

Re: question about meta data query intensity

2010-11-23 Thread Stack
On Tue, Nov 23, 2010 at 11:06 AM, Jack Levin wrote: > it's REST, and generally no long-lived clients; yes, caching of regions > helps, however we expect long-tail hits that will be uncached, which > may stress out the meta region, that being said, is it possible to create > affinity and nail the meta region i

Re: managing 5-10 servers

2010-11-23 Thread Jean-Daniel Cryans
I wish I could do a dump of my memory into an ops guide to HBase, but currently I don't think there's such a writeup. What can go wrong... again it depends on your type of usage. With a MR-heavy cluster, it's usually very easy to drive the IO wait through the roof and then you'll end up with GC pa

Re: managing 5-10 servers

2010-11-23 Thread S Ahmed
Are there any writeups on what things to look for? What are some of the things that usually go wrong? Or is that an unfair question :) On Tue, Nov 23, 2010 at 4:22 PM, Jean-Daniel Cryans wrote: > Constant hand-holding no, constant monitoring yes. Do set up Ganglia > and preferably Nagios. Then it

Re: managing 5-10 servers

2010-11-23 Thread Jean-Daniel Cryans
Constant hand-holding no, constant monitoring yes. Do set up Ganglia and preferably Nagios. Then it depends what you're planning to do with your cluster... here we have 2x 20 machines in production; the one that serves live traffic is pretty much doing its own thing by itself (although I keep a gan
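[Editor's note: as a starting point for the Ganglia setup, HBase of this vintage ships a conf/hadoop-metrics.properties; a minimal sketch, where ganglia-host:8649 is a placeholder for your gmond address:

    # conf/hadoop-metrics.properties -- push HBase and JVM metrics to Ganglia
    hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    hbase.period=10
    hbase.servers=ganglia-host:8649
    jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
    jvm.period=10
    jvm.servers=ganglia-host:8649
]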

managing 5-10 servers

2010-11-23 Thread S Ahmed
Hi, How much of a guru do you have to be to keep, say, 5-10 servers humming? I'm a 1-man shop, and I dream of developing a web application, and scaling will be a core part of the application. Is it feasible for a 1-man operation to manage a 5-10 server HBase cluster? Is it something that requires

RE: question about meta data query intensity

2010-11-23 Thread Jonathan Gray
Not today. If this is of serious concern to you, I'd say drop some comments into HBASE-3171 and I can look at doing that sooner than later on 0.92. But that is still a medium-term fix as it'll take a little time to stabilize that big of a change. And I'd like to drop ROOT first and stabilize

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
It's REST, and generally no long-lived clients. Yes, caching of regions helps; however, we expect long-tail hits that will be uncached, which may stress out the meta region. That being said, is it possible to create affinity and nail the meta region to a beefy server or set of beefy servers? -Jack On Tue, N

RE: question about meta data query intensity

2010-11-23 Thread Jonathan Gray
Are you going to have long-lived clients? How are you accessing HBase? REST or Thrift gateways? Caching of region locations should help significantly so that it's only a bottleneck right at the startup of the cluster/gateways/clients. > -Original Message- > From: Jack Levin [mailto:m

Re: question about meta data query intensity

2010-11-23 Thread Jack Levin
My concern is that we plan to have 120 regionservers with 1000 regions each, so the hits to meta could be quite intense. (Why so many regions? We are storing 1 petabyte of image data in HBase.) -Jack On Tue, Nov 23, 2010 at 9:50 AM, Jonathan Gray wrote: > It is possible that it could be

Re: unable to disable WAL

2010-11-23 Thread Jean-Daniel Cryans
The optional syncer is a background thread in the region server that syncs the HLog every few seconds. The fact that it took 51 seconds to do it could mean two things: - Long GC pause, meaning that the call to sync was happening while the JVM was paused. The actual call without the pause could ha
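[Editor's note: one way to verify the long-GC-pause theory is to enable GC logging on the region servers and match pause times against the "logFlusher took ..." messages; a minimal conf/hbase-env.sh sketch, assuming the Sun JVM of that era and a hypothetical log path:

    # conf/hbase-env.sh -- log GC events with timestamps on each region server
    export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
      -XX:+PrintGCTimeStamps -Xloggc:/var/log/hbase/gc-hbase.log"
]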

Re: question about meta data query intensity

2010-11-23 Thread Ted Yu
We're facing the loss of /hbase/root-region-server ZNode: 2010-11-23 17:49:11,288 DEBUG org.apache.zookeeper.ClientCnxn: Reading reply sessionid:0x12c79bef0c10012, packet:: clientPath:null serverPath:null finished:false header:: 160,4 replyHeader:: 160,62,-101 request:: '/hbase/root-region-serve

RE: question about meta data query intensity

2010-11-23 Thread Jonathan Gray
It is possible that it could be a bottleneck but usually is not. Generally production HBase installations have long-lived clients, so the client-side caching is sufficient to reduce the amount of load to META (virtually 0 when the cluster is at steady-state with no region movement). For MapReduce, y
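[Editor's note: to illustrate the caching point, the client caches region locations per connection, so a long-lived client that reuses one Configuration (and the HTables built on it) pays the .META. lookup cost only once per region. A minimal sketch against the 0.89/0.90 client API; the table, row keys, and class name are placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class LongLivedClient {
        // One shared Configuration: region locations resolved via .META.
        // are cached on the connection tied to it.
        private static final Configuration CONF = HBaseConfiguration.create();

        public static void main(String[] args) throws Exception {
            HTable table = new HTable(CONF, "mytable");
            // The first get triggers a .META. lookup; later gets for rows
            // in the same region are answered from the local cache.
            Result first = table.get(new Get(Bytes.toBytes("row1")));
            Result second = table.get(new Get(Bytes.toBytes("row2")));
            System.out.println("row1 empty? " + first.isEmpty()
                + ", row2 empty? " + second.isEmpty());
        }
    }
]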

Re: which HBase version to use?

2010-11-23 Thread Gary Helmling
Friso, If you're running the SU HBase on cdh3b3 hadoop, make sure the SU branch includes the patch for HBASE-3194 or be sure to apply it yourself. Without it you'll get compilation errors due to the security changes in cdh3b3. Gary On Tue, Nov 23, 2010 at 12:26 AM, Friso van Vollenhoven < fvan

unable to disable WAL

2010-11-23 Thread Geoff Hendrey
Hi - I've noticed that even though my mapred job disabled the WAL, we still see HLog flushing. In my mapred job I do: setWriteToWAL(false). However, I still see this in region server logs: "logFlusher took 51132ms optional sync'ing hlog" I'm observing a pattern of cascading failur
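[Editor's note: for reference, a minimal sketch of disabling the WAL per Put in a 0.89-era mapper, not necessarily the poster's exact job; family, qualifier, and class names are placeholders. Note this only skips WAL appends for these writes; it does not stop the region server's log flusher thread:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Hypothetical mapper: writes each input line as a row, WAL disabled.
    public class NoWalMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
      @Override
      protected void map(LongWritable key, Text line, Context ctx)
          throws IOException, InterruptedException {
        Put put = new Put(Bytes.toBytes(line.toString()));
        put.setWriteToWAL(false); // skip the write-ahead log for this Put
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        ctx.write(new ImmutableBytesWritable(put.getRow()), put);
      }
    }
]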

question about meta data query intensity

2010-11-23 Thread Jack Levin
Hello, I am curious if there is a potential bottleneck in .META. ownership by a single region server. Is it possible (safe) to split the meta region into several? -Jack

Re: HBase not scaling well

2010-11-23 Thread Hari Shankar
Hi All, I think there was some problem with our earlier setup. When I tested the 3-node setup, the three machines had different OSes (Ubuntu Server, Ubuntu Desktop, and CentOS 5.4). Now, while checking the 4-node and 6-node setups, I had all 3 machines with exactly the same OS, Java version, etc. I a

Re: LazyFetching of Row Results in MapReduce

2010-11-23 Thread Lars George
Hi fnord, See https://issues.apache.org/jira/browse/HBASE-1537 and https://issues.apache.org/jira/browse/HBASE-2673 for details. Not sure when that went in though but you should have that available, no? Lars On Tue, Nov 23, 2010 at 2:48 PM, fnord 99 wrote: > Hi, > > our machines have 24GB of RA
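[Editor's note: HBASE-1537 is intra-row scanning, exposed through Scan.setBatch(); a minimal sketch of reading a very wide row in column-sized chunks, where the table and family names are placeholders:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WideRowScan {
        public static void main(String[] args) throws Exception {
            HTable table = new HTable(HBaseConfiguration.create(), "widetable");
            Scan scan = new Scan();
            scan.addFamily(Bytes.toBytes("cf"));
            scan.setBatch(1000); // at most 1000 columns per Result, so one
                                 // huge row arrives as several small chunks
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result chunk : scanner) {
                    System.out.println(chunk.size() + " KeyValues in this chunk");
                }
            } finally {
                scanner.close();
            }
        }
    }
]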

Re: LazyFetching of Row Results in MapReduce

2010-11-23 Thread fnord 99
Hi, our machines have 24 GB of RAM (for 8 cores) and HBase gets 6 GB. The map jobs each have 768 MB of memory. Currently we're using CDH3b3. We'll definitely implement my idea of distributing the rows into multiple columns, similar to what Friso said. A comment from somebody who has really wide ro

Re: which HBase version to use?

2010-11-23 Thread Friso van Vollenhoven
Hi All, Thanks for all the feedback. Because I need a 'works right now' version, I am going to go for 0.89 with some patches applied (basically SU's version on top of CDH3b3 Hadoop), with a planned upgrade path to CDH3 when it reaches b4 or final (or any state that I have time to test on ou