HBase Shell available through Apache Zeppelin
I have contributed a feature that makes the HBase shell accessible from Apache Zeppelin. The main advantages are:

- Admins get quick access to the HBase shell through the browser.
- Sessions are saved, so you can log in and the current state of a triage or experiment is still available.
- On a similar note, standard recipes or sequences of commands can be saved and run quickly during future triage.

The easiest way to get access is to install and run Apache Zeppelin on the HBase master, or on any other machine where the HBase shell works. In our experience, the HBase interpreter is not a resource hog.

JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-651
Commit: https://github.com/apache/incubator-zeppelin/commit/1940388e3422b86a322fc82a0e7868ff25126804

Looking forward to feedback and suggestions for improvements.

Rajat Venkatesh
Engg. Lead
Qubole
Add keys to column family in HBase using Python
Dear HBase experts,

I have a Hadoop cluster with Hive and HBase installed along with other Hadoop components. I am currently exploring ways to automate a data-migration process from Hive to HBase in which new columns of data are added every so often.

I was able to create an HBase table from Hive and load data into it. Along the same lines, I tried to add new columns to the HBase table (from Hive) using the ALTER TABLE syntax and got the error message: ALTER TABLE cannot be used for a non-native table temp_testing.

As an alternative, I am also trying to do this programmatically using Python. I have explored the libraries HappyBase (https://happybase.readthedocs.org/en/latest/index.html) and starbase (http://pythonhosted.org//starbase/). These libraries provide functionality for creating and deleting tables and other features, but neither seems to provide an option to add a key to a column family.

Does anybody know of a better way of achieving this with Python, through libraries or other means?

Thanks in advance,
Manoj
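One point worth noting for the question above: HBase column families are schemaless, so "adding a key" (a new column qualifier) to an existing family does not require any ALTER; writing a cell to the new qualifier creates it. A minimal sketch with happybase follows. The host and table names are placeholders, a running HBase Thrift server is assumed (happybase talks to HBase over Thrift), and `qualify` / `write_new_column` are helper names invented for this example, not part of the happybase API.

```python
# Sketch, not a definitive recipe: HBase creates a new column qualifier
# inside an existing family simply when a cell is written to it.

def qualify(family, mapping):
    """Turn {qualifier: value} into the {b"family:qualifier": b"value"}
    dict that happybase's Table.put() expects."""
    return {"{}:{}".format(family, q).encode(): str(v).encode()
            for q, v in mapping.items()}

def write_new_column(host, table_name, row, family, mapping):
    """Write cells to a table; any qualifier that does not exist yet is
    created by the write itself. Not invoked here because it needs a live
    HBase Thrift server."""
    import happybase  # third-party: pip install happybase
    conn = happybase.Connection(host)
    try:
        conn.table(table_name).put(row, qualify(family, mapping))
    finally:
        conn.close()

# Example usage (requires a cluster; names are hypothetical):
# write_new_column("thrift-host", "temp_testing", b"row-1",
#                  "cf", {"new_column": "some value"})
```

Since columns are created on write, neither Hive's ALTER TABLE nor any schema call in happybase/starbase is needed for new columns within an existing family; only adding a new column *family* requires altering the table.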
Re: mapreduce job failure
thanks J-D as always

-----Original Message-----
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Tue, May 17, 2011 8:04 pm
Subject: Re: mapreduce job failure

400 regions a day is way too much, and in 0.20.6 there's also a high risk of collision once you get near the ten-thousands of regions. But that's most probably not your current issue.

That HDFS message 99% of the time means that the region server went into GC and, when it came back, the master had already moved the regions away. It should be pretty obvious in the logs. As to why the tasks get killed, it's probably related. And since you are running such an old release you have data loss, and if that happens on the .META. table then you lose metadata about the regions.

To help with GC issues, I suggest you read the multi-part blog post from Todd:
http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/

J-D

On Mon, May 16, 2011 at 2:08 PM, Venkatesh vramanatha...@aol.com wrote:

Thanks J-D

Using hbase-0.20.6, 49 node cluster. The map reduce job involves a full table scan (region size 4 gig). The job runs great for 1 week, then starts failing after 1 week of data accumulation (about 3000 regions); about 400 regions get created per day. Can you suggest any tunables at the HBase level or HDFS level?

Also, I have one more issue when region servers die. Errors below (any suggestion here is helpful as well):

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase_data_one_110425/.../compaction.dir/249610074/4534752250560182124 File does not exist. Holder DFSClient_-398073404 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

-----Original Message-----
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Fri, May 13, 2011 12:39 am
Subject: Re: mapreduce job failure

All that means is that the task stayed in map() for 10 minutes, blocked on something. If you were scanning an hbase table and didn't get a new row after 1 minute, then the scanner would expire. That's orthogonal though. You need to figure out what you're blocking on; add logging and try to jstack your Child processes, for example.

J-D

On Thu, May 12, 2011 at 7:21 PM, Venkatesh vramanatha...@aol.com wrote:

Hi

Using hbase-0.20.6. The mapreduce job started failing in the map phase (using an hbase table as input for the mapper); it ran fine for a week or so, starting with empty tables.

task tracker log:
Task attempt_201105121141_0002_m_000452_0 failed to report status for 600 seconds. Killing

Region server log:
2011-05-12 18:27:39,919 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -7857209327501974146 lease expired
2011-05-12 18:28:29,716 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1880)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2011-05-12 18:28:29,897 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, call next(-7857209327501974146, 1) from .:35202: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
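The region arithmetic behind J-D's "400 regions a day is way too much" can be sketched from the numbers in this thread (4 GiB region size, ~400 new regions/day, ~3000 after a week). The 16 GiB alternative below is an illustration, not a recommendation from the thread; the split model is the rough steady-state approximation that region count grows by about daily volume / region size.

```python
# Back-of-the-envelope region growth math (numbers from the thread above;
# the 16 GiB figure is an illustrative assumption).

GIB = 1024 ** 3

def regions_per_day(daily_bytes, max_region_bytes):
    # A region splits once it exceeds max_region_bytes, so steady-state
    # region growth is roughly daily write volume / region size.
    return daily_bytes / max_region_bytes

# ~400 regions/day at a 4 GiB region size implies roughly 1.6 TiB/day ingest:
daily_ingest = 400 * 4 * GIB

# Quadrupling hbase.hregion.max.filesize to 16 GiB would cut region growth
# to ~100/day:
print(regions_per_day(daily_ingest, 16 * GIB))   # -> 100.0

# At 400/day, the ~10,000-region danger zone J-D mentions for 0.20.6 is
# reached in under a month:
print(10_000 / 400)                              # days to 10k regions -> 25.0
```

In other words, with the reported ingest rate the cluster crosses the region count where 0.20.6 gets risky in about 25 days, which matches the "runs great for a week, then degrades" pattern; a larger max region size delays that proportionally.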
Re: mapreduce job failure
Thanks J-D

Using hbase-0.20.6, 49 node cluster. The map reduce job involves a full table scan (region size 4 gig). The job runs great for 1 week, then starts failing after 1 week of data accumulation (about 3000 regions); about 400 regions get created per day. Can you suggest any tunables at the HBase level or HDFS level?

Also, I have one more issue when region servers die. Errors below (any suggestion here is helpful as well):

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /hbase_data_one_110425/.../compaction.dir/249610074/4534752250560182124 File does not exist. Holder DFSClient_-398073404 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1332)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1323)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1251)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

-----Original Message-----
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Fri, May 13, 2011 12:39 am
Subject: Re: mapreduce job failure

All that means is that the task stayed in map() for 10 minutes, blocked on something. If you were scanning an hbase table and didn't get a new row after 1 minute, then the scanner would expire. That's orthogonal though. You need to figure out what you're blocking on; add logging and try to jstack your Child processes, for example.

J-D

On Thu, May 12, 2011 at 7:21 PM, Venkatesh vramanatha...@aol.com wrote:

Hi

Using hbase-0.20.6. The mapreduce job started failing in the map phase (using an hbase table as input for the mapper); it ran fine for a week or so, starting with empty tables.

task tracker log:
Task attempt_201105121141_0002_m_000452_0 failed to report status for 600 seconds. Killing

Region server log:
2011-05-12 18:27:39,919 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -7857209327501974146 lease expired
2011-05-12 18:28:29,716 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1880)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2011-05-12 18:28:29,897 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, call next(-7857209327501974146, 1) from .:35202: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1880)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

I don't see any error in datanodes. Appreciate any help.

thanks
v
mapreduce job failure
Hi

Using hbase-0.20.6. The mapreduce job started failing in the map phase (using an hbase table as input for the mapper); it ran fine for a week or so, starting with empty tables.

task tracker log:
Task attempt_201105121141_0002_m_000452_0 failed to report status for 600 seconds. Killing

Region server log:
2011-05-12 18:27:39,919 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner -7857209327501974146 lease expired
2011-05-12 18:28:29,716 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1880)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2011-05-12 18:28:29,897 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 3 on 60020, call next(-7857209327501974146, 1) from .:35202: error: org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
org.apache.hadoop.hbase.UnknownScannerException: Name: -7857209327501974146
at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1880)
at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

I don't see any error in datanodes. Appreciate any help.

thanks
v
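The failure mode in the logs above — "Scanner ... lease expired" followed by UnknownScannerException — matches the server-side scanner-lease bookkeeping that J-D describes later in the thread: the region server renews a scanner's lease on every next() call, and if the client stalls in map() longer than the lease period (hbase.regionserver.lease.period, 60 s by default), the server drops the scanner and the next next() fails. A minimal simulation of that mechanism, with invented class names and simulated timestamps rather than real HBase code:

```python
# Illustrative sketch of scanner-lease expiry (not HBase source code):
# a client that blocks longer than the lease period between next() calls
# finds its scanner gone, which surfaces as UnknownScannerException.

class UnknownScannerError(Exception):
    """Stand-in for org.apache.hadoop.hbase.UnknownScannerException."""

class LeasedScanner:
    def __init__(self, lease_period_s=60):
        self.lease_period_s = lease_period_s
        self.last_renewed = 0.0
        self.expired = False

    def next(self, now_s):
        # The server renews the lease on every next(); too long a gap
        # means the lease was already reclaimed.
        if now_s - self.last_renewed > self.lease_period_s:
            self.expired = True
            raise UnknownScannerError("lease expired")
        self.last_renewed = now_s

scanner = LeasedScanner(lease_period_s=60)
scanner.next(now_s=30)        # fine: renewed within the lease period
try:
    scanner.next(now_s=120)   # 90 s gap: map() blocked too long
except UnknownScannerError:
    print("UnknownScannerException-style failure")
```

This is why J-D's advice is to find what the map task blocks on: the scanner expiry is a symptom of the stall, not the cause.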
Re: java.lang.IndexOutOfBoundsException
Thanks. We have the same exact code that processes 700 million puts per day in 0.20.6 from a tomcat servlet (each thread creates a new HTable, does 1 put, closes). In 0.90.2 we changed just the API whose signature changed (mainly HTable), and it crawls: each/most requests take well over 2 sec, and we can't keep up with even 1/10th of production load. Everything in the cluster is identical, a 20 node cluster.

That is impressive performance from async. Thanks for the tip; I'll give it a try (assuming it would work with 0.90.2).

-----Original Message-----
From: tsuna tsuna...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 4:30 pm
Subject: Re: java.lang.IndexOutOfBoundsException

On Wed, Apr 20, 2011 at 10:04 AM, Venkatesh vramanatha...@aol.com wrote:
On 0.90.2, do you all think using HTablePool would help with performance problem?

What performance problems are you seeing?

BTW, if you want a thread-safe client that's highly scalable for high-throughput, multi-threaded applications, look at asynchbase: http://github.com/stumbleupon/asynchbase

OpenTSDB uses it and I'm able to push 20 edits per second to 3 RegionServers.

--
Benoit tsuna Sigoure
Software Engineer @ www.StumbleUpon.com
Re: hbase 0.90.2 - incredibly slow response
Thanks St. Ack. Sorry, I had to roll back to 0.20.6 as our system was down way too long, so I don't have the log right now; I'll try to recreate it on a different machine at a later time.

Yes, 700 mil puts per day. The cluster is 20 nodes (20 datanode + region server); besides that, 1 machine with HMaster, 1 name node, 3 zookeepers.

We do new HTable, put, close in a multi-threaded servlet (tomcat based); the HBase configuration object is constructed in init(). The same logic works great in 0.20.6. In 0.90.2 all I changed was retrofitting the HTable constructor, and it crawls. We rolled back to 0.20.6 and it works great again; obviously some major logic change in 0.90.2 perhaps requires a different coding practice for the client API. If you can shed some light that would be helpful.

My hbase config is pretty much default except region size (using 4 gig).

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 2:11 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

Want to paste your configuration up in pastebin? Is that 700 million puts a day? Remind us of your cluster size. Paste some of a regionserver log too. That can be informative.

St.Ack

On Wed, Apr 20, 2011 at 10:41 AM, Venkatesh vramanatha...@aol.com wrote:

shell is no problem; ones/twos. I've tried mass puts from shell. We can't handle our production load (even 1/3 of it); 700 mill per day is full load, the same load we handled with absolutely no issues in 0.20.6. There are several pauses between batches of puts as well.

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:30 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

On Tue, Apr 19, 2011 at 11:58 AM, Venkatesh vramanatha...@aol.com wrote:
I was hoping that too. I don't have scripts to generate # requests from shell; I will try that.

Did you try it? Above you seem to say that a simple put of 100 bytes takes 2 seconds where in 0.20.6 it took 10 milliseconds. A put from shell of 100 bytes is easy enough to do:

hbase> put 'YOUR_TABLE', 'SOME_ROW', 'SOME_COLUMN', 'SOME_STRING_OF_100_BYTES'

The shell will print out rough numbers on how long it takes to do the put (from ruby).

I didn't pre-create regions; in 0.20.6 it handled the same load fine. I'll try performance in 0.90.2 by precreating regions. Would sharing a single HBaseConfiguration object for all threads hurt performance?

I'd doubt that this is the issue. It should usually help.

St.Ack
Re: hbase 0.90.2 - incredibly slow response
Thanks St.Ack. Yes, will try the upgrade in a smaller setup with a production-like load, and will investigate/compare.

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Thu, Apr 21, 2011 11:47 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

Sorry to hear you rolled back. I think it's fair to say that going to 0.90.2 usually has things running faster and more efficiently. As to why your experience differs, I'm not sure what it could be; it sounds like something we've not come across before, since we passed you all that we could think of. What's your plan now? Are you going to try the upgrade again?

You might research how your current install is running. Do what Jack Levin did this afternoon, where he enabled rpc DEBUG for a while to get a sense of the type of requests and how long hbase takes to process them (in his case he found that upping the handlers cured a slow scan issue). You could study the 0.20.6 response times and then, when you upgrade to 0.90.2, check what it's showing. That would at least give us a clue as to where to start digging.

St.Ack

On Thu, Apr 21, 2011 at 8:21 PM, Venkatesh vramanatha...@aol.com wrote:

Thanks St. Ack. Sorry, I had to roll back to 0.20.6 as our system was down way too long, so I don't have the log right now; I'll try to recreate it on a different machine at a later time. Yes, 700 mil puts per day. The cluster is 20 nodes (20 datanode + region server); besides that, 1 machine with HMaster, 1 name node, 3 zookeepers. We do new HTable, put, close in a multi-threaded servlet (tomcat based); the HBase configuration object is constructed in init(). The same logic works great in 0.20.6. In 0.90.2 all I changed was retrofitting the HTable constructor, and it crawls. We rolled back to 0.20.6 and it works great again; obviously some major logic change in 0.90.2 perhaps requires a different coding practice for the client API. If you can shed some light that would be helpful. My hbase config is pretty much default except region size (using 4 gig).

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 2:11 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

Want to paste your configuration up in pastebin? Is that 700 million puts a day? Remind us of your cluster size. Paste some of a regionserver log too. That can be informative.

St.Ack

On Wed, Apr 20, 2011 at 10:41 AM, Venkatesh vramanatha...@aol.com wrote:

shell is no problem; ones/twos. I've tried mass puts from shell. We can't handle our production load (even 1/3 of it); 700 mill per day is full load, the same load we handled with absolutely no issues in 0.20.6. There are several pauses between batches of puts as well.

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:30 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

On Tue, Apr 19, 2011 at 11:58 AM, Venkatesh vramanatha...@aol.com wrote:
I was hoping that too. I don't have scripts to generate # requests from shell; I will try that.

Did you try it? Above you seem to say that a simple put of 100 bytes takes 2 seconds where in 0.20.6 it took 10 milliseconds. A put from shell of 100 bytes is easy enough to do:

hbase> put 'YOUR_TABLE', 'SOME_ROW', 'SOME_COLUMN', 'SOME_STRING_OF_100_BYTES'

The shell will print out rough numbers on how long it takes to do the put (from ruby).

I didn't pre-create regions; in 0.20.6 it handled the same load fine. I'll try performance in 0.90.2 by precreating regions. Would sharing a single HBaseConfiguration object for all threads hurt performance?

I'd doubt that this is the issue. It should usually help.

St.Ack
Re: java.lang.IndexOutOfBoundsException
Yeah, you and J-D both hit it. I knew it's bad; I was trying anything and everything to solve the incredibly long latency with hbase puts on 0.90.2. I get ok/better response with batch put; this was a quick and dirty way to accumulate puts by sharing the same HTable instance. Thanks for letting me know this exception is due to sharing of HTable. I have to go back to 0.20.6 since our system has been down too long (starting with an empty table).

On 0.90.2, do you all think using HTablePool would help with the performance problem?

thx

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 12:27 pm
Subject: Re: java.lang.IndexOutOfBoundsException

I think HConnectionManager can catch IndexOutOfBoundsException and translate it into a more user-friendly message, informing the user about thread-safety.

On Wed, Apr 20, 2011 at 9:11 AM, Ted Yu yuzhih...@gmail.com wrote:
I have seen this before. HTable isn't thread-safe. Please describe your usage. Thanks

On Wed, Apr 20, 2011 at 6:03 AM, Venkatesh vramanatha...@aol.com wrote:
Using hbase-0.90.2 (sigh). Any tip? thanks

java.lang.IndexOutOfBoundsException: Index: 4, Size: 3
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.remove(ArrayList.java:387)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1257)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:822)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:678)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:663)
Re: java.lang.IndexOutOfBoundsException
If I use the default I can't share/pass my HBaseConfiguration object; at least I don't see a constructor/setter. That would go against the previous suggestion.

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:08 pm
Subject: Re: java.lang.IndexOutOfBoundsException

When using HTablePool, try not to define maxSize yourself - use the default.

On Wed, Apr 20, 2011 at 10:04 AM, Venkatesh vramanatha...@aol.com wrote:

Yeah, you and J-D both hit it. I knew it's bad; I was trying anything and everything to solve the incredibly long latency with hbase puts on 0.90.2. I get ok/better response with batch put; this was a quick and dirty way to accumulate puts by sharing the same HTable instance. Thanks for letting me know this exception is due to sharing of HTable. I have to go back to 0.20.6 since our system has been down too long (starting with an empty table). On 0.90.2, do you all think using HTablePool would help with the performance problem? thx

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 12:27 pm
Subject: Re: java.lang.IndexOutOfBoundsException

I think HConnectionManager can catch IndexOutOfBoundsException and translate it into a more user-friendly message, informing the user about thread-safety.

On Wed, Apr 20, 2011 at 9:11 AM, Ted Yu yuzhih...@gmail.com wrote:
I have seen this before. HTable isn't thread-safe. Please describe your usage. Thanks

On Wed, Apr 20, 2011 at 6:03 AM, Venkatesh vramanatha...@aol.com wrote:
Using hbase-0.90.2 (sigh). Any tip? thanks

java.lang.IndexOutOfBoundsException: Index: 4, Size: 3
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.remove(ArrayList.java:387)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1257)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:822)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:678)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:663)
Re: java.lang.IndexOutOfBoundsException
sorry, yeah, that's dumb of me; clearly I'm not thinking, just frustrated with the upgrade. thx

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:24 pm
Subject: Re: java.lang.IndexOutOfBoundsException

I meant specifying Integer.MAX_VALUE as maxSize along with the config.

On Wed, Apr 20, 2011 at 10:17 AM, Venkatesh vramanatha...@aol.com wrote:

If I use the default I can't share/pass my HBaseConfiguration object; at least I don't see a constructor/setter. That would go against the previous suggestion.

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:08 pm
Subject: Re: java.lang.IndexOutOfBoundsException

When using HTablePool, try not to define maxSize yourself - use the default.

On Wed, Apr 20, 2011 at 10:04 AM, Venkatesh vramanatha...@aol.com wrote:

Yeah, you and J-D both hit it. I knew it's bad; I was trying anything and everything to solve the incredibly long latency with hbase puts on 0.90.2. I get ok/better response with batch put; this was a quick and dirty way to accumulate puts by sharing the same HTable instance. Thanks for letting me know this exception is due to sharing of HTable. I have to go back to 0.20.6 since our system has been down too long (starting with an empty table). On 0.90.2, do you all think using HTablePool would help with the performance problem? thx

-----Original Message-----
From: Ted Yu yuzhih...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 12:27 pm
Subject: Re: java.lang.IndexOutOfBoundsException

I think HConnectionManager can catch IndexOutOfBoundsException and translate it into a more user-friendly message, informing the user about thread-safety.

On Wed, Apr 20, 2011 at 9:11 AM, Ted Yu yuzhih...@gmail.com wrote:
I have seen this before. HTable isn't thread-safe. Please describe your usage. Thanks

On Wed, Apr 20, 2011 at 6:03 AM, Venkatesh vramanatha...@aol.com wrote:
Using hbase-0.90.2 (sigh). Any tip? thanks

java.lang.IndexOutOfBoundsException: Index: 4, Size: 3
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.remove(ArrayList.java:387)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchOfPuts(HConnectionManager.java:1257)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:822)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:678)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:663)
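The root cause in this thread is sharing one non-thread-safe HTable across servlet threads; the HTablePool suggestion replaces that with a check-out/check-in discipline so each thread has exclusive use of a handle while it holds it. A language-agnostic sketch of that pooling idea, where `Handle` and `HandlePool` are invented stand-ins for HTable and HTablePool, not the HBase API:

```python
# Illustrative pooling sketch: instead of mutating one shared handle from
# many threads (the cause of the IndexOutOfBoundsException above), each
# thread checks a handle out, uses it exclusively, and returns it.

import queue

class Handle:
    """Stand-in for a non-thread-safe client object such as HTable."""
    def __init__(self, name):
        self.name = name
        self.puts = 0

class HandlePool:
    def __init__(self, factory, size):
        # queue.Queue is thread-safe, so get()/put() need no extra locking.
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def get(self):
        return self._pool.get()     # blocks until a handle is free

    def put_back(self, handle):
        self._pool.put(handle)

pool = HandlePool(lambda: Handle("mytable"), size=4)

h = pool.get()      # exclusive use: no other thread can touch h now
h.puts += 1
pool.put_back(h)    # return it so another thread can reuse it
```

The design point is that the pool bounds the number of live handles while guaranteeing no two threads mutate the same one concurrently, which is exactly the invariant the shared-HTable shortcut violated.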
Re: hbase 0.90.2 - incredibly slow response
shell is no problem; ones/twos. I've tried mass puts from shell. We can't handle our production load (even 1/3 of it); 700 mill per day is full load, the same load we handled with absolutely no issues in 0.20.6. There are several pauses between batches of puts as well.

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Apr 20, 2011 1:30 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

On Tue, Apr 19, 2011 at 11:58 AM, Venkatesh vramanatha...@aol.com wrote:
I was hoping that too. I don't have scripts to generate # requests from shell; I will try that.

Did you try it? Above you seem to say that a simple put of 100 bytes takes 2 seconds where in 0.20.6 it took 10 milliseconds. A put from shell of 100 bytes is easy enough to do:

hbase> put 'YOUR_TABLE', 'SOME_ROW', 'SOME_COLUMN', 'SOME_STRING_OF_100_BYTES'

The shell will print out rough numbers on how long it takes to do the put (from ruby).

I didn't pre-create regions; in 0.20.6 it handled the same load fine. I'll try performance in 0.90.2 by precreating regions. Would sharing a single HBaseConfiguration object for all threads hurt performance?

I'd doubt that this is the issue. It should usually help.

St.Ack
hbase 0.90.2 - incredibly slow response
Just upgraded to 0.90.2 from 0.20.6. Doing a simple put to a table (100 bytes per put); the only code change was to retrofit the HTable API to work with 0.90.2. Initializing HBaseConfiguration in servlet.init() and reusing that config for the HTable constructor doing the put.

Performance is very slow: 90% of requests are well over 2 sec (with 0.20.6, 90% used to be 10 milli sec). I did run set_meta_memstore_size.rb as per the book.

Any help to debug is appreciated. I also see periodic pauses between hbase puts.

thanks
v
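One mitigation that comes up later in these threads is batching: the poster reports "ok/better response with batch put" versus one network round-trip per request. The sketch below illustrates the client-side write-buffer idea generically; in the 0.90 Java client the corresponding knobs are HTable.setAutoFlush(false) and the hbase.client.write.buffer size, but the `BufferedWriter` class here is an invented illustration, not the HBase API.

```python
# Illustrative write-buffer sketch: accumulate edits and flush them in
# batches instead of paying a round-trip per put.

class BufferedWriter:
    def __init__(self, sink, buffer_size=100):
        self.sink = sink              # callable that receives a batch (list)
        self.buffer_size = buffer_size
        self._buf = []
        self.flushes = 0

    def put(self, edit):
        self._buf.append(edit)
        if len(self._buf) >= self.buffer_size:
            self.flush()

    def flush(self):
        if self._buf:
            self.sink(self._buf)
            self._buf = []
            self.flushes += 1

batches = []
w = BufferedWriter(batches.append, buffer_size=3)
for i in range(7):
    w.put(i)
w.flush()          # flush the partial tail batch
print(batches)     # -> [[0, 1, 2], [3, 4, 5], [6]]
```

The trade-off is the usual one: buffered edits are invisible (and lost on a client crash) until flushed, so buffering suits bulk ingest more than per-request durability.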
Re: hbase 0.90.2 - incredibly slow response
I was hoping that too. I don't have scripts to generate # requests from shell; I will try that. I didn't pre-create regions; in 0.20.6 it handled the same load fine. I'll try performance in 0.90.2 by precreating regions. Would sharing a single HBaseConfiguration object for all threads hurt performance?

frustrating. thanks for your help

-----Original Message-----
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Tue, Apr 19, 2011 1:40 pm
Subject: Re: hbase 0.90.2 - incredibly slow response

0.90.2 should be faster. Running the same query from the shell, does it give you the same lag?

St.Ack

On Tue, Apr 19, 2011 at 10:35 AM, Venkatesh vramanatha...@aol.com wrote:

Just upgraded to 0.90.2 from 0.20.6. Doing a simple put to a table (100 bytes per put); the only code change was to retrofit the HTable API to work with 0.90.2. Initializing HBaseConfiguration in servlet.init() and reusing that config for the HTable constructor doing the put. Performance is very slow: 90% of requests are well over 2 sec (with 0.20.6, 90% used to be 10 milli sec). I did run set_meta_memstore_size.rb as per the book. Any help to debug is appreciated; I also see periodic pauses between hbase puts.

thanks
v
Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job
Thanks J-D I made sure to pass conf objects around in places where I wasn't.. will give it a try -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Apr 12, 2011 6:40 pm Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job Yes there are a few places like that. Also, when you create new HTables, you should also close their connections (this is not done in htable.close). See HTable's javadoc, which says: Instances of HTable passed the same Configuration instance will share connections to servers out on the cluster and to the zookeeper ensemble as well as caches of region locations. This is usually a *good* thing. This happens because they will all share the same underlying HConnection instance. See HConnectionManager for more on how this mechanism works. and it points to HCM, which has more information: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html J-D On Tue, Apr 12, 2011 at 3:09 PM, Ruben Quintero rfq_...@yahoo.com wrote: I'm running into the same issue, but did some poking around and it seems that Zookeeper connections are being left open by an HBase internal. Basically, I'm running a mapreduce job within another program, and noticed in the logs that every time the job is run a connection is opened, but I never see it closed again. The connection is opened within the job.submit(). I looked closer and checked the jstack after running it for just under an hour, and sure enough there are a ton of Zookeeper threads just sitting there. Here's a pastebin link: http://pastebin.com/MccEuvrc I'm running 0.90.0 right now.
- Ruben From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, April 12, 2011 4:23:05 PM Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job It's more in the vein of https://issues.apache.org/jira/browse/HBASE-3755 and https://issues.apache.org/jira/browse/HBASE-3771 Basically 0.90 has a regression in the handling of zookeeper connections that means you have to be super careful not to have more than 30 per machine (each new Configuration is one new ZK connection). Upping your zookeeper max connection config should get rid of your issues since you only get it occasionally. J-D On Tue, Apr 12, 2011 at 7:59 AM, Venkatesh vramanatha...@aol.com wrote: I get this occasionally..(not all the time)..Upgrading from 0.20.6 to 0.90.2 Is this issue same as this JIRA https://issues.apache.org/jira/browse/HBASE-3578 I'm using HBaseConfiguration.create() and setting that in the job thx v
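The javadoc advice J-D quotes (share one Configuration across HTables, and release the shared connection explicitly in a long-lived JVM) can be sketched roughly like this against the 0.90 client API; the table and row names are hypothetical, and this is an illustration of the pattern, not code from the thread:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class SharedConnectionSketch {
    public static void main(String[] args) throws Exception {
        // One Configuration instance, reused everywhere: all HTables built
        // from it share a single HConnection (and one ZK connection).
        Configuration conf = HBaseConfiguration.create();

        HTable table = new HTable(conf, "my_table");   // hypothetical table name
        try {
            table.get(new Get(Bytes.toBytes("some_row")));
        } finally {
            table.close();  // flushes buffers, but does NOT close the HConnection
        }

        // In a long-lived JVM, release the shared connection explicitly
        // once you are completely done with this Configuration.
        HConnectionManager.deleteConnection(conf, true);
    }
}
```

The key point of the sketch is that the same `conf` object used to build the tables is the one handed to `deleteConnection`, since the connection cache is keyed by Configuration instance.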
Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job
Ruben: Yes..I've the exact same issue now.. I'm also kicking off from another jvm that runs forever.. I don't have an alternate solution..either modify hbase code (or) modify my code to kick off as a standalone jvm (or) hopefully 0.90.3 releases soon :) J-D/St.Ack may have some suggestions V -Original Message- From: Ruben Quintero rfq_...@yahoo.com To: user@hbase.apache.org Sent: Wed, Apr 13, 2011 2:39 pm Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job The problem I'm having is in getting the conf that is used to init the table within TableInputFormat. That's the one that's leaving open ZK connections for me. Following the code through, TableInputFormat initializes its HTable with new Configuration(new JobConf(conf)), where conf is the config I pass in via job initiation. I don't see a way of getting the initialized TableInputFormat in order to then get its table and its config to be able to properly close that connection. Cloned configs don't appear to produce similar hashes, either. The only other option I'm left with is closing all connections, but that disrupts things across the board. For MapReduce jobs run in their own JVM, this wouldn't be much of an issue, as the connection would just be closed on completion, but in my case (our code triggers the jobs internally), they simply pile up until the ConnectionLoss hits due to too many ZK connections. Am I missing a way to get that buried table's config, or another way to kill the orphaned connections? - Ruben
Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job
Will do..I'll set it to 2000 as per the JIRA.. Do we need a periodic bounce? ..because if this error comes up, the only way I can get the mapreduce to work is a bounce. -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Wed, Apr 13, 2011 3:22 pm Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job Like I said, it's a zookeeper configuration that you can change. If hbase is managing your zookeeper then set hbase.zookeeper.property.maxClientCnxns to something higher than 30 and restart the zk server (can be done while hbase is running). J-D
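For reference, the setting J-D describes is a plain property in hbase-site.xml when HBase manages ZooKeeper (2000 here matches the value Venkatesh mentions in the thread; pick whatever suits your client count), followed by a restart of the ZK server:

```xml
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>2000</value>
</property>
```

As noted in the thread, this only raises the ceiling; it does not fix a client that leaks connections, so a long-running JVM will still eventually hit the new limit.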
Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job
deleteAllConnections works well for my case..I can live with this, but not with connection leaks thanks for the idea Venkatesh -Original Message- From: Ruben Quintero rfq_...@yahoo.com To: user@hbase.apache.org Sent: Wed, Apr 13, 2011 4:25 pm Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job Venkatesh, I guess the two quick and dirty solutions are: - Call deleteAllConnections(bool) at the end of your MapReduce jobs, or periodically. If you have no other tables or pools, etc. open, then no problem. If you do, they'll start throwing IOExceptions, but you can re-instantiate them with a new config and then continue as usual. (You do have to change the config or it'll simply grab the closed, cached one from the HCM). - As J-D said, subclass TIF and basically copy the old setConf, except don't clone the conf that gets sent to the table. Each has a downside and neither is ideal, but if you either don't modify the config in your job or don't have any other important hbase connections, then you can use the appropriate one. Thanks for the assistance, J-D. It's great that these forums are active and helpful. - Ruben From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Wed, April 13, 2011 3:50:42 PM Subject: Re: hbase -0.90.x upgrade - zookeeper exception in mapreduce job Yeah for a JVM running forever it won't work. If you know for a fact that the configuration passed to TIF won't be changed then you can subclass it and override setConf to not clone the conf. J-D On Wed, Apr 13, 2011 at 12:45 PM, Ruben Quintero rfq_...@yahoo.com wrote: The problem is the connections are never closed... so they just keep piling up until it hits the max. My max is at 400 right now, so after 14-15 hours of running, it gets stuck in an endless connection retry. I saw that the HConnectionManager will kick older HConnections out, but the problem is that their ZooKeeper threads continue on. Those need to be explicitly closed.
Again, this is only an issue inside JVMs set to run forever, like Venkatesh said, because that's when the orphaned ZK connections will have a chance to build up to whatever your maximum is. Setting that higher and higher is just prolonging uptime before the eventual crash. It's essentially a memory (connection) leak within TableInputFormat, since there is no way that I can see to properly access and close those spawned connections. One question for you, J-D: Inside of TableInputFormat.setConf, does the Configuration need to be cloned? (i.e. setHTable(new HTable(new Configuration(conf), tableName)); ). I'm guessing this is to prevent changes within the job from affecting the table and vice-versa...but if it weren't cloned, then you could use the job configuration (job.getConfiguration()) to close the connection. Other quick fixes that I can think of, none of which are very pretty: 1 - Just call deleteAllConnections(bool), and have any other processes using HConnections recover from that. 2 - Make the static HBASE_INSTANCES map accessible (public), then you could iterate through open connections and try to match configs. Venkatesh - unless you have other processes in your JVM accessing HBase (I have one), #1 might be the best bet. - Ruben
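J-D's suggested workaround (subclass TableInputFormat and override setConf so the job's Configuration is not cloned) might look roughly like the sketch below against the 0.90 API. This is an illustration only: scan customization is omitted, the class name is made up, and it assumes the job's Configuration is never modified after the table is created:

```java
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;

// Sketch: a minimal TableInputFormat replacement that keeps the caller's
// Configuration instead of cloning it, so the ZK connection its HTable
// opens can later be released with
// HConnectionManager.deleteConnection(jobConf, true).
public class NonCloningTableInputFormat extends TableInputFormatBase
        implements Configurable {
    public static final String INPUT_TABLE = "hbase.mapreduce.inputtable";
    private Configuration conf;

    @Override
    public Configuration getConf() { return conf; }

    @Override
    public void setConf(Configuration conf) {
        this.conf = conf;   // note: no new Configuration(conf) here
        try {
            setHTable(new HTable(conf, conf.get(INPUT_TABLE)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        // Full-table scan by default; add the scan-attribute parsing from
        // TableInputFormat.setConf if you need scan customization.
        setScan(new Scan());
    }
}
```

The tradeoff Ruben describes still applies: because the table now holds the job's own Configuration, any later mutation of that conf could affect the table's connection lookup, which is presumably why stock TableInputFormat clones it.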
hbase -0.90.x upgrade - zookeeper exception in mapreduce job
I get this occasionally..(not all the time)..Upgrading from 0.20.6 to 0.90.2 Is this issue same as this JIRA https://issues.apache.org/jira/browse/HBASE-3578 I'm using HBaseConfiguration.create() and setting that in the job thx v
2011-04-12 02:13:06,870 ERROR Timer-0 org.apache.hadoop.hbase.mapreduce.TableInputFormat - org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1000)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:303)
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.init(HConnectionManager.java:294)
at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
at org.apache.hadoop.hbase.client.HTable.init(HTable.java:167)
at org.apache.hadoop.hbase.client.HTable.init(HTable.java:145)
at org.apache.hadoop.hbase.mapreduce.TableInputFormat.setConf(TableInputFormat.java:91)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:882)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:448)
Re: zookeeper warning with 0.90.1 hbase
Thanks St.Ack Yes..I see these when the map-reduce job is complete..but not always..I'll ignore them thanks..Getting close to the 0.90.1 upgrade -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Cc: Venkatesh vramanatha...@aol.com Sent: Thu, Apr 7, 2011 11:55 pm Subject: Re: zookeeper warning with 0.90.1 hbase They happen at the end of a map task or on shutdown? If so, yes, ignore (or, if you want to have a nice clean shutdown, figure out how Session 0x0 was set up -- was it you -- and call the appropriate close in time). St.Ack
zookeeper warning with 0.90.1 hbase
I see a lot of these warnings..everything seems to be working otherwise..Is this something that can be ignored?
2011-04-07 21:29:15,032 WARN Timer-0-SendThread(..:2181) org.apache.zookeeper.ClientCnxn - Session 0x0 for server :2181, unexpected error, closing socket connection and attempting reconnect java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
at sun.nio.ch.IOUtil.read(IOUtil.java:200)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:858)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130)
2011-04-07 21:29:15,032 DEBUG Timer-0-SendThread(..:2181) org.apache.zookeeper.ClientCnxn - Ignoring exception during shutdown input java.net.SocketException: Transport endpoint is not connected
at sun.nio.ch.SocketChannelImpl.shutdown(Native Method)
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1205)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:11
Re: row_counter map reduce job 0.90.1
Sorry about this..It was indeed an environment issue..my core-site.xml was pointing to the wrong hadoop thanks for the tips
Re: HBase wiki updated
A big thank you from an HBase user (sorry for the spam..but it deserves thanks) -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Sat, Apr 2, 2011 3:51 pm Subject: Re: HBase wiki updated 2 Internets for you Doug, that's awesome! Thx J-D On Apr 2, 2011 11:59 AM, Doug Meil doug.m...@explorysmedical.com wrote: Hi there everybody- Just thought I'd let everybody know about this... Stack and I have been working on updating the HBase book and porting portions of the very-out-of-date HBase wiki to the HBase book. These two pages... http://wiki.apache.org/hadoop/Hbase/DesignOverview http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture ... now just have a 1-liner and a link about looking in the HBase book ( http://hbase.apache.org/book.html). Doug The Documenter Meil
row_counter map reduce job 0.90.1
I'm able to run this job from the hadoop machine (where the job tracker/task tracker also runs) /hadoop jar /home/maryama/hbase-0.90.1/hbase-0.90.1.jar rowcounter usertable But I'm not able to run the same job from a) the hbase client machine (full hbase hadoop installed) b) the hbase server machines (ditto) I get File /home/.../hdfs/tmp/mapred/system/job_201103311630_0024/libjars/hadoop-0.20.2-core.jar does not exist. Any idea how this jar file gets packaged, or where it is being looked for? thanks v
Re: row_counter map reduce job 0.90.1
Yeah.. I tried that as well as what Ted suggested..It can't find the hadoop jar Hadoop map reduce jobs work fine ..it's just hbase map reduce jobs that fail with this error tx -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Sent: Fri, Apr 1, 2011 12:39 pm Subject: Re: row_counter map reduce job 0.90.1 Does where you are running from have a build/classes dir and a hadoop-0.20.2-core.jar at top level? If so, try cleaning out the build/classes. Also, you could try something like this: HADOOP_CLASSPATH=/home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT-tests.jar:/home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT.jar:`/home/stack/hbase-0.90.2-SNAPSHOT/bin/hbase classpath` ./bin/hadoop jar /home/stack/hbase-0.90.2-SNAPSHOT/hbase-0.90.2-SNAPSHOT.jar rowcounter usertable ... only make sure the hadoop jar is in HADOOP_CLASSPATH. But you shouldn't have to do the latter at least. Compare where it works to where it doesn't. Something is different. St.Ack On Fri, Apr 1, 2011 at 9:26 AM, Venkatesh vramanatha...@aol.com wrote: Definitely yes..It's all referenced in the -classpath option of the jvm of tasktracker/jobtracker/datanode/namenode.. the file does exist in the cluster.. But the error I get is on the client: File /home/hdfs/tmp/mapred/system/job_201103311630_0027/libjars/hadoop-0.20.2-core.jar does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:629)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
So, in theory it shouldn't be expected on the client ..correct? This is the only thing stopping me from moving to 0.90.1 -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Sent: Fri, Apr 1, 2011 12:19 pm Subject: Re: row_counter map reduce job 0.90.1 On Fri, Apr 1, 2011 at 9:06 AM, Venkatesh vramanatha...@aol.com wrote: I'm able to run this job from the hadoop machine (where the job task tracker also runs) /hadoop jar /home/maryama/hbase-0.90.1/hbase-0.90.1.jar rowcounter usertable But, I'm not able to run the same job from a) hbase client machine (full hbase hadoop installed) b) hbase server machines (ditto) Get File /home/.../hdfs/tmp/mapred/system/job_201103311630_0024/libjars/hadoop-0.20.2-core.jar does not exist. Is that jar present on the cluster? St.Ack
Re: hole in META
Yeah...excise_regions seems to work but plug_hole doesn't plug the hole..it thinks the region still exists in META Maybe the issue is with excise_regions.. it doesn't cleanly remove it.. I also tried /hbase org.apache.hadoop.hbase.util.Merge tbl_name region region That doesn't work for me in 0.20.6.. What are the region parameters? I tried the encoded name and it didn't like it..I tried a name of the form tbl_name,st_key,, That didn't work either.. thanks -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Cc: Venkatesh vramanatha...@aol.com Sent: Thu, Mar 31, 2011 1:36 am Subject: Re: hole in META Be careful with those Venkatesh. I've not looked at them in a while. They may work for you since you are on 0.20.x but please read them carefully first before running and make sure they make sense for your context. St.Ack
Re: hole in META
Yes..st.ack..overlapping.. one of them has no data.. there are too many of them about 800 or so.. there are some with holes too.. -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Sent: Wed, Mar 30, 2011 1:38 am Subject: Re: hole in META What is that? Overlapping regions? Can you try merging them with merge tool? Else, study whats in hdfs. One may have nothing in it (check sizes). It might just be reference files only. If so, lets go from there. And I describe how to merge. St.Ack On Tue, Mar 29, 2011 at 9:25 PM, Venkatesh vramanatha...@aol.com wrote: I've regions like this... add_table.rb is unable to fix this... Is there anything else I could do to fix holes? startkey end-key yv018381 yv018381 yv018381 . yv018381 . -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Sent: Tue, Mar 29, 2011 12:55 pm Subject: Re: hole in META On Tue, Mar 29, 2011 at 9:09 AM, Venkatesh vramanatha...@aol.com wrote: I ran into missing jar with hadoop jar file when running a map reduce..which i could n't fix it..That is the only known issue with upgrade If I can fix that, i'll upgrade Tell us more. Whats the complaint? Missing Guava? Commons-logging? BTW, is it better to fix existing holes using add_table.rb before the upgrade? (or) upgrade takes care missing holes? Yes. Make sure all is wholesome before upgrade. Are you able to do this? Good stuff V, St.Ack
Re: hole in META
Thanks Lukas..I'll give it a try -Original Message- From: Lukas mr.bobu...@gmail.com To: user@hbase.apache.org Sent: Wed, Mar 30, 2011 4:19 am Subject: Re: hole in META Sorry for any inconvenience. This was in reply to http://mail-archives.apache.org/mod_mbox/hbase-user/201103.mbox/%3c8cdbca99c33-1c78-9...@webmail-m083.sysops.aol.com%3e On Wed, Mar 30, 2011 at 10:13 AM, Lukas mr.bobu...@gmail.com wrote: Hi there, It seems that I had the same problem. AFAIK fix_table and hbck currently won't be able to fix this, so I wrote myself two small tools. The first one detects such loops in the meta table: https://gist.github.com/894031#file_h_base_region_loops.java If you specify '--fix', the loopy/duplicated regions are moved to a directory you specify and the meta is updated. The second one (https://gist.github.com/894031#file_add_records_from_region.java) takes one of the moved regions as input and adds its content to the specified table, if there isn't already an entry with the same family:qualifier (this fitted my needs, as I only have one entry per family:qualifier). DISCLAIMER: Those tools were programmed rather quickly, so please make sure that they serve your needs. If you have fixed your table, I would migrate as quickly as possible to 0.90.x! Best, Lukas
Re: hole in META
St.Ack, I came across your script https://github.com/saintstack/hbase_bin_scripts/blob/master/README which I find very, very useful. I've been running them one at a time, 500 overlaps so far (by checking which one to remove from HDFS). Still 500 or so to go. Slow, but it works.

Lukas - I didn't run yours since your code depends on the 0.90.x lib; didn't want to risk running it on 0.20.6.

thanks v

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Mar 30, 2011 12:20 pm
Subject: Re: hole in META

Can you run a rowcount against this table or does it not complete? St.Ack

On Wed, Mar 30, 2011 at 4:13 AM, Venkatesh vramanatha...@aol.com wrote:

Yes, St.Ack, overlapping. One of them has no data. There are too many of them, about 800 or so. There are some with holes too.

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Mar 30, 2011 1:38 am
Subject: Re: hole in META

What is that? Overlapping regions? Can you try merging them with the merge tool? Else, study what's in HDFS. One may have nothing in it (check sizes). It might just be reference files only. If so, let's go from there and I'll describe how to merge. St.Ack

On Tue, Mar 29, 2011 at 9:25 PM, Venkatesh vramanatha...@aol.com wrote:

I've regions like this... add_table.rb is unable to fix this... Is there anything else I could do to fix holes? startkey end-key yv018381 yv018381 yv018381 . yv018381 .

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Tue, Mar 29, 2011 12:55 pm
Subject: Re: hole in META

On Tue, Mar 29, 2011 at 9:09 AM, Venkatesh vramanatha...@aol.com wrote: I ran into a missing jar with the hadoop jar file when running a map reduce, which I couldn't fix. That is the only known issue with the upgrade. If I can fix that, I'll upgrade.

Tell us more. What's the complaint? Missing Guava? Commons-logging?

BTW, is it better to fix existing holes using add_table.rb before the upgrade? (or) does the upgrade take care of missing holes?

Yes. Make sure all is wholesome before the upgrade. Are you able to do this? Good stuff V, St.Ack
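The offline merge tool Stack refers to is invoked roughly like this (a sketch only: HBase must be shut down first, the table and region names below are placeholders, and the exact region names must be copied from .META.):

```
$ bin/hbase org.apache.hadoop.hbase.util.Merge \
    mytable \
    "mytable,startkeyA,1301234567890" \
    "mytable,startkeyB,1301234567891"
```

The tool rewrites .META. so the merged region spans the key range of both inputs, which is why the cluster has to be down while it runs.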
hole in META
Hi, Using hbase-0.20.6. This has happened quite often. Is this a known issue in 0.20.6 that we wouldn't see in 0.90.1, or would see less of? I attempted to fix/avoid this in earlier instances by truncating the table and running add_table.rb first. What is the best way to fix this in 0.20.6? Now it's there in more tables, and I cannot afford to lose the data. Running add_table.rb increases the # of regions (and we are already way over the limit, 25K+). thanks v
Re: hole in META
Thanks, St.Ack. Yeah, I'm eager to upgrade. I had to make one small change to the HBase client API to use the new version. I ran into a missing jar with the hadoop jar file when running a map reduce, which I couldn't fix. That is the only known issue with the upgrade. If I can fix that, I'll upgrade. BTW, is it better to fix existing holes using add_table.rb before the upgrade? (or) does the upgrade take care of missing holes?

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Tue, Mar 29, 2011 11:59 am
Subject: Re: hole in META

On Tue, Mar 29, 2011 at 7:38 AM, Venkatesh vramanatha...@aol.com wrote: What is the best way to fix this in 0.20.6?

Move to 0.90.1 to avoid holes in .META. and to avoid losing data. Let us know if we can help you with the upgrade. St.Ack
Export/Import and # of regions
Hi, If I export an existing table using the Export MR job, truncate the table, increase the region size, and then do an Import, will it make use of the new region size? thanks V
Re: Export/Import and # of regions
Thanks J-D. Using 0.20.6, I don't see that method with pre-split in the 0.20.6 API spec. 1) Will the data still be accessible if I Import the data to a new table? (purely for backup reasons) I tried it on a small data set and it worked; before I do export/Import on a large table, I want to make sure. 2) Data exported using 0.20.6 - can it be imported using 0.90.1? (I could use pre-split in this case.)

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Tue, Mar 29, 2011 5:38 pm
Subject: Re: Export/Import and # of regions

Pre-splitting was discussed a few times on the mailing list today, and a few times in the past weeks, for example: http://search-hadoop.com/m/XB9Vr1gQc66 Import works on a pre-existing table, so it won't recreate it. Also, it doesn't know how your key space is constructed, so it cannot guess the start/stop row keys for you. J-D

On Tue, Mar 29, 2011 at 2:33 PM, Venkatesh vramanatha...@aol.com wrote:

Thanks J-D. We have way too much data; it won't fit in 1 region. Is Import smart enough to create the required # of regions? Could you please elaborate on pre-split table creation? Steps? The reason I'm doing this exercise is to reduce the # of regions in our cluster (in the absence of additional hardware; 25K regions on 20 nodes).

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Tue, Mar 29, 2011 5:29 pm
Subject: Re: Export/Import and # of regions

Yes, but you'll start with a single region; instead of truncating, you probably want to create a pre-split table. J-D

On Tue, Mar 29, 2011 at 2:27 PM, Venkatesh vramanatha...@aol.com wrote: Hi, If I export an existing table using the Export MR job, truncate the table, increase the region size, and then do an Import, will it make use of the new region size? thanks V
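As a concrete illustration of the pre-split creation J-D describes, here is a sketch against the 0.90.x Java client API. It requires a running cluster, and the table name, family, and split points are made up; real split points would have to match how your own keys are distributed:

```java
// Sketch: creating a pre-split table with the 0.90.x client API.
// Names and split points below are illustrative only.
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplit {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    HTableDescriptor desc = new HTableDescriptor("events_presplit");
    desc.addFamily(new HColumnDescriptor("f"));
    // Split points partition the key space up front, so an Import
    // writes into many regions instead of one hot region that splits.
    byte[][] splits = new byte[][] {
      Bytes.toBytes("2"), Bytes.toBytes("4"),
      Bytes.toBytes("6"), Bytes.toBytes("8")
    };
    admin.createTable(desc, splits); // table starts with splits.length + 1 regions
  }
}
```

Since Import fills a pre-existing table, creating the target this way before running the Import job avoids starting from a single region.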
Re: hole in META
I've regions like this... add_table.rb is unable to fix this... Is there anything else I could do to fix holes? startkey end-key yv018381 yv018381 yv018381 . yv018381 .

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Tue, Mar 29, 2011 12:55 pm
Subject: Re: hole in META

On Tue, Mar 29, 2011 at 9:09 AM, Venkatesh vramanatha...@aol.com wrote: I ran into a missing jar with the hadoop jar file when running a map reduce, which I couldn't fix. That is the only known issue with the upgrade. If I can fix that, I'll upgrade.

Tell us more. What's the complaint? Missing Guava? Commons-logging?

BTW, is it better to fix existing holes using add_table.rb before the upgrade? (or) does the upgrade take care of missing holes?

Yes. Make sure all is wholesome before the upgrade. Are you able to do this? Good stuff V, St.Ack
java.io.FileNotFoundException:
Does anyone know how to get around this? Trying to run a mapreduce job in a cluster. The one change was hbase upgraded to 0.90.1 (from 0.20.6); no code change.

java.io.FileNotFoundException: File /data/servers/datastore/mapred/mapred/system/job_201103151601_0363/libjars/zookeeper-3.2.2.jar does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:629)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
at com.aol.mail.antispam.Profiler.UserProfileJob.run(UserProfileJob.java:1916)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java
Re: java.io.FileNotFoundException:
Thanks St.Ack, I'm blind. Got past that. Now I get it for hadoop-0.20.2-core.jar. I've removed *append*.jar all over the place and replaced it with hadoop-0.20.2-core.jar. 0.90.1 will work with hadoop-0.20.2-core, right? Regular gets/puts work, but not the mapreduce job.

java.io.FileNotFoundException: File /data/servers/datastore/mapred/mapred/system/job_201103161652_0004/libjars/hadoop-0.20.2-core.jar does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:633)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Mar 16, 2011 1:39 pm
Subject: Re: java.io.FileNotFoundException:

0.90.1 ships with zookeeper-3.3.2, not with 3.2.2. St.Ack

On Wed, Mar 16, 2011 at 8:05 AM, Venkatesh vramanatha...@aol.com wrote: Does anyone know how to get around this? Trying to run a mapreduce job in a cluster. The one change was hbase upgraded to 0.90.1 (from 0.20.6); no code change. java.io.FileNotFoundException: File /data/servers/datastore/mapred/mapred/system/job_201103151601_0363/libjars/zookeeper-3.2.2.jar does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245) at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509) at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:629) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761) at org.apache.hadoop.mapreduce.Job.submit(Job.java:432) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447) at com.aol.mail.antispam.Profiler.UserProfileJob.run(UserProfileJob.java:1916) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java
Re: java.io.FileNotFoundException:
yeah, that's why I feel very stupid. I'm pretty sure it exists on my cluster, but I still get the error. I'll try on a fresh day.

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Wed, Mar 16, 2011 7:44 pm
Subject: Re: java.io.FileNotFoundException:

The below is a pretty basic error. Reference the jar that is actually present on your cluster. St.Ack

On Wed, Mar 16, 2011 at 3:50 PM, Venkatesh vramanatha...@aol.com wrote:

yeah, I was aware of that. I removed it and tried with hadoop-0.20.2-core.jar, as I wasn't ready to upgrade hadoop. I tried this time with the *append*.jar; now it's complaining FileNotFound for the append jar:

File /data/servers/datastore/mapred/mapred/system/job_201103161750_0030/libjars/hadoop-core-0.20-append-r1056497.jar does not exist.
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509)
at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:633)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:448)

-Original Message-
From: Harsh J qwertyman...@gmail.com
To: user@hbase.apache.org
Sent: Wed, Mar 16, 2011 5:32 pm
Subject: Re: java.io.FileNotFoundException:

0.90.1 ships with a hadoop-0.20-append jar (not vanilla hadoop 0.20.2). Look up its name in the lib/ directory of the distribution (comes with a rev #) :)

On Thu, Mar 17, 2011 at 2:33 AM, Venkatesh vramanatha...@aol.com wrote: Thanks St.Ack, I'm blind. Got past that. Now I get it for hadoop-0.20.2-core.jar. I've removed *append*.jar all over the place and replaced it with hadoop-0.20.2-core.jar. 0.90.1 will work with hadoop-0.20.2-core, right?
Regular gets/puts work..but not the mapreduce job java.io.FileNotFoundException: File /data/servers/datastore/mapred/mapred/system/job_201103161652_0004/libjars/hadoop-0.20.2-core.jar does not exist. at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245) at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509) at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:633) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761) at org.apache.hadoop.mapreduce.Job.submit(Job.java:432) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447) -Original Message- From: Stack st...@duboce.net To: user@hbase.apache.org Sent: Wed, Mar 16, 2011 1:39 pm Subject: Re: java.io.FileNotFoundException: 0.90.1 ships with zookeeper-3.3.2, not with 3.2.2. St.Ack On Wed, Mar 16, 2011 at 8:05 AM, Venkatesh vramanatha...@aol.com wrote: Does anyone how to get around this? Trying to run a mapreduce job in a cluster..The one change was hbase upgraded to 0.90.1 (from 0.20.6)..No code change java.io.FileNotFoundException: File /data/servers/datastore/mapred/mapred/system/job_201103151601_0363/libjars/zookeeper-3.2.2.jar does not exist. 
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245) at org.apache.hadoop.filecache.DistributedCache.getTimestamp(DistributedCache.java:509) at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:629) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761) at org.apache.hadoop.mapreduce.Job.submit(Job.java:432) at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447) at com.aol.mail.antispam.Profiler.UserProfileJob.run(UserProfileJob.java:1916) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java -- Harsh J http://harshj.com
hbase 0.90.1 upgrade issue - mapreduce job
Hi, When I upgraded to 0.90.1, the mapreduce job fails with an exception: system/job_201103151601_0121/libjars/hbase-0.90.1.jar does not exist. I have the jar file in the classpath (hadoop-env.sh). Any ideas? thanks
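The libjars FileNotFoundExceptions in these threads generally mean the submitting client referenced a jar path that isn't actually present; one common remedy is to put the exact jars that ship with your HBase release on HADOOP_CLASSPATH. A sketch, with illustrative paths and versions only:

```shell
# In hadoop-env.sh, or the shell that submits the job; adjust the
# paths to wherever your HBase release actually lives.
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/usr/lib/hbase/hbase-0.90.1.jar:/usr/lib/hbase/lib/zookeeper-3.3.2.jar"

# If your HBase version ships the 'classpath' shell helper, it avoids
# listing jars by hand:
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(hbase classpath)"
```

Either way, the rule is the one Stack states above: reference the jar that is actually present on the cluster, matching the names (and revision suffixes) in the distribution's lib/ directory.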
Re: region servers shutdown
Thanks J-D. I was quite happy in the first 3 months; in the last month or so, lots of instabilities.

i) It's good to know that 0.90.x fixes lots of instabilities; will consider upgrading. It is not listed as a stable production release, hence the hesitation :)
ii) Our cluster is 20 nodes (20 data nodes + 20 region servers, a data/region server on every box); besides that, 1 name node, 1 hmaster, and 3 zookeepers, all on different physical machines.
iii) Hardware: Pentium-class, 36 GB memory on each node.
iv) Processing about 600 million events per day (real-time puts), 200 bytes per put. Each event is a row in an hbase table, so 600 million records, 1 column family, 6-10 columns.
v) About 50,000 regions so far.
vi) We run a map reduce job every night that takes the 600 million records and updates/creates aggregate data (1 get per record); the aggregate data translates to 25 million x 3 puts.
vii) Region splits occur quite frequently, every 5 minutes or so.

How big are the tables? - Haven't run a count on the tables lately. The events table we keep for 90 days, 600 million records per day; we process each day's data. 3 additional tables for aggregates.
How many region servers - 20 - and how many regions do they serve? - 50,000 regions; new regions get created every day (don't have that #).
Are you using lots of families per table? - No, just 1 family in all tables; # of columns 20.
Are you using LZO compression? - No.

thanks again for your help

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Thu, Feb 10, 2011 2:40 pm
Subject: Re: region servers shutdown

I see you are running on a very old version of hbase, and under that you have a version of hadoop that doesn't support appends, so you are bound to have data loss on machine failure and when a region server needs to abort like it just did. I suggest you upgrade to 0.90.0, or even consider the release candidate of 0.90.1, which can be found here http://people.apache.org/~todd/hbase-0.90.1.rc0/; this is going to help solve a lot of stability problems. Also, if you were able to reach 4097 xceivers on your datanodes, it means that you are keeping a LOT of files opened. This suggests that you either have a very small cluster or way too many files. Can you tell us more about your cluster? How big are the tables? How many region servers and how many regions do they serve? Are you using lots of families per table? Are you using LZO compression? Thanks for helping us help you :) J-D

On Thu, Feb 10, 2011 at 11:32 AM, Venkatesh vramanatha...@aol.com wrote:

Thanks J-D. Can't believe I missed that. I have had it before; I did look for it (not hard/carefully enough, I guess). This time the default was in effect; that's the reason: ...xceiverCount 4097 exceeds the limit of concurrent xcievers 4096... Thinking of doubling this. I've had so many issues in the last month - holes in meta, data node hung, etc. - and this time it was en masse.

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Thu, Feb 10, 2011 1:56 pm
Subject: Re: region servers shutdown

The first thing to do would be to look at the datanode logs at the time of the outage. Very often it's caused by either ulimit or xcievers that weren't properly configured; check out http://hbase.apache.org/notsoquick.html#ulimit J-D

On Thu, Feb 10, 2011 at 10:42 AM, Venkatesh vramanatha...@aol.com wrote:

Hi, I've had this before, but not on 70% of the cluster; region servers all dying. Any insight is helpful. Using hbase-0.20.6, hadoop-0.20.2. I don't see any error in the datanode or the namenode. many thanks. Here are the relevant log entries, in the master:

Got while writing region XXlog java.io.IOException: Bad connect ack with firstBadLink YYY
2011-02-10 01:31:26,052 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Waiting for hlog writers to terminate, iteration #9
2011-02-10 01:31:28,974 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
2011-02-10 01:31:28,975 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_1053173551314261780_21097871 bad datanode[2] nodes == null
2011-02-10 01:31:28,975 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file /hbase_data//1560386868/oldlogfile.log - Aborting...

In a region server (one of them):
2011-02-10 01:29:41,028 WARN org.apache.hadoop.hdfs.DFSClient
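The xceiver limit hit above ("xceiverCount 4097 exceeds the limit of concurrent xcievers 4096") is a datanode-side setting. Raising it is a config change in hdfs-site.xml; 8192 below is just an example value, and the right number depends on region count and open-file load:

```xml
<!-- Sketch: raising the datanode xceiver limit in hdfs-site.xml.
     Note the property name's historical misspelling is intentional. -->
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>
```

The datanodes must be restarted for the change to take effect, and as J-D notes below, a high xceiver count is usually a symptom of too many store files rather than a limit that is simply too low.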
Re: region servers shutdown
Keys are randomized. Requests are nearly equally distributed across region servers (we used to have sequential keys, but that created region hot spots). However, the current scheme requires our map reduce job to look for events in all regions (using the hbase timestamp), which hurts map-reduce performance, but it did help the real-time puts.

-Original Message-
From: Ted Dunning tdunn...@maprtech.com
To: user@hbase.apache.org
Sent: Thu, Feb 10, 2011 3:45 pm
Subject: Re: region servers shutdown

Are your keys sequential or randomized?

On Thu, Feb 10, 2011 at 12:35 PM, Venkatesh vramanatha...@aol.com wrote: iii) Processing about 600 million events per day (real-time put) - 200 bytes per put. Each event is a row in a hbase table. so 600 mill records, 1 column family, 6-10 columns iv) About 50,000 regions so far. v) we run map reduce job every nite that takes the 600 mil records updates/creates aggregate data (1 get per record) aggregate data translates to 25 mill..x 3 puts
Re: region servers shutdown
Thanks J-D, will increase MAX_FILESIZE as you suggested. I could truncate one of the tables, which constitutes 80% of the regions; will try compression after that.

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Thu, Feb 10, 2011 4:37 pm
Subject: Re: region servers shutdown

2500 regions per region server can be a lot of files to keep opened, which is probably one of the main reasons for your instability (as your regions were growing, it started poking into those dark corners of xcievers and eventually ulimits). You need to set your regions to be bigger, and use LZO compression to lower the cost of storing those events and at the same time improve performance across the board. Check the MAX_FILESIZE config for your table in the shell; I would recommend 1GB instead of the default 256MB. Then, follow this wiki to set up LZO: http://wiki.apache.org/hadoop/UsingLzoCompression Finally, you cannot merge regions (as was said in other threads this week) to bring the count back down, so one option you might consider is copying all the content from the first table to a second, better configured table. It's probably going to be a pain to do in 0.20.6 because you cannot create a table with multiple regions, so maybe that's another reason to upgrade :) Oh, and one other thing: if your zk servers are the same class of hardware as the region servers and you're not using them for anything other than HBase, then you should only use 1 zk server and collocate it with the master and the namenode, then use those 3 machines as region servers to help spread the region load. J-D

On Thu, Feb 10, 2011 at 12:35 PM, Venkatesh vramanatha...@aol.com wrote: Thanks J-D.. I was quite happy in the 1st 3 months..Last month or so, lots of instabilities..
i) It's good to know that 0.90.x fixes lots of instabilities..will consider upgrading..It is not listed as stable production release hence the hesitation :) ii) Our cluster is 20 - node (20 data nodes + 20 region servers) (data/region server on every box)..besides that 1 name node, 1 hmaster, 3 zookeper all on diff physical machines iii) hardware pentium .., 36 gig memory on each node iii) Processing about 600 million events per day (real-time put) - 200 bytes per put. Each event is a row in a hbase table. so 600 mill records, 1 column family, 6-10 columns iv) About 50,000 regions so far. v) we run map reduce job every nite that takes the 600 mil records updates/creates aggregate data (1 get per record) aggregate data translates to 25 mill..x 3 puts vi) region splits occur quite frequently..every 5 minutes or so How big are the tables? - have n't run a count on tables lately.. - events table we keep for 90 days - 600 mill record per day..we process each days data - 3 additional tables for aggregate. How many region servers - 20 and how many regions do they serve? - 50,000 regions..x-new regions get created every day..(don't have that #) Are you using lots of families per table? - No..just 1 family in all tables...# of columns 20 Are you using LZO compression? - NO thanks again for your help -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Thu, Feb 10, 2011 2:40 pm Subject: Re: region servers shutdown I see you are running on a very old version of hbase, and under that you have a version of hadoop that doesn't support appends so you are bound to have data loss on machine failure and when a region server needs to abort like it just did. I suggest you upgrade to 0.90.0, or even consider the release candidate of 0.90.1 which can be found here http://people.apache.org/~todd/hbase-0.90.1.rc0/, this is going to help solving a lot of stability problems. 
Also if you were able to reach 4097 xceivers on your datanodes, it means that you are keeping a LOT of files opened. This suggests that you either have a very small cluster or way too many files. Can you tell us more about your cluster? How big are the tables? How many region servers and how many regions do they serve? Are you using lots of families per table? Are you using LZO compression? Thanks for helping us helping you :) J-D On Thu, Feb 10, 2011 at 11:32 AM, Venkatesh vramanatha...@aol.com wrote: Thanks J-D.. Can't believe i missed that..I have had it before ..i did look for it..(not hard/carefull enough, i guess) this time deflt that's the reason ...xceiverCount 4097 exceeds the limit of concurrent xcievers 4096... ..thinking of doubling this.. I've had had so many issues in the last month..holes in meta, data node hung,..etc..this time it was enmass -Original
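J-D's MAX_FILESIZE recommendation translates to a shell alter along these lines (a sketch: the table name is a placeholder, and in HBase versions of this era the table must be disabled before altering):

```
hbase> disable 'events'
hbase> alter 'events', {MAX_FILESIZE => '1073741824'}
hbase> enable 'events'
```

The value is bytes (1073741824 = 1 GB, versus the 256 MB default), and it only affects future splits; existing regions keep their current boundaries.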
script to delete regions with no rows
Is there a script? thanks
Re: script to delete regions with no rows
Thank you.

-Original Message-
From: Stack st...@duboce.net
To: user@hbase.apache.org
Sent: Fri, Jan 28, 2011 3:43 pm
Subject: Re: script to delete regions with no rows

The end key of one region must match the start key of the next, so you can't just remove the region from .META. and its directory -- if one -- in HDFS. You'd need to adjust the start or end key on the region previous or after to include the scope of the just-removed region. There is no script to do this that I know of. Check the content of bin/*.rb. These scripts mess around with meta, adding and removing regions. They might inspire. Also look at the Merge.java class. See how it edits .META. after merging two adjacent regions to create a new region that spans the key space of the two old adjacent regions. St.Ack

On Fri, Jan 28, 2011 at 12:29 PM, Venkatesh vramanatha...@aol.com wrote: Is there a script? thanks
Re: getting retries exhausted exception
Thanks J-D I do see this in region server log 2011-01-26 03:03:24,459 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(5800409546372591083, 1000) from 172.29.253.231:35656: output error 2011-01-26 03:03:24,462 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 256 on 60020 caught: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1164) at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1125) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:615) at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:679) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:943) ... 2011-01-26 03:04:17,961 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.UnknownScannerException: Scanner was closed (timed out?) after we renewed it. 
Could be caused by a very slow scanner or a lengthy garbage collection at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1865) at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1897) at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) 2011-01-26 03:04:17,966 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 5800409546372591083 lease expired 2011-01-26 03:04:17,966 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner 4439572834176684295 lease expired -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Wed, Jan 26, 2011 5:26 pm Subject: Re: getting retries exhausted exception It seems to be coming from the region server side... so one thing you can check is the region server logs and see if the NPEs are there. If not, and there's nothing suspicious, then consider enabling DEBUG for hbase and re-run the job to hopefully get more information. J-D On Wed, Jan 26, 2011 at 8:44 AM, Venkatesh vramanatha...@aol.com wrote: Using 0.20.6..any solutions? Occurs during mapper phase..will increasing retry count fix this? thanks here's the stack trace org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row '', but failed after 10 attempts. 
Exceptions: java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException java.lang.NullPointerException at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1045) at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:2003) at org.apache.hadoop.hbase.client.HTable$ClientScanner.initialize(HTable.java:1923) at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:403) at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.restart(TableInputFormatBase.java:110) at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$TableRecordReader.nextKeyValue(TableInputFormatBase.java:210) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423) at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) at org.apache.hadoop.mapred.Child.main(Child.java:170)
Re: hbase.client.retries.number
Hi Sean: Thanks. The size of the column family is very small, ~100 bytes. Investigating potential bottleneck spots; our cluster is small (relatively speaking), 10 nodes, but our hardware is high end (not commodity). venkatesh

-Original Message-
From: Sean Bigdatafun sean.bigdata...@gmail.com
To: user@hbase.apache.org
Sent: Fri, Oct 15, 2010 5:28 pm
Subject: Re: hbase.client.retries.number

On Thu, Oct 14, 2010 at 12:03 PM, Venkatesh vramanatha...@aol.com wrote:

Thanks J-D. Yeah, found out the hard way in prod :) Set it to zero since client requests were backing up; everything stopped working, the region server wouldn't come up, etc. (did not realize the hbase client property would be used by the server :) I reverted all retries back to default. So far everything seems good (fingers crossed) after making several tunables along the way.

- Using HBase 0.20.6
- Processing about 300 million event puts
- 85% of requests are under 10 milliseconds, while the mean is about 300 milliseconds. Trying to narrow down whether that's during our client GC or an HBase pause. Tuning region server handler count.

This is way slow too.

- mapreduce job to process 40 million records takes about an hour, the majority in the reduce phase. Trying to optimize that by varying the buffer size of writes. Going to try the in_memory option as well.

This is way slow too.

- Full table scan takes about 30 minutes. Is that reasonable for a table size of 10 million records? hbase.client.scanner.caching - if set in hbase-site.xml, Scan calls should pick that up, correct?

This is way slow for a 10 million records table. What size is your column family?

thanks venkatesh

-Original Message-
From: Jean-Daniel Cryans jdcry...@apache.org
To: user@hbase.apache.org
Sent: Thu, Oct 14, 2010 2:39 pm
Subject: Re: hbase.client.retries.number

hbase.client.retries.number is used by HConnectionManager, so this means anything that uses the HBase client. I think some parts of the region server code use it, or used it at some point, I'd have to dig in.
But definitely never set this to 0, as any region move/split will kill your client, About this RetriesExhaustedException, it seems that either the region is in an unknown state or that it just took a lot of time to close and be moved. You need to correlate this with the master log (look for this region's name) since the client cannot possibly know what went on inside the cluster. Also, which version are you using? J-D On Mon, Oct 11, 2010 at 3:06 PM, Venkatesh vramanatha...@aol.com wrote: BTW..get this exception while trying a new put.. Also, get this exception on gets on some region servers org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=user_activity,1286789413060_atanackovics_30306_4a3e0812,1286789581757 for region user_activity,1286789413060_30306_4a3e0812,1286789581757, row '1286823659253_v6_1_df34b22f', but failed after 10 attempts. Exceptions: org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1149) org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230) org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666) org.apache.hadoop.hbase.client.HTable.close(HTable.java:682) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.processGetRequest(notifyEmailSendActivity.java:363) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.doGet(notifyEmailSendActivity.java:450) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) -Original Message- From: Venkatesh vramanatha...@aol.com To: user@hbase.apache.org Sent: Mon, Oct 11, 2010 2:35 pm Subject: hbase.client.retries.number HBase was seamless for first couple of weeks..now all kinds of issues in production :) fun fun.. 
Curious ..does this property have to match up on hbase client side region server side.. I've this number set to 0 on region server side default on client side.. I can't do any put (new) thanks venkatesh
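On the hbase.client.scanner.caching question above: as far as I understand, the value in the client's hbase-site.xml becomes the default for Scans that do not call setCaching() themselves. A config sketch (1000 is an example value; the old default of 1 row per RPC makes full-table scans very slow):

```xml
<!-- Sketch: client-side hbase-site.xml. Rows fetched per scanner
     RPC; larger values trade client memory for fewer round trips. -->
<property>
  <name>hbase.client.scanner.caching</name>
  <value>1000</value>
</property>
```

Per-Scan overrides via Scan.setCaching() (or HTable.setScannerCaching() in old clients) take precedence over this default.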
Re: hbase.client.retries.number
Thanks J-D Yeah..Found out the hard way in prod :) set to zero..since client requests were backing up.. everything stopped working/region server would n't come up..etc..(did not realize hbase client property would be used by server :) I reverted all retries back to default.. So far everything seems good...(fingers crossed).after making several tunables along the way.. - Using HBase 0.20.6 -Processing about 300 million event puts -85% of requests are under 10 milli.sec..while the mean is about 300 millisecs..Trying to narrow that..if it's during our client GC or Hbase pause..Tuning region server handler count -mapreduce job to process 40 million records takes about an hour..Majority in the reduce phase. Trying to optimize that..by varying buffer size of writes..Going to try the in_memory option as well. - Full table scan takes about 30 minutes..Is that reasonable for a table size of 10 mill records? hbase.client.scanner.caching - If set in hbase-site.xml, Scan calls should pick that up correct? thanks venkatesh -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Thu, Oct 14, 2010 2:39 pm Subject: Re: hbase.client.retries.number hbase.client.retries.number is used by HConnectionManager, so this means anything that uses the HBase client. I think some parts of the region server code use it, or used it at some point, I'd have to dig in. But definitely never set this to 0, as any region move/split will kill your client, About this RetriesExhaustedException, it seems that either the region is in an unknown state or that it just took a lot of time to close and be moved. You need to correlate this with the master log (look for this region's name) since the client cannot possibly know what went on inside the cluster. Also, which version are you using? J-D On Mon, Oct 11, 2010 at 3:06 PM, Venkatesh vramanatha...@aol.com wrote: BTW..get this exception while trying a new put.. 
Also, get this exception on gets on some region servers org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=user_activity,1286789413060_atanackovics_30306_4a3e0812,1286789581757 for region user_activity,1286789413060_30306_4a3e0812,1286789581757, row '1286823659253_v6_1_df34b22f', but failed after 10 attempts. Exceptions: org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1149) org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230) org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666) org.apache.hadoop.hbase.client.HTable.close(HTable.java:682) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.processGetRequest(notifyEmailSendActivity.java:363) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.doGet(notifyEmailSendActivity.java:450) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) -Original Message- From: Venkatesh vramanatha...@aol.com To: user@hbase.apache.org Sent: Mon, Oct 11, 2010 2:35 pm Subject: hbase.client.retries.number HBase was seamless for first couple of weeks..now all kinds of issues in production :) fun fun.. Curious ..does this property have to match up on hbase client side region server side.. I've this number set to 0 on region server side default on client side.. I can't do any put (new) thanks venkatesh
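For reference, the fix described in this thread amounts to restoring the client retry count to its default in hbase-site.xml. A sketch, assuming the 0.20.x property name and its shipped default (10):

```xml
<!-- hbase-site.xml: ship with client AND server configs, since some
     region server code paths read this client property too -->
<property>
  <name>hbase.client.retries.number</name>
  <value>10</value>
  <!-- Never 0: any region move or split would then fail the caller
       immediately instead of being retried. -->
</property>
```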
Re: Increase region server throughput
Thanks St.Ack. I've both of those settings (autoflush and write buffer size). I'll try new HTable(conf, ..) (I just have new HTable(table) now). Right now up to 85% under 10ms; I'm trying to bring the mean down. PS: I can tolerate some loss of data for getting better throughput. -Original Message- From: Sean Bigdatafun sean.bigdata...@gmail.com To: user@hbase.apache.org Sent: Thu, Oct 14, 2010 8:11 pm Subject: Re: Increase region server throughput Though this setup, setAutoFlush(false), increases the throughput, the data loss rate increases significantly -- there is no way for the client to know what has been lost and what has gone through. That bothers me. Sean On Tue, Oct 12, 2010 at 11:32 AM, Stack st...@duboce.net wrote: Have you played with these settings in the HTable API? http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/client/HTable.html#setAutoFlush(boolean) http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/client/HTable.html#setWriteBufferSize(long) There is something seriously wrong if you are seeing 5 seconds per put (unless your put is gigabytes in size?). Are you doing 'new HTable(tablename)' in your client or are you doing 'new HTable(conf, tablename)' in your client code? Do the latter if not -- share the configuration with HTable instances. St.Ack On Mon, Oct 11, 2010 at 10:47 PM, Venkatesh vramanatha...@aol.com wrote: I would like to tune region server to increase throughput..On a 10 node cluster, I'm getting 5 sec per put. (this is unbatched/unbuffered). Other than region server handler count property is there anything else I can tune to increase throughput? ( this operation i can't use buffered write without code change) thx venkatesh
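As a footnote to the autoflush/write-buffer discussion above: besides calling HTable.setWriteBufferSize(long) per table instance, the buffer can be set site-wide. A sketch assuming the 0.20.x property name; the value shown is the shipped default (2 MB), not a tuned recommendation:

```xml
<property>
  <name>hbase.client.write.buffer</name>
  <value>2097152</value>
  <!-- Bytes buffered client-side before a batched flush. A bigger
       buffer means fewer RPCs, but more unflushed data at risk of
       loss when autoflush is off. -->
</property>
```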
hbase.client.retries.number
HBase was seamless for the first couple of weeks; now all kinds of issues in production :) fun fun.. Curious: does this property have to match up on the HBase client side and the region server side? I've this number set to 0 on the region server side and default on the client side, and I can't do any put (new). thanks venkatesh
Re: hbase.client.retries.number
BTW..get this exception while trying a new put.. Also, get this exception on gets on some region servers org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server Some server, retryOnlyOne=true, index=0, islastrow=true, tries=9, numtries=10, i=0, listsize=1, region=user_activity,1286789413060_atanackovics_30306_4a3e0812,1286789581757 for region user_activity,1286789413060_30306_4a3e0812,1286789581757, row '1286823659253_v6_1_df34b22f', but failed after 10 attempts. Exceptions: org.apache.hadoop.hbase.client.HConnectionManager$TableServers$Batch.process(HConnectionManager.java:1149) org.apache.hadoop.hbase.client.HConnectionManager$TableServers.processBatchOfRows(HConnectionManager.java:1230) org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:666) org.apache.hadoop.hbase.client.HTable.close(HTable.java:682) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.processGetRequest(notifyEmailSendActivity.java:363) com.aol.mail.antispam.Profiler.notifyEmailSendActivity.doGet(notifyEmailSendActivity.java:450) javax.servlet.http.HttpServlet.service(HttpServlet.java:617) javax.servlet.http.HttpServlet.service(HttpServlet.java:717) -Original Message- From: Venkatesh vramanatha...@aol.com To: user@hbase.apache.org Sent: Mon, Oct 11, 2010 2:35 pm Subject: hbase.client.retries.number HBase was seamless for first couple of weeks..now all kinds of issues in production :) fun fun.. Curious ..does this property have to match up on hbase client side region server side.. I've this number set to 0 on region server side default on client side.. I can't do any put (new) thanks venkatesh
Increase region server throughput
I would like to tune the region server to increase throughput. On a 10 node cluster, I'm getting 5 sec per put (this is unbatched/unbuffered). Other than the region server handler count property, is there anything else I can tune to increase throughput? (For this operation I can't use buffered writes without a code change.) thx venkatesh
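The handler count mentioned above is a region server-side setting; a sketch of where it lives, assuming the 0.20.x property name (the value shown is the 0.20.x default, not a recommendation):

```xml
<!-- hbase-site.xml on each region server -->
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>10</value>
  <!-- Number of RPC handler threads; raising it allows more
       concurrent client requests at the cost of memory. -->
</property>
```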
Region servers suddenly disappearing
Some of the region servers are suddenly dying. I've pasted the relevant log lines; I don't see any errors in the datanodes. Any ideas? thanks venkatesh
2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
2010-10-10 12:55:36,664 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-8758558338582893960_95415 bad datanode[0] nodes == null
2010-10-10 12:55:36,665 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file /hbase_data/user_activity/compaction.dir/78194102/766401078063435042 - Aborting...
2010-10-10 12:55:36,666 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region user_activity,1286729575294_11655_614aa74e,1286729678877 java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:250)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:319)
at org.apache.hadoop.io.Text.readString(Text.java:400)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.createBlockOutputStream(DFSClient.java:2901)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2826)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288)
2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException
2010-10-10 12:55:40,176 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-568910271688144725_95415
2010-10-10 12:55:53,353 DEBUG
org.apache.hadoop.hbase.regionserver.Store: closed activities
2010-10-10 12:55:53,353 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed user_activity,1286232613677_albridgew4_18363_c45677e1,1286233511007
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: closing region user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881: disabling compactions flushes
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region, no outstanding scanners on user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region user_activity,1286202422485_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,353 DEBUG org.apache.hadoop.hbase.regionserver.Store: closed activities
2010-10-10 12:55:53,354 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed user_activity,1286202422485_bayequip_15725_a6b7893e,1286203144881
2010-10-10 12:55:53,354 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: aborting server at: 172.29.253.200:60020
2010-10-10 12:55:55,091 INFO org.apache.hadoop.hbase.Leases: regionserver/172.29.253.200:60020.leaseChecker closing leases
2010-10-10 12:55:55,091 INFO org.apache.hadoop.hbase.Leases: regionserver/172.29.253.200:60020.leaseChecker closed leases
2010-10-10 12:55:59,664 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ZooKeeper: Closing session: 0x22b967dce5d0001
2010-10-10 12:55:59,664 INFO org.apache.zookeeper.ClientCnxn: Closing ClientCnxn for session: 0x22b967dce5d0001
2010-10-10 12:55:59,669 INFO org.apache.zookeeper.ClientCnxn: Exception while closing send thread for session 0x22b967dce5d0001 : Read error rc = -1
java.nio.DirectByteBuffer[pos=0 lim=4 cap=4]
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: Disconnecting ClientCnxn for session: 0x22b967dce5d0001
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ZooKeeper: Session: 0x22b967dce5d0001 closed
2010-10-10 12:55:59,775 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Closed connection with ZooKeeper
2010-10-10 12:55:59,775 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2010-10-10 12:55:59,776 ERROR org.apache.hadoop.hdfs.DFSClient: Exception closing file /hbase_data/user_activity/78194102/activities/8044918410206348854 : java.io.EOFException
java.io.EOFException
at java.io.DataInputStream.readByte(DataInputStream.java:250)
at org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:298)
Incredibly slow response to Scan
J-D et al. I've put the mapreduce issue I had on the back burner for now. I'm getting incredibly slow response to Scan. On a 10 node cluster, for a table with 1200 regions, it takes 20 minutes to scan a column for a given value. Got 100 or so records in the response. Is this normal? thanks venkatesh PS. setCaching(100) didn't make a dent in performance
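Scanner caching can be set per-Scan (as with the setCaching(100) call above) or site-wide. A sketch of the site-wide form, assuming the 0.20.x property name (the shipped default is 1, which costs one RPC per row):

```xml
<property>
  <name>hbase.client.scanner.caching</name>
  <value>100</value>
  <!-- Rows fetched per scanner next() RPC; an explicit
       Scan.setCaching() call overrides this. -->
</property>
```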
Re: HBase map reduce job timing
Ahh, ok, that makes sense. I've a 10 node cluster, each node with 36 gig; I've allocated 4 gig for the HBase region servers. master.jsp reports used heap is less than half on each region server. I've close to 800 regions total. Guess it needs to kick off a JVM to see if data exists in all regions.. -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Oct 5, 2010 11:52 pm Subject: Re: HBase map reduce job timing Regarding number of map tasks 500+, 490 of them processing nothing, do you have an explanation for that?..Wondering if its kicking off too many JVMs most doing nothing.. This would mean that throughout your regions, only a few have data in the timestamp range you're looking for. 'top' reports less free memory (couple of gig.) though box has 36 gig total.. I don't quite trust top since cached blocks don't show up under free column even if no process is running.. You only have 1 machine? BTW how much RAM did you give to HBase? J-D
Re: HBase map reduce job timing
Also, do you think that if I query using the rowkey instead of the hbase timestamp, it would not kick off that many tasks, since the region server knows the exact locations? thanks venkatesh -Original Message- From: Venkatesh vramanatha...@aol.com To: user@hbase.apache.org Sent: Wed, Oct 6, 2010 8:53 am Subject: Re: HBase map reduce job timing Ahh ..ok..That makes sense I've a 10 node cluster each with 36 gig..I've allocated 4gig for HBase Region Servers..master.jsp reports used heap is less than half on each region server. I've close to 800 regions total..Guess it needs to kick off a jvm to see if data exists in all regions.. -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Oct 5, 2010 11:52 pm Subject: Re: HBase map reduce job timing Regarding number of map tasks 500+, 490 of them processing nothing, do you have an explanation for that?..Wondering if its kicking off too many JVMs most doing nothing.. This would mean that throughout your regions, only a few have data in the timestamp range you're looking for. 'top' reports less free memory (couple of gig.) though box has 36 gig total.. I don't quite trust top since cached blocks don't show up under free column even if no process is running.. You only have 1 machine? BTW how much RAM did you give to HBase? J-D
Re: HBase map reduce job timing
Thanks J-D. I'll hook up Ganglia (been wanting to, but kept pushing it back) and get back. V -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Wed, Oct 6, 2010 12:22 pm Subject: Re: HBase map reduce job timing Also, do you think if I query using rowkey instead of hbase time stamp..it would not kick off that many tasks.. since region server knows the exact locations? I don't see how you could do that in a scalable way, unless you really have to query a few rows (less than a million). I've a 10 node cluster each with 36 gig..I've allocated 4gig for HBase Region Servers..master.jsp reports used heap is less than half on each region server. This is Java so the reported heap doesn't mean much... the garbage collector doesn't collect aggressively since that would be awfully inefficient. I've close to 800 regions total..Guess it needs to kick off a jvm to see if data exists in all regions.. It does, and like you said the mappers take only a few minutes so optimizing that part of the job is useless until you get your reducers faster. So regarding the speed of inserts (this seems to be the real issue if what you said about the write buffer is true), I'd be interested in 1) seeing your reducer's code (strip whatever you have that's business specific) and 2) seeing some monitoring data while the job is running (if not, get ganglia in there). Inserts could be slow for many reasons apart from bad API usage, such as cluster misconfiguration, sub-optimal insertion pattern (the classic being having only 1 region), etc. J-D
HBase map reduce job timing
I've a mapreduce job that is taking too long (over an hour). Trying to see what I can tune to bring it down. One thing I noticed: the job is kicking off 500+ map tasks; 490 of them do not process any records, whereas 10 of them process all the records (200K each). Any idea why that would be? The map phase takes a couple of minutes; the reduce phase takes the rest. I'll try increasing the # of reduce tasks. Open to other suggestions for tunables. thanks for your input venkatesh
Re: HBase map reduce job timing
Sorry, yeah, I've to do some digging to provide some data. What sort of data would be helpful? Would the stats reported by jobtracker.jsp suffice? I've pasted them in this email. I can gather more jvm stats. thanks

Status: Succeeded
Started at: Tue Oct 05 21:39:58 EDT 2010
Finished at: Tue Oct 05 22:36:43 EDT 2010
Finished in: 56mins, 45sec
Job Cleanup: Successful

Kind   | % Complete | Num Tasks | Pending | Running | Complete | Killed | Failed/Killed Task Attempts
map    | 100.00%    | 565       | 0       | 0       | 565      | 0      | 0 / 11
reduce | 100.00%    | 20        | 0       | 0       | 20       | 0      | 0 / 2

Counter                                      | Map           | Reduce      | Total
Job Counters: Launched reduce tasks          | 0             | 0           | 22
Job Counters: Rack-local map tasks           | 0             | 0           | 66
Job Counters: Launched map tasks             | 0             | 0           | 576
Job Counters: Data-local map tasks           | 0             | 0           | 510
com.JobRecords: REDUCE_PHASE_RECORDS         | 0             | 597,712     | 597,712
com.JobRecords: MAP_PHASE_RECORDS            | 2,534,807     | 0           | 2,534,807
FileSystemCounters: FILE_BYTES_READ          | 335,845,726   | 861,146,518 | 1,196,992,244
FileSystemCounters: FILE_BYTES_WRITTEN       | 1,197,031,156 | 861,146,518 | 2,058,177,674
Map-Reduce Framework: Reduce input groups    | 0             | 597,712     | 597,712
Map-Reduce Framework: Combine output records | 0             | 0           | 0
Map-Reduce Framework: Map input records      | 2,534,807     | 0           | 2,534,807
Map-Reduce Framework: Reduce shuffle bytes   | 0             | 789,145,342 | 789,145,342
Map-Reduce Framework: Reduce output records  | 0             | 0           | 0
Map-Reduce Framework: Spilled Records        | 3,522,428     | 2,534,807   | 6,057,235
Map-Reduce Framework: Map output bytes       | 851,007,170   | 0           | 851,007,170
Map-Reduce Framework: Map output records     | 2,534,807     | 0           | 2,534,807
Map-Reduce Framework: Combine input records  | 0             | 0           | 0
Map-Reduce Framework: Reduce input records   | 0             | 2,534,807   | 2,534,807

-Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Oct 5, 2010 10:53 pm Subject: Re: HBase map reduce job timing I'd love to give you tips, but you didn't provide any data about the input and output of your job, the kind of hardware you're using, etc.
At this point any suggestion would be a stab in the dark, the best I can do is pointing to the existing documentation http://wiki.apache.org/hadoop/PerformanceTuning J-D On Tue, Oct 5, 2010 at 7:12 PM, Venkatesh vramanatha...@aol.com wrote: I've a mapreduce job that is taking too long..over an hour..Trying to see what can a tune to to bring it down..One thing I noticed, the job is kicking off - 500+ map tasks : 490 of them do not process any records..where as 10 of them process all the records (200 K each..)..Any idea why that would be?... ..map phase takes about couple of minutes.. ..reduce phase takes the rest.. ..i'll try increasing # of reduce tasks..Open to other other suggestion for tunables.. thanks for your input venkatesh
Re: HBase map reduce job timing
Sure..Both input output are HBase tables Input (mapper phase) - scanning a HBase table for all records within time range (using hbase timestamps) Output (reduce phase) - doing a Put to 3 different HBase tables -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Oct 5, 2010 11:14 pm Subject: Re: HBase map reduce job timing It'd be more useful if we knew where that data is coming from, and where it's going. Are you scanning HBase and/or writing to it? J-D On Tue, Oct 5, 2010 at 8:05 PM, Venkatesh vramanatha...@aol.com wrote: Sorry..yeah..i've to do some digging to provide some data.. What sort of data would be helpful? Would stats reported by jobtracker.jsp suffice? I've pasted that in this email.. I can gather more jvm stats..thanks Status: Succeeded Started at: Tue Oct 05 21:39:58 EDT 2010 Finished at: Tue Oct 05 22:36:43 EDT 2010 Finished in: 56mins, 45sec Job Cleanup: Successful Kind % Complete Num Tasks Pending Running Complete Killed Failed/Killed Task Attempts map 100.00% 565 0 0 565 0 0 / 11 reduce 100.00% 20 0 0 20 0 0 / 2 Counter Map Reduce Total Job Counters Launched reduce tasks 0 0 22 Rack-local map tasks 0 0 66 Launched map tasks 0 0 576 Data-local map tasks 0 0 510 com.JobRecords REDUCE_PHASE_RECORDS 0 597,712 597,712 MAP_PHASE_RECORDS 2,534,807 0 2,534,807 FileSystemCounters FILE_BYTES_READ 335,845,726 861,146,518 1,196,992,244 FILE_BYTES_WRITTEN 1,197,031,156 861,146,518 2,058,177,674 Map-Reduce Framework Reduce input groups 0 597,712 597,712 Combine output records 0 0 0 Map input records 2,534,807 0 2,534,807 Reduce shuffle bytes 0 789,145,342 789,145,342 Reduce output records 0 0 0 Spilled Records 3,522,428 2,534,807 6,057,235 Map output bytes 851,007,170 0 851,007,170 Map output records 2,534,807 0 2,534,807 Combine input records 0 0 0 Reduce input records 0 2,534,807 2,534,807 -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Tue, Oct 5, 2010 10:53 pm 
Subject: Re: HBase map reduce job timing I'd love to give you tips, but you didn't provide any data about the input and output of your job, the kind of hardware you're using, etc. At this point any suggestion would be a stab in the dark, the best I can do is pointing to the existing documentation http://wiki.apache.org/hadoop/PerformanceTuning J-D On Tue, Oct 5, 2010 at 7:12 PM, Venkatesh vramanatha...@aol.com wrote: I've a mapreduce job that is taking too long..over an hour..Trying to see what can a tune to to bring it down..One thing I noticed, the job is kicking off - 500+ map tasks : 490 of them do not process any records..where as 10 of them process all the records (200 K each..)..Any idea why that would be?... ..map phase takes about couple of minutes.. ..reduce phase takes the rest.. ..i'll try increasing # of reduce tasks..Open to other other suggestion for tunables.. thanks for your input venkatesh
hbase/hdfs disk usage
Hi, We have a cluster up and running in production. Firstly, thanks to J-D and the group for all the initial input/help. I'm trying to get a handle on DFS disk usage. All I'm storing in hdfs is hbase records. If I do hadoop fs -du on the /hbase_data dir, I see 1 gig so far, whereas DFS usage via the web interface reports 10 gig. That seems odd. Any ideas on what could be taking up the space? I don't have permission to look at the entire hdfs yet; just thought I'd ask the group. thanks venkatesh
Re: stop-hbase.sh takes forever (never stops)
Don't know if this helps, but here are a couple of situations where I had this issue and how I resolved it: - If zookeeper is not running (or does not have quorum) in a cluster setup, hbase does not go down; bring up zookeeper. - Make sure the pid file is not under /tmp; sometimes files get cleaned out of /tmp. Change *-env.sh to point to a different dir. -Original Message- From: Jian Lu j...@local.com To: user@hbase.apache.org user@hbase.apache.org Sent: Tue, Sep 7, 2010 5:44 pm Subject: stop-hbase.sh takes forever (never stops) Hi, could someone please tell me why stop-hbase.sh takes more than 24 hrs and is still running? I was able to start / stop hbase in the past two months. Now it suddenly stopped working. I am running hbase-0.20.4 with Linux 64-bit CPU / 64-bit operating system. I downloaded hbase-0.20.4 and run in standalone mode (http://hbase.apache.org/docs/current/api/overview-summary.html#overview_description) Thanks! Jack.
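The /tmp pid-file fix mentioned above goes in hbase-env.sh; a sketch (the directory shown is illustrative, not a recommendation):

```sh
# hbase-env.sh: keep pid files out of /tmp so periodic tmp cleanup
# cannot delete them, which would leave stop-hbase.sh with no pid
# to signal.
export HBASE_PID_DIR=/var/hbase/pids
```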
Dependencies for running mapreduce
The mapreduce job code I have (a Java app) depends on other libraries. It runs fine when the job is run locally, but when I'm running on a true distributed setup it fails on dependencies. Do I have to put all the libraries and property files my application depends on in HADOOP_CLASSPATH for the mapreduce to run in a cluster? thanks venkatesh
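One way to do this, per the HBase mapreduce package docs referenced elsewhere in this archive, is a hadoop-env.sh fragment on each tasktracker node; the paths and versions here are hypothetical and must match your install:

```sh
# hadoop-env.sh: make HBase, its conf dir, zookeeper, and any app
# dependency jars visible to MapReduce tasks.
export HADOOP_CLASSPATH="$HBASE_HOME/hbase-0.20.6.jar:$HBASE_HOME/conf:$HBASE_HOME/lib/zookeeper-3.2.2.jar:$HADOOP_CLASSPATH"
```

The alternative that avoids touching every node is to bundle dependencies in a lib/ directory inside the job jar, or to pass them with the -libjars generic option.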
jobtracker.jsp
I'm running map/reduce jobs from a java app (table mapper and reducer) in true distributed mode. I don't see anything in the jobtracker page, though the map/reduce job runs fine. Am I missing some config? thanks venkatesh
Re: jobtracker.jsp
Thanks J-D. I figured out I didn't have mapred-site.xml in my WEB-INF/classes directory (classpath). I copied that from the cluster, and that fixed part of it. Now I don't have zookeeper in hadoop-env.sh:HADOOP_CLASSPATH. I distinctly looked at this link a while ago and it didn't have zookeeper listed (I've everything else, i.e. hbase-*); perhaps I had an old link. Can all the config in mapred-site.xml be added to hbase-site.xml? It kind of works with them being separate, just wondering. Have one more question: I also have trouble stopping the namenode/datanode/jobtracker to make this classpath effective. Is there a force shutdown option (other than kill -9)? venkatesh -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Fri, Aug 27, 2010 12:10 am Subject: Re: jobtracker.jsp HBase needs to know about the job tracker, it could be on the same machine or distant, and that's taken care of by giving HBase mapred's configurations. Here's the relevant documentation : http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/package-summary.html#classpath J-D 2010/8/26 xiujin yang xiujiny...@hotmail.com: Hi When I run Hbase performance, I met the same problem. When jobs are run locally, they don't show up on the job list. Best Xiujin Yang. To: user@hbase.apache.org Subject: Re: jobtracker.jsp Date: Thu, 26 Aug 2010 22:30:09 -0400 From: vramanatha...@aol.com yeah..log says it's running Locally..i've to figure out why.. 2010-08-26 08:49:01,491 INFO Thread-16 org.apache.hadoop.mapred.MapTask - Starting flush of map output 2010-08-26 08:49:01,578 INFO Thread-16 org.apache.hadoop.mapred.TaskRunner - Task:attempt_local_0001_m_00_0 is done. And is in the process of commiting 2010-08-26 08:49:01,586 INFO Thread-16 org.apache.hadoop.mapred.LocalJobRunner - 2010-08-26 08:49:01,587 INFO Thread-16 org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_m_00_0' done.
2010-08-26 08:49:01,613 INFO Thread-16 org.apache.hadoop.mapred.LocalJobRunner - 2010-08-26 08:49:01,630 INFO Thread-16 org.apache.hadoop.mapred.Merger - Merging 1 sorted segments 2010-08-26 08:49:01,640 INFO Thread-16 org.apache.hadoop.mapred.Merger - Down to the last merge-pass, with 0 segments left of total size: 0 bytes 2010-08-26 08:49:01,640 INFO Thread-16 org.apache.hadoop.mapred.LocalJobRunner - 2010-08-26 08:49:01,658 INFO Thread-16 org.apache.hadoop.mapred.TaskRunner - Task:attempt_local_0001_r_00_0 is done. And is in the process of commiting 2010-08-26 08:49:01,659 INFO Thread-16 org.apache.hadoop.mapred.LocalJobRunner - reduce reduce 2010-08-26 08:49:01,660 INFO Thread-16 org.apache.hadoop.mapred.TaskRunner - Task 'attempt_local_0001_r_00_0' done. -Original Message- From: Jeff Zhang zjf...@gmail.com To: user@hbase.apache.org Sent: Thu, Aug 26, 2010 9:42 pm Subject: Re: jobtracker.jsp So what's the log in your client side ? On Thu, Aug 26, 2010 at 6:23 PM, Venkatesh vramanatha...@aol.com wrote: I'm running map/reduce jobs from java app (table mapper reducer) in true distributed mode..I don't see anything in jobtracker page..Map/reduce job runs fine..Am I missing some config? thanks venkatesh -- Best Regards Jeff Zhang
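The LocalJobRunner entries in the logs above are the giveaway: without a mapred-site.xml on the submitting client's classpath, mapred.job.tracker defaults to "local" and the whole job runs in-process, so it never appears on the cluster's jobtracker page. A minimal sketch (the host and port are placeholders):

```xml
<!-- mapred-site.xml on the submitting client's classpath -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:9001</value>
  <!-- Default is "local", which runs jobs via LocalJobRunner. -->
</property>
```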
Re: How to delete rows in a FIFO manner
I wrestled with the idea of time-bounded tables. Would it make it harder to write code/run map reduce on multiple tables? Also, how do you decide when to do the cutover (start of a new day, week/month, ...), and if you do, how do you process data that crosses those time boundaries efficiently? Guess that is not your requirement. If it is a fixed-time cutover, isn't it enough to set the TTL timestamp? Interesting thread, thanks -Original Message- From: Thomas Downing tdown...@proteus-technologies.com To: user@hbase.apache.org user@hbase.apache.org Sent: Fri, Aug 6, 2010 11:39 am Subject: Re: How to delete rows in a FIFO manner Thanks for the suggestions. The problem isn't generating the Delete objects, or the delete operation itself - both are fast enough. The problem is generating the list of row keys from which the Delete objects are created. For now, the obvious work-around is to create and drop tables on the fly, using HBaseAdmin, with the tables being time-bounded. When the high end of a table passes the expiry time, just drop the table. When a table is written with the first record greater than the low bound, create a new table for the next time interval. As I am having other problems related to high ingest rates, the fact may be that I am just using the wrong tool for the job. Thanks td On 8/6/2010 10:24 AM, Jean-Daniel Cryans wrote: If the inserts are coming from more than 1 client, and you are trying to delete from only 1 client, then likely it won't work. You could try using a pool of deleters (multiple threads that delete rows) that you feed from the scanner. Or you could run a MapReduce that would parallelize that for you, that takes your table as an input and that outputs Delete objects. J-D On Fri, Aug 6, 2010 at 5:50 AM, Thomas Downing tdown...@proteus-technologies.com wrote: Hi, Continuing with testing HBase suitability in a high ingest rate environment, I've come up with a new stumbling block, likely due to my inexperience with HBase.
We want to keep and purge records on a time basis: i.e., when a record is older than, say, 24 hours, we want to purge it from the database. The problem I am encountering is that the only way I've found to delete records using an arbitrary but strongly time-ordered row id is to scan for rows from lower bound to upper bound, then build an array of Delete: for each Result in the ResultScanner, add new Delete(Result.getRow()) to the Delete array. This method is far too slow to keep up with our ingest rate; the iteration over the Results in the ResultScanner is the bottleneck, even though the Scan is limited to a single small column in the column family. The obvious but naive solution is to use a sequential row id where the lower and upper bound can be known. This would allow building the array of Delete objects without a scan step. The problem with this approach is how do you guarantee a sequential and non-colliding row id across more than one Put'ing process, and do it efficiently. As it happens, I can do this, but given the details of my operational requirements, it's not a simple thing to do. So I was hoping that I had just missed something. The ideal would be a Delete object that would take row id bounds in the same way that Scan does, allowing the work to be done all on the server side. Does this exist somewhere? Or is there some other way to skin this cat? Thanks Thomas Downing
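The time-bounded-table workaround described in this thread needs a deterministic mapping from record timestamp to table name, so that all writers and the table-dropper agree on the boundaries. A minimal sketch in Python (stdlib only; the naming scheme and one-day granularity are illustrative assumptions, not from the thread):

```python
from datetime import datetime, timedelta, timezone

def table_for(epoch_seconds, prefix="events"):
    """Name of the day-bounded table that should hold a record."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return "%s_%s" % (prefix, day.strftime("%Y%m%d"))

def expired_tables(now_seconds, retention_days=1, horizon_days=7, prefix="events"):
    """Tables whose whole day range is older than the retention window,
    looking back a fixed horizon; these are safe to drop in one shot."""
    now = datetime.fromtimestamp(now_seconds, tz=timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    out = []
    # Walk from oldest candidate day toward the cutoff.
    for back in range(horizon_days, retention_days, -1):
        day = now - timedelta(days=back)
        if day.date() < cutoff.date():
            out.append("%s_%s" % (prefix, day.strftime("%Y%m%d")))
    return out
```

A writer calls table_for(record_ts) to pick the target table; a periodic job drops every table returned by expired_tables(time.time()), replacing the scan-then-Delete loop with whole-table drops.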
HTable object - how long it is valid
Hi, If I construct a new HTable() object when my app initializes, is it valid until my app is shut down? I read in earlier postings that it is better to construct HTable once for performance reasons. I wonder if the underlying connection and other resources are kept around forever for put/scan/etc. Also, when do I call close()? Upon every operation (put/get/...)? To avoid memory leaks. thanks venkatesh