Re: RegionServer can't recover after a failure

2010-08-27 Thread Andrey Timerbaev
 You are running transactional hbase?  This is intentional I take it.

 Me neither.  Let me poke the transactional fellows and see if they can
 offer help.
 
 Thanks,
 St.Ack

Yes, I'm running a transactional hbase.

Thanks in advance for involving any of the transactional experts.

Andrey






FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.

2010-08-27 Thread Shuja Rehman
Hi, I was wondering if someone could help me with an HBase setup. I have a
cluster of 2 machines and I get this error when creating a table:

FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to
localhost in a fully-distributed setup because it won't be reachable. See
Getting Started for more information.

The table is still created despite the above error. I have set up
hbase-site.xml on the master and the region server with no reference to
localhost. Here are the settings in hbase-site.xml:

<configuration>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://hadoop.server1.com:8020/hbase</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hadoop.server1.com</value>
</property>
</configuration>

Any clue???

-- 
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445


how many regions a regionserver can support

2010-08-27 Thread Jinsong Hu

Hi there,
  Does anybody know how many regions a regionserver can support? I have
regionservers with 8G RAM, 1.5T disk and a 4-core CPU.
I searched http://www.facebook.com/note.php?note_id=142473677002 and they
say Google's target is 100 regions of 200M for each regionserver.
  In my case, I have 2700 regions spread across 6 regionservers. Each region is
set to the default size of 256M, and it seems to be running fine. I am
running CDH3. I just wonder what the upper limit is so that I can do
capacity planning. Does anybody know?


Jimmy. 



Re: FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.

2010-08-27 Thread Shuja Rehman
Hi,
I am using the latest Cloudera distribution. Can you let me know where this
file could be?

Thanks

On Fri, Aug 27, 2010 at 8:23 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:

 It means that there's a zoo.cfg file somewhere on HBase's classpath
 and that it doesn't contain the right configs (it takes precedence
 over the xml configurations). Simply remove the file or remove it from
 the classpath.

 BTW which version are you using?

 J-D





-- 
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445


Re: FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.

2010-08-27 Thread Jean-Daniel Cryans
Somewhere in /etc, but you can simply issue a ps aux | grep java and
see what the classpath looks like and figure why it's taking the
zookeeper config that's shipped separately in CDH.
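
Once you have the classpath from the ps output, something like the following could check each entry for a zoo.cfg. This is only a minimal sketch, not anything shipped with HBase or CDH: the helper name find_zoo_cfg is made up for illustration, and it looks only at plain directories and jar files (it does not expand Java's dir/* wildcard entries).

```python
import os
import zipfile


def find_zoo_cfg(classpath, sep=":"):
    """Return the classpath entries that would expose a zoo.cfg resource.

    `classpath` is the raw string copied from the java command line.
    Hypothetical helper for illustration; checks directories and jars only.
    """
    hits = []
    for entry in classpath.split(sep):
        if not entry:
            continue
        if os.path.isdir(entry):
            # A directory on the classpath exposes the files directly in it.
            if os.path.isfile(os.path.join(entry, "zoo.cfg")):
                hits.append(entry)
        elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
            # A jar exposes zoo.cfg if it sits at the archive root.
            with zipfile.ZipFile(entry) as jar:
                if "zoo.cfg" in jar.namelist():
                    hits.append(entry)
    return hits
```

Pass it the string that follows -classpath (or -cp) in the ps output; any entry it returns is a candidate for removal from the classpath.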

You can also post that question on their getsatisfaction page since
it's Cloudera-specific and unrelated to how Apache HBase is shipped.
It's not the first time I've seen a CDH3 user with this issue.

J-D



RE: how many regions a regionserver can support

2010-08-27 Thread Jonathan Gray
There is no fixed limit; it has much more to do with the read/write load than
the actual dataset size.

HBase is usually fine with very densely packed RegionServers if much of the
data is rarely accessed.  If you have extremely high numbers of regions per
server and you are writing to all of these regions, or even reading from all of
them, you could have issues.  Though storage capacity needs to be considered,
capacity planning often has much more to do with how much memory you need to
support the read/write load you expect.  For reads this is mostly a performance
concern, but for writes there are some important considerations related to the
number of regions per server (and thus data density and determining your max
region size).

In any case, you should probably increase your max size to 1GB or so and can go 
higher if necessary.
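
To put numbers on the figures in the question, here is a rough back-of-the-envelope calculation. It is a sketch under two assumptions that are not from the thread: regions are spread evenly across servers, and every region is completely full, so the data figure is an upper bound.

```python
import math

REGIONS = 2700            # from the question
SERVERS = 6               # from the question
REGION_MAX_MB = 256       # default max region size mentioned in the question
NEW_REGION_MAX_MB = 1024  # the "1GB or so" suggested above

regions_per_server = REGIONS / SERVERS
# Upper bound on data per server, assuming every region is completely full.
max_data_mb = regions_per_server * REGION_MAX_MB
# Regions per server the same data would occupy if split at the larger size.
regions_at_1gb = math.ceil(max_data_mb / NEW_REGION_MAX_MB)

print(regions_per_server)   # 450.0
print(max_data_mb / 1024)   # 112.5 (GB)
print(regions_at_1gb)       # 113
```

So the cluster in the question is at about 450 regions per server, and the same amount of data split at 1GB would be closer to 113 regions per server.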

JG




Surge 2010 Early Registration ends Tuesday!

2010-08-27 Thread Jason Dixon
Early Bird Registration for Surge Scalability Conference 2010 ends next
Tuesday, August 31.  We have a killer lineup of speakers and architects
from across the Internet.  Listen to experts talk about the newest
methods and technologies for scaling your Web presence.

http://omniti.com/surge/2010/register

This year's event is all about the challenges faced (and overcome) in
real-life production architectures.  Meet the engineering talent from
some of the best and brightest throughout the Internet:

John Allspaw, Etsy
Theo Schlossnagle, OmniTI
Bryan Cantrill, Joyent
Rasmus Lerdorf, creator of PHP
Tom Cook, Facebook
Benjamin Black, fast_ip
Christopher Brown, Opscode
Artur Bergman, Wikia
Baron Schwartz, Percona
Paul Querna, Cloudkick

Surge 2010 takes place at the Tremont Grand Historic Venue on Sept 30
and Oct 1, 2010 in Baltimore, MD.  Register NOW for the Early Bird
discount and guarantee your seat to this year's event!


-- 
Jason Dixon
OmniTI Computer Consulting, Inc.
jdi...@omniti.com
443.325.1357 x.241


HBase Query

2010-08-27 Thread Shuja Rehman
Hi

I am trying to implement the equivalent of this query in HBase:

SELECT value1, value2, value3, (value1+value2) AS calculatedValue
FROM myTable
WHERE year=2010 AND month IN (2,4,6) AND day='Friday' AND time='14:30'
  AND Product='Abc'

The HBase table has a row key of the form YYYYMMDDHHMMSS, e.g. 20100809023000,
and product information is in the ProductInfo column family.

Can anybody help me with how to proceed in this scenario?

Thanks in advance

-- 
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445


Dependencies for running mapreduce

2010-08-27 Thread Venkatesh

 

The MapReduce job code I have (a Java app) depends on other libraries. It runs
fine when the job is run locally, but when I run it on a truly distributed setup
it fails on dependencies. Do I have to put all the libraries and property files
my application depends on in HADOOP_CLASSPATH for the MapReduce job to run in a
cluster?

thanks
venkatesh




Re: HBase Query

2010-08-27 Thread Jean-Daniel Cryans
With respect to querying, HBase should be treated as an (augmented)
key/value store, so it doesn't support ad hoc queries.

The closest this translates to in HBase would be for you to scan your
table for all values between row key 201002 and 201008312359
and then filter on the month, calculate whether the returned row is a
Friday, that time=1430, and that productinfo=abc. This can possibly
scan a lot of rows, taking a lot of time, and cannot be used to
populate user-facing UIs.

You could also run it in 3 parallel scans, one for each month, then
filter the remaining and join (in your application) the results to
make it faster.
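
The per-month ranges and the client-side filter could look roughly like this. It is a sketch only: it assumes the row key is YYYYMMDDHHMMSS (as the example key 20100809023000 suggests), the function names are made up, and no HBase client calls are shown. The stop row is written as the first key of the following month because scan stop rows in HBase are exclusive.

```python
from datetime import datetime


def month_scan_ranges(year, months):
    """Yield (start_row, stop_row) pairs, one per month; stop_row is exclusive."""
    for m in months:
        start = f"{year:04d}{m:02d}01000000"
        ny, nm = (year + 1, 1) if m == 12 else (year, m + 1)
        stop = f"{ny:04d}{nm:02d}01000000"
        yield start, stop


def matches(row_key, product, want_product="Abc"):
    """Client-side filter: Friday, 14:30, and the requested product."""
    ts = datetime.strptime(row_key, "%Y%m%d%H%M%S")
    return (ts.weekday() == 4            # Monday is 0, so Friday is 4
            and (ts.hour, ts.minute) == (14, 30)
            and product == want_product)


print(list(month_scan_ranges(2010, [2, 4, 6])))
print(matches("20100205143000", "Abc"))  # True: 5 Feb 2010 was a Friday
print(matches("20100809023000", "Abc"))  # False: a Monday at 02:30
```

Each (start, stop) pair would feed one scan, and matches() would be applied to every returned row before joining the results in the application.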

Good luck!

J-D




Re: RegionServer can't recover after a failure

2010-08-27 Thread James Kennedy
I can help. I'm a developer on the transactional hbase extension which you must 
be using.

The issue is that the global transaction log table is not created yet. You can 
do so simply by calling
HBaseBackedTransactionLogger.createTable() at the time when you are seeding the 
rest of your tables.

I apologize that the extension as given in GitHub is not yet mature.  While it 
works (HBase 0.21) it is poorly documented and needs more thorough testing.

We have recently updated it to work with HBase 0.89.20100726 and it is much 
more stable and very slightly better documented. We are waiting for an HBase 
patch submission before we push it to hbase-trx at github.

Thanks,

James Kennedy
Project Manager
Troove Inc.

On 2010-08-26, at 8:16 AM, Andrey Timerbaev wrote:

 Dear experts,
 
 Could you kindly suggest, how to help the RegionServer to complete
 initialization in the following situation:
 
 After a failure of one of the RegionServers, which is running on a dedicated
 node in a HBase/Hadoop cluster (HBase v.0.20.3), the RegionServer can't
 initialize available tables. The region server's log contains this exception:
 
 2010-08-26 18:56:49,073 INFO org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager: Region log has 9 unfinished transactions. Going to the transaction log to resolve
 2010-08-26 18:56:49,091 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit for row  in tableName .META.: location server 10.2.146.41:60020, location region name .META.,,1
 2010-08-26 18:56:49,178 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening STAT_STARTUPS_TABLE,,1282738349665
 java.lang.RuntimeException: Table not created. Call createTable() first
     at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.initTable(HBaseBackedTransactionLogger.java:76)
     at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.init(HBaseBackedTransactionLogger.java:69)
     at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getGlobalTransactionLog(THLogRecoveryManager.java:256)
     at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.resolvePendingTransaction(THLogRecoveryManager.java:225)
     at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getCommitsFromLog(THLogRecoveryManager.java:206)
     at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion.doReconstructionLog(TransactionalRegion.java:145)
     at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:326)
     at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegionServer.instantiateRegion(TransactionalRegionServer.java:121)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1531)
     at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1451)
     at java.lang.Thread.run(Thread.java:619)
 
 After a look into the HBase source code I found out that the "Table not
 created. Call createTable() first" message appears if the
 HBaseBackedTransactionLogger is unable to find the __GLOBAL_TRX_LOG__ table.
 But I have no idea where the table should be, whether it is critical, or what
 I should do in this situation.
 
 Any help is appreciated.
 
 Andrey
 
 



Re: HBase Query

2010-08-27 Thread Daniel Einspanjer
Xavier recently mentioned some code we use at Mozilla that should help here.
It is a unioning scanner that would let you define a list of scan ranges to run
for the job. You'd set a prefix for each Friday in the selected months and the
range of 143000 through 143060.
Then you'd apply a filter on the product.
When it runs, it would generate all the scans you need and run them in a single 
job.
If you want to give it a try, search the mailing list archive. I'll check with 
Xavier to see if the code can be cleaned up and posted somewhere too.
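
Until that code is posted, the list of ranges such a scanner would need can be generated along these lines. This is a sketch, not the Mozilla code: it assumes a YYYYMMDDHHMMSS row key, the function name is made up, and it uses an exclusive stop key one minute later (143100) instead of the 143060 bound mentioned above.

```python
import calendar
from datetime import datetime, timedelta


def friday_scan_ranges(year, months, hour=14, minute=30):
    """Return one (start_row, stop_row) pair per Friday; stop_row is exclusive."""
    fmt = "%Y%m%d%H%M%S"
    ranges = []
    for m in months:
        days_in_month = calendar.monthrange(year, m)[1]
        for d in range(1, days_in_month + 1):
            if calendar.weekday(year, m, d) == calendar.FRIDAY:
                start = datetime(year, m, d, hour, minute)
                stop = start + timedelta(minutes=1)
                ranges.append((start.strftime(fmt), stop.strftime(fmt)))
    return ranges


ranges = friday_scan_ranges(2010, [2, 4, 6])
print(len(ranges))   # 13 Fridays in Feb, Apr and Jun 2010
print(ranges[0])     # ('20100205143000', '20100205143100')
```

Each pair covers only the 14:30 minute of one Friday, so the product filter is the only thing left to apply to the returned rows.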

-Daniel



Initial region loads in hbase..

2010-08-27 Thread Vidhyashankar Venkataraman
Hi guys,
  A couple of days back, I posted a problem about regions taking too much time
to load when I restart HBase. I have a table that has around 80 K regions on
650 nodes (!).
  I was checking the logs in the master and I noticed that the time from
assigning a region to a region server to the point when the master recognizes
that the region is open on that server is around 20-30 minutes!
   Apart from the master being the bottleneck here, can you guys let me know
what the other possible causes are for why this may happen?

Thank you
Vidhya

Below is an example for region with start key 003404803994 where the 
assignment takes place at 22:59 while the confirmation that it got open came at 
23:19...

2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: 
Current assignment of 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is not 
valid;  serverAddress=, startCode=0 unknown.
2010-08-27 22:59:02,643 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED region 
73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
NodeCreated with state SyncConnected with path 
/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG 
org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated 
with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG 
org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
zkEvent NodeCreated state:SyncConnected 
path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: 
b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMasterCreated
 ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
Created/updated UNASSIGNED zNode 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state 
M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,646 DEBUG 
org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ 
M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG 
org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event 
NodeChildrenChanged with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG 
org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to 
assign=1, regions to give other servers than this=0, isMetaAssign=false
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: 
Assigning region 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to 
b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 22:59:02,653 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update 
UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists in 
state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
Created UNASSIGNED zNode 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state 
M2ZK_REGION_OFFLINE




AND THEN,

2010-08-27 23:18:53,591 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. open on 
b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 23:18:53,591 INFO 
org.apache.hadoop.hbase.master.RegionServerOperation: Updated row 
DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in 
region .META.,,1 with startcode=1282940735627, 
server=b3130020.yst.yahoo.net:60020
2010-08-27 23:18:53,677 DEBUG 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Deleting ZNode 
/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper as region is 
open...
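
For anyone wanting to measure this delay across all regions rather than by eye, a small script along these lines could pair up the two master log messages. It is a sketch: the regular expressions are written against the two INFO lines shown above (unwrapped onto single lines, since the mail client broke them up), and the function name is made up.

```python
import re
from datetime import datetime

TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
ASSIGN = re.compile(TS + r" INFO \S+RegionManager: Assigning region (\S+) to ")
OPENED = re.compile(TS + r" INFO \S+RegionServerOperation: (\S+) open on ")
FMT = "%Y-%m-%d %H:%M:%S,%f"


def open_delays(lines):
    """Map region name -> seconds between 'Assigning region' and 'open on'."""
    assigned, delays = {}, {}
    for line in lines:
        m = ASSIGN.match(line)
        if m:
            assigned[m.group(2)] = datetime.strptime(m.group(1), FMT)
            continue
        m = OPENED.match(line)
        if m and m.group(2) in assigned:
            opened = datetime.strptime(m.group(1), FMT)
            delays[m.group(2)] = (opened - assigned[m.group(2)]).total_seconds()
    return delays


REGION = "DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1."
sample = [
    "2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: "
    "Assigning region " + REGION + " to b3130020.yst.yahoo.net,60020,1282940735627",
    "2010-08-27 23:18:53,591 INFO org.apache.hadoop.hbase.master.RegionServerOperation: "
    + REGION + " open on b3130020.yst.yahoo.net,60020,1282940735627",
]
delays = open_delays(sample)
print(delays)  # about 1190.9 seconds, i.e. just under 20 minutes
```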




Re: Initial region loads in hbase..

2010-08-27 Thread Ryan Rawson
Hi,

No one has run HBase at that scale that I know of.  Perhaps you can
poke at the problem some more and come up with a working theory and
possibly a code fix?

Good luck!
-ryan






Re: Initial region loads in hbase..

2010-08-27 Thread Stack
In 0.20, open on a regionserver is single-threaded.  Could that be it?
The server has lots of regions to open and it's taking time?  Is the
meta table being beat up?  Could this be holding up region opens?
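
As a rough sanity check on the single-threaded theory, the numbers from the original post can be plugged in as follows. This is a sketch under strong assumptions that are not from the thread: regions are spread evenly, opens are strictly serial, and the region in the posted log was the last one in its server's queue.

```python
REGIONS = 80_000         # "around 80 K regions"
NODES = 650              # "650 nodes"
OBSERVED_DELAY_S = 1191  # roughly 22:59:02 to 23:18:53 in the posted log

regions_per_server = REGIONS / NODES
# If opens were strictly serial and the last region waited for all the others,
# this is the average time each open would have to take.
implied_seconds_per_open = OBSERVED_DELAY_S / regions_per_server

print(round(regions_per_server))            # 123
print(round(implied_seconds_per_open, 1))   # 9.7
```

If individual opens in the regionserver log take nowhere near that long, serial opening alone probably does not explain the delay.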

Good luck V,
St.Ack



Re: Initial region loads in hbase..

2010-08-27 Thread Stack
Is the master working hard?  If so, maybe make this number bigger:

  <property>
    <name>hbase.regionserver.msginterval</name>
    <value>1000</value>
    <description>Interval between messages from the RegionServer to HMaster
    in milliseconds. Use a high value like 3000 for clusters with more than 10
    nodes. Default is 1 second so that HBase seems more 'live'.
    </description>
  </property>

All regionservers check in every second by default.  When they check
in during startup, they'll report status on the regions that are
currently being opened... maybe this is flooding the master?

For your bigger cluster you should probably up the above number anyways?
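
To see what that means for a 650-node cluster, the average check-in rate at the master can be estimated like this. It is a simple sketch that assumes check-ins are evenly spread over time; the function name is made up.

```python
NODES = 650  # cluster size from the original question


def checkins_per_second(nodes, msginterval_ms):
    """Average regionserver -> master messages per second."""
    return nodes * 1000.0 / msginterval_ms


for interval_ms in (1000, 3000):
    print(interval_ms, round(checkins_per_second(NODES, interval_ms), 1))
# 1000 650.0
# 3000 216.7
```

Going from the 1000 ms default to 3000 ms cuts the average load on the master from about 650 to about 217 messages per second.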

Are the regions being assigned out slowly?  If you tail regionserver
log, it'll report OPENING messages... are regions taking a long time
to move to OPEN?

St.Ack



On Fri, Aug 27, 2010 at 5:35 PM, Stack st...@duboce.net wrote:
 In 0.20, open on a regionserver is single-threaded.  Could that be it?
  The server has lots of regions to open and its taking time?  Is the
 meta table being beat up?  Could this be holding up region opens?

 Good luck V,
 St.Ack


 On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman
 vidhy...@yahoo-inc.com wrote:
 Hi guys,
  A couple of days back, I had posted a problem on regions taking too much 
 time to load when I restart Hbase.. I have a table that has around 80 K 
 regions on 650 nodes (!) ..
  I was checking the logs in the master and I notice that the time it takes 
 from assigning a region to a region server to the point when it recognizes 
 that the region is open in that server takes around 20-30 minutes!
   Apart from master being the bottleneck here, can you guys let me know what 
 the other possible cases are as to why this may happen?

 Thank you
 Vidhya

 Below is an example for region with start key 003404803994 where the 
 assignment takes place at 22:59 while the confirmation that it got open came 
 at 23:19...

 2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: 
 Current assignment of 
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is 
 not valid;  serverAddress=, startCode=0 unknown.
 2010-08-27 22:59:02,643 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED 
 region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
 2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
 NodeCreated with state SyncConnected with path 
 /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
 2010-08-27 22:59:02,645 DEBUG 
 org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated 
 with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
 2010-08-27 22:59:02,645 DEBUG 
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
 zkEvent NodeCreated state:SyncConnected 
 path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
 2010-08-27 22:59:02,645 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: 
 b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMasterCreated
  ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
 2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
 Created/updated UNASSIGNED zNode 
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in 
 state M2ZK_REGION_OFFLINE
 2010-08-27 22:59:02,646 DEBUG 
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ 
 M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
 2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event 
 NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
 2010-08-27 22:59:02,646 DEBUG 
 org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event 
 NodeChildrenChanged with path /hbase/UNASSIGNED
 2010-08-27 22:59:02,646 DEBUG 
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got 
 zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
 2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
 Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
 load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to 
 assign=1, regions to give other servers than this=0, isMetaAssign=false
 2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: 
 Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, 
 load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
 2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: 
 Assigning region 
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to 
 b3130020.yst.yahoo.net,60020,1282940735627
 2010-08-27 22:59:02,653 DEBUG 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update 
 UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it 

RE: Initial region loads in hbase..

2010-08-27 Thread Jonathan Gray
Vidhya,

Could you post a snippet of an RS log during this time?  You should be able to 
see what's happening between when the OPEN message gets there and the OPEN 
completes.

Like Stack said, it's probably that region opening is single-threaded in the 
version you're using and, with all the file opening, your NN and DNs are under 
heavy load.  Do you see io-wait or anything else jump up across the cluster at 
this time?  Do you have Ganglia set up on this bad boy?

JG
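
To put a number on the gap between assignment and open, the two master log messages quoted in this thread ("Assigning region ... to ..." and "... open on ...") can be paired up per region. This is a minimal sketch, not an HBase tool: the regular expressions are modelled only on the lines shown here, it assumes each log record has been re-joined onto a single line (the mail client wrapped them), and the function name is hypothetical.

```python
import re
from datetime import datetime

TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS = r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
# Patterns modelled on the two master log messages quoted in this thread.
ASSIGN_RE = re.compile(TS + r" INFO \S+RegionManager: Assigning region (\S+) to (\S+)")
OPEN_RE = re.compile(TS + r" INFO \S+RegionServerOperation: (\S+) open on (\S+)")


def open_delays(lines):
    """Return {region_name: seconds from assignment to reported open}."""
    assigned = {}
    delays = {}
    for line in lines:
        m = ASSIGN_RE.search(line)
        if m:
            assigned[m.group(2)] = datetime.strptime(m.group(1), TS_FORMAT)
            continue
        m = OPEN_RE.search(line)
        if m and m.group(2) in assigned:
            opened = datetime.strptime(m.group(1), TS_FORMAT)
            delays[m.group(2)] = (opened - assigned[m.group(2)]).total_seconds()
    return delays


# The two records from Vidhya's master log, unwrapped onto single lines.
sample = [
    "2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: "
    "Assigning region DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. "
    "to b3130020.yst.yahoo.net,60020,1282940735627",
    "2010-08-27 23:18:53,591 INFO org.apache.hadoop.hbase.master.RegionServerOperation: "
    "DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. "
    "open on b3130020.yst.yahoo.net,60020,1282940735627",
]
for region, seconds in open_delays(sample).items():
    print(f"{region}: {seconds / 60:.1f} min")
```

Run over a whole master log, the same pairing would show whether the delay is uniform or concentrated on a few servers. It says nothing about what the regionserver was doing in between, which is why the RS log is still needed.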

 -Original Message-
 From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of
 Stack
 Sent: Friday, August 27, 2010 5:36 PM
 To: user@hbase.apache.org
 Subject: Re: Initial region loads in hbase..
 
 In 0.20, open on a regionserver is single-threaded.  Could that be it?
  The server has lots of regions to open and it's taking time?  Is the
 meta table being beat up?  Could this be holding up region opens?
 
 Good luck V,
 St.Ack
 
 
 On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman
 vidhy...@yahoo-inc.com wrote:
  Hi guys,
   A couple of days back, I had posted a problem on regions taking too
 much time to load when I restart Hbase.. I have a table that has around
 80 K regions on 650 nodes (!) ..
   I was checking the logs in the master and I notice that the time it
 takes from assigning a region to a region server to the point when it
 recognizes that the region is open in that server takes around 20-30
 minutes!
    Apart from master being the bottleneck here, can you guys let me
 know what the other possible cases are as to why this may happen?
 
  Thank you
  Vidhya
 
  Below is an example for region with start key 003404803994 where
 the assignment takes place at 22:59 while the confirmation that it got
 open came at 23:19...
 
  2010-08-27 22:59:02,642 DEBUG
 org.apache.hadoop.hbase.master.BaseScanner: Current assignment of
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
 is not valid;  serverAddress=, startCode=0 unknown.
  2010-08-27 22:59:02,643 DEBUG
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED
 region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
  2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster:
 Event NodeCreated with state SyncConnected with path
 /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
  2010-08-27 22:59:02,645 DEBUG
 org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event
 NodeCreated with path
 /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
  2010-08-27 22:59:02,645 DEBUG
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS:
 Got zkEvent NodeCreated state:SyncConnected
 path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
  2010-08-27 22:59:02,645 DEBUG
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper:
 b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b
 3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.h
 base.master.HMasterCreated ZNode
 /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
  2010-08-27 22:59:02,646 DEBUG
 org.apache.hadoop.hbase.master.RegionManager: Created/updated
 UNASSIGNED zNode
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
 in state M2ZK_REGION_OFFLINE
  2010-08-27 22:59:02,646 DEBUG
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [
 M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
  2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster:
 Event NodeChildrenChanged with state SyncConnected with path
 /hbase/UNASSIGNED
  2010-08-27 22:59:02,646 DEBUG
 org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event
 NodeChildrenChanged with path /hbase/UNASSIGNED
  2010-08-27 22:59:02,646 DEBUG
 org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS:
 Got zkEvent NodeChildrenChanged state:SyncConnected
 path:/hbase/UNASSIGNED
  2010-08-27 22:59:02,647 DEBUG
 org.apache.hadoop.hbase.master.RegionManager: Assigning for
 serverName=b3130020.yst.yahoo.net,60020,1282940735627,
 load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total
 nregions to assign=1, regions to give other servers than this=0,
 isMetaAssign=false
  2010-08-27 22:59:02,647 DEBUG
 org.apache.hadoop.hbase.master.RegionManager: Assigning
 serverName=b3130020.yst.yahoo.net,60020,1282940735627,
 load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
  2010-08-27 22:59:02,647 INFO
 org.apache.hadoop.hbase.master.RegionManager: Assigning region
 DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1.
 to b3130020.yst.yahoo.net,60020,1282940735627
  2010-08-27 22:59:02,653 DEBUG
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update
 UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists
 in state = M2ZK_REGION_OFFLINE
  2010-08-27 22:59:02,653 DEBUG
 org.apache.hadoop.hbase.master.RegionManager: Created UNASSIGNED zNode
 

regionserver skew

2010-08-27 Thread Sharma, Avani
I have a few questions related to reading from HBase:

1.   How can I detect regionserver skew, in other words, one regionserver 
being hit more than the others?

When I look at the master log, it states
org.apache.hadoop.hbase.master.ServerManager: 3 region servers, 0 dead, 
average load 23.668

Does that mean that the load is balanced? And in case it is not, do I need to 
redesign or reload my HBase table? Are there any other options?

2.   Is it okay to have Stargate running on more than one node in the 
cluster? I am using Stargate and libcurl to read from HBase, and to speed this 
up, maybe hitting different Stargate servers could help? Any cons to this?

3.   Is there a way I can get more than one version of a row via Stargate? 
I tried the URL with ?v=2 at the end, but it did not work.

Thanks,
Avani Sharma
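
On question 1, one possible way to quantify skew is to compare the busiest server's request count with the mean across servers. This is only a sketch: the server names and request counts below are hypothetical, and how you collect per-server request counts (master UI, metrics, logs) is left open. As far as I know, the master's "average load" figure is the mean number of regions per regionserver, so on its own it would not reveal request skew.

```python
from statistics import mean


def skew_ratio(requests_by_server):
    """Ratio of the busiest server's request count to the mean (1.0 means even)."""
    if not requests_by_server:
        raise ValueError("need at least one server")
    avg = mean(requests_by_server.values())
    if avg == 0:
        return 0.0
    return max(requests_by_server.values()) / avg


# Hypothetical request counts for a 3-regionserver cluster.
requests = {"rs1": 900, "rs2": 150, "rs3": 150}
print(f"skew ratio: {skew_ratio(requests):.2f}")
print("busiest server:", max(requests, key=requests.get))
```

A ratio close to 1.0 suggests reads are spread evenly; a ratio well above 1.0 points at a hot server, which may in turn point at a hot key range.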