Re: RegionServer can't recover after a failure
Stack wrote:

You are running transactional hbase? This is intentional I take it. Me neither. Let me poke the transactional fellows and see if they can offer help.

Thanks,
St.Ack

Yes, I'm running transactional HBase. Thanks in advance for involving any of the transactional experts.

Andrey
FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.
Hi,

I was wondering if someone could help me with an HBase setup. I have a cluster of 2 machines and I get this error when creating a table:

FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.

The table is still created despite the error. I have set up hbase-site.xml on the master and the region server with no reference to localhost. Here are the settings in hbase-site.xml:

<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop.server1.com:8020/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop.server1.com</value>
  </property>
</configuration>

Any clue?

--
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445
how many regions a regionserver can support
Hi there,

Does anybody know how many regions a regionserver can support? I have regionservers with 8 GB RAM, 1.5 TB disk, and a 4-core CPU. I searched http://www.facebook.com/note.php?note_id=142473677002 and they say Google's target is 100 regions of 200 MB for each regionserver. In my case, I have 2700 regions spread across 6 regionservers. Each region is set to the default size of 256 MB, and it still seems to be running fine. I am running CDH3. I just wonder what the upper limit is so that I can do capacity planning. Does anybody know?

Jimmy
Re: FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.
Hi,

I am using the latest Cloudera distribution. Can you let me know where this file could be?

Thanks

On Fri, Aug 27, 2010 at 8:23 PM, Jean-Daniel Cryans jdcry...@apache.org wrote:

It means that there's a zoo.cfg file somewhere on HBase's classpath and that it doesn't contain the right configs (it takes precedence over the xml configurations). Simply remove the file or remove it from the classpath. BTW which version are you using?

J-D

On Fri, Aug 27, 2010 at 2:49 AM, Shuja Rehman shujamug...@gmail.com wrote:

hi i was wondering if someone could help me with an hbase setup. i have a cluster of 2 machines and I get this error when creating a table FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable.

--
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445
Re: FATAL zookeeper.HQuorumPeer: The server in zoo.cfg cannot be set to localhost in a fully-distributed setup because it won't be reachable. See Getting Started for more information.
Somewhere in /etc, but you can simply issue a ps aux | grep java, see what the classpath looks like, and figure out why it's picking up the ZooKeeper config that's shipped separately in CDH.

You can also post that question on their getsatisfaction page, since it's Cloudera-specific and unrelated to how Apache HBase is shipped. It's not the first time I've seen a CDH3 user with this issue.

J-D

On Fri, Aug 27, 2010 at 10:30 AM, Shuja Rehman shujamug...@gmail.com wrote:

Hi I am using Cloudera Latest Distribution. Can u let me where this file could be? Thanks
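Once the classpath has been copied out of the ps output, the search for the stray zoo.cfg can be scripted. The following is only a minimal sketch: find_zoo_cfg is a hypothetical helper, not part of HBase or CDH, and it assumes the classpath is a plain separator-delimited string of directories and jar files (wildcard entries such as lib/* are not expanded).

```python
import os
import zipfile


def find_zoo_cfg(classpath, sep=os.pathsep):
    """Return the classpath entries that would expose a zoo.cfg to the JVM.

    Hypothetical helper: checks plain directories and the root of jar files.
    """
    hits = []
    for entry in classpath.split(sep):
        if os.path.isdir(entry):
            # A directory on the classpath exposes every file directly inside it.
            if os.path.isfile(os.path.join(entry, "zoo.cfg")):
                hits.append(entry)
        elif os.path.isfile(entry) and zipfile.is_zipfile(entry):
            # A jar is a zip archive; look for zoo.cfg at its root.
            with zipfile.ZipFile(entry) as jar:
                if "zoo.cfg" in jar.namelist():
                    hits.append(entry)
    return hits
```

Calling find_zoo_cfg with the value of the -classpath argument of the HBase java process lists the entries to remove or clean up.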
RE: how many regions a regionserver can support
There is no fixed limit; it has much more to do with the read/write load than the actual dataset size. HBase is usually fine with very densely packed RegionServers if much of the data is rarely accessed. If you have extremely high numbers of regions per server and you are writing to all of these regions, or even reading from all of them, you could have issues.

Though storage capacity needs to be considered, capacity planning often has much more to do with how much memory you need to support the read/write load you expect. For reads this is mostly a performance question, but for writes there are some important considerations related to the number of regions per server (and thus data density and determining your max region size).

In any case, you should probably increase your max size to 1GB or so, and you can go higher if necessary.

JG

-----Original Message-----
From: Jinsong Hu [mailto:jinsong...@hotmail.com]
Sent: Friday, August 27, 2010 10:03 AM
To: user@hbase.apache.org
Subject: how many regions a regionserver can support

Hi, There : Does anybody know how many region a regionserver can support ? I have regionservers with 8G ram and 1.5T disk and 4 core CPU.
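To put numbers on the density discussed in this thread, here is a small back-of-the-envelope sketch. The function names are made up for illustration, and the second helper assumes every region is already full at the old maximum size, which overstates the data volume.

```python
import math


def region_density(total_regions, servers, region_size_mb):
    """Regions per server and the data per server (GB) if every region is full."""
    per_server = total_regions / servers
    data_gb = per_server * region_size_mb / 1024
    return per_server, data_gb


def regions_at_new_size(total_regions, old_size_mb, new_size_mb):
    """Upper bound on the region count after raising the max region size."""
    return math.ceil(total_regions * old_size_mb / new_size_mb)


# Figures from the question: 2700 regions, 6 regionservers, 256 MB regions.
per_server, data_gb = region_density(2700, 6, 256)
print(per_server, data_gb)                   # 450.0 112.5
print(regions_at_new_size(2700, 256, 1024))  # 675
```

With these inputs, each server carries 450 regions (at most about 112.5 GB), and the same data at a 1 GB maximum would fit in at most 675 regions, or roughly 113 per server.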
Surge 2010 Early Registration ends Tuesday!
Early Bird Registration for Surge Scalability Conference 2010 ends next Tuesday, August 31. We have a killer lineup of speakers and architects from across the Internet. Listen to experts talk about the newest methods and technologies for scaling your Web presence.

http://omniti.com/surge/2010/register

This year's event is all about the challenges faced (and overcome) in real-life production architectures. Meet the engineering talent from some of the best and brightest throughout the Internet:

John Allspaw, Etsy
Theo Schlossnagle, OmniTI
Bryan Cantrill, Joyent
Rasmus Lerdorf, creator of PHP
Tom Cook, Facebook
Benjamin Black, fast_ip
Christopher Brown, Opscode
Artur Bergman, Wikia
Baron Schwartz, Percona
Paul Querna, Cloudkick

Surge 2010 takes place at the Tremont Grand Historic Venue on Sept 30 and Oct 1, 2010 in Baltimore, MD. Register NOW for the Early Bird discount and guarantee your seat to this year's event!

--
Jason Dixon
OmniTI Computer Consulting, Inc.
jdi...@omniti.com
443.325.1357 x.241
HBase Query
Hi,

I am trying to implement the equivalent of this query in HBase:

SELECT value1, value2, value3, (value1+value2) as calculatedValue
FROM myTable
WHERE year=2010 and month in (2,4,6) and day='Friday' and time='14:30' and Product='Abc'

The HBase table has a row key of the form YYYYMMDDHHMMSS, e.g. 20100809023000, and the product information is in the ProductInfo column family.

Can anybody help me with how to proceed in this scenario? Thanks in advance.

--
Regards
Shuja-ur-Rehman Baig
http://pk.linkedin.com/in/shujamughal
Cell: +92 3214207445
Dependencies for running mapreduce
The MapReduce job code I have (a Java app) depends on other libraries. It runs fine when the job is run locally, but when I run it on a true distributed setup it fails on dependencies. Do I have to put all the libraries and property files my application depends on in HADOOP_CLASSPATH for the MapReduce job to run in a cluster?

thanks
venkatesh
Re: HBase Query
HBase with respect to querying should be treated as an (augmented) key/value store, so it doesn't support ad hoc queries. The closest this translates to in HBase would be for you to scan your table for all values between row key 201002 and 201008312359 and then filter on the month, calculate if the returned row is a Friday, that time=1430, and that productinfo=abc. This can possibly scan a lot of rows, taking a lot of time, and cannot be used to populate user-facing UIs.

You could also run it as 3 parallel scans, one for each month, then filter the remaining rows and join the results (in your application) to make it faster.

Good luck!

J-D

On Fri, Aug 27, 2010 at 2:34 PM, Shuja Rehman shujamug...@gmail.com wrote:

Hi I am trying to implement equivalent of this query in Hbase

SELECT value1, value2, value3, (value1+value2) as calculatedValue
FROM myTable
WHERE year=2010 and month in (2,4,6) and day='Friday' and time='14:30' and Product='Abc'

The Hbase table has the row key with YYYYMMDDHHMMSS e.g 20100809023000 and Product information is in ProductInfo column family. Can anybody help me how to proceed with this scenario? Thanks in advance
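The three-scans-plus-filter idea can be sketched without touching the HBase client at all. The code below only computes the row-key ranges and the client-side predicate; it assumes row keys are 14-digit YYYYMMDDHHMMSS strings as in the example key 20100809023000, that the stop row is exclusive, and that the product value has already been read from the ProductInfo family. The function names are invented for illustration.

```python
from datetime import datetime


def month_ranges(year, months):
    """One (start_row, stop_row) pair per month; stop_row is exclusive."""
    ranges = []
    for month in months:
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        ranges.append((f"{year:04d}{month:02d}", f"{next_year:04d}{next_month:02d}"))
    return ranges


def matches(row_key, product, wanted_product="Abc"):
    """Client-side check for: Friday, 14:30, and the wanted product."""
    ts = datetime.strptime(row_key, "%Y%m%d%H%M%S")
    is_friday = ts.weekday() == 4  # Monday is 0, so Friday is 4
    return is_friday and (ts.hour, ts.minute) == (14, 30) and product == wanted_product


print(month_ranges(2010, [2, 4, 6]))
# [('201002', '201003'), ('201004', '201005'), ('201006', '201007')]
```

Each pair would feed one scan, the three scans can run in parallel, and matches is applied to every returned row before the application merges the results.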
Re: RegionServer can't recover after a failure
I can help. I'm a developer on the transactional hbase extension, which you must be using. The issue is that the global transaction log table has not been created yet. You can create it simply by calling HBaseBackedTransactionLogger.createTable() at the time when you are seeding the rest of your tables.

I apologize that the extension as given on GitHub is not yet mature. While it works (HBase 0.21), it is poorly documented and needs more thorough testing. We have recently updated it to work with HBase 0.89.20100726, and it is much more stable and very slightly better documented. We are waiting for an HBase patch submission before we push it to hbase-trx at github.

Thanks,
James Kennedy
Project Manager
Troove Inc.

On 2010-08-26, at 8:16 AM, Andrey Timerbaev wrote:

Dear experts,

Could you kindly suggest how to help the RegionServer complete initialization in the following situation: after a failure of one of the RegionServers, which is running on a dedicated node in an HBase/Hadoop cluster (HBase v.0.20.3), the RegionServer can't initialize the available tables. The region server's log contains this exception:

2010-08-26 18:56:49,073 INFO org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager: Region log has 9 unfinished transactions. Going to the transaction log to resolve
2010-08-26 18:56:49,091 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Cache hit for row in tableName .META.: location server 10.2.146.41:60020, location region name .META.,,1
2010-08-26 18:56:49,178 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening STAT_STARTUPS_TABLE,,1282738349665
java.lang.RuntimeException: Table not created. Call createTable() first
    at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.initTable(HBaseBackedTransactionLogger.java:76)
    at org.apache.hadoop.hbase.client.transactional.HBaseBackedTransactionLogger.<init>(HBaseBackedTransactionLogger.java:69)
    at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getGlobalTransactionLog(THLogRecoveryManager.java:256)
    at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.resolvePendingTransaction(THLogRecoveryManager.java:225)
    at org.apache.hadoop.hbase.regionserver.transactional.THLogRecoveryManager.getCommitsFromLog(THLogRecoveryManager.java:206)
    at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegion.doReconstructionLog(TransactionalRegion.java:145)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:326)
    at org.apache.hadoop.hbase.regionserver.transactional.TransactionalRegionServer.instantiateRegion(TransactionalRegionServer.java:121)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1531)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1451)
    at java.lang.Thread.run(Thread.java:619)

After a look into the HBase source code, I found that the "Table not created. Call createTable() first" message appears if the HBaseBackedTransactionLogger is unable to find the __GLOBAL_TRX_LOG__ table. But I have no idea where the table should be, whether it is critical, or what I should do in this situation.

Any help is appreciated.

Andrey
Re: HBase Query
Xavier recently mentioned some code we use at Mozilla that should help here. It is a unioning scanner that lets you define a list of scan ranges to run for the job. You'd set a prefix for each Friday in the selected months and the range of 143000 through 143060. Then you'd apply a filter on the product. When it runs, it would generate all the scans you need and run them in a single job.

If you want to give it a try, search the mailing list archive. I'll check with Xavier to see if the code can be cleaned up and posted somewhere too.

-Daniel

Shuja Rehman shujamug...@gmail.com wrote:

Hi I am trying to implement equivalent of this query in Hbase

SELECT value1, value2, value3, (value1+value2) as calculatedValue
FROM myTable
WHERE year=2010 and month in (2,4,6) and day='Friday' and time='14:30' and Product='Abc'

The Hbase table has the row key with YYYYMMDDHHMMSS e.g 20100809023000 and Product information is in ProductInfo column family. Can anybody help me how to proceed with this scenario? Thanks in advance
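The list of per-Friday ranges that such a unioning scanner would take as input can be generated ahead of time. This is only a sketch of that range generation, not the Mozilla code: friday_ranges is a hypothetical name, row keys are assumed to be YYYYMMDDHHMMSS strings, and the stop key 143100 is an assumed exclusive bound that covers 14:30:00 through 14:30:59.

```python
from datetime import date, timedelta


def friday_ranges(year, months, start_hms="143000", stop_hms="143100"):
    """Return one (start_row, stop_row) pair per Friday in the given months."""
    ranges = []
    for month in months:
        day = date(year, month, 1)
        # Advance to the first Friday of the month (Monday is 0, Friday is 4).
        day += timedelta(days=(4 - day.weekday()) % 7)
        while day.month == month:
            prefix = day.strftime("%Y%m%d")
            ranges.append((prefix + start_hms, prefix + stop_hms))
            day += timedelta(days=7)
    return ranges


ranges = friday_ranges(2010, [2, 4, 6])
print(len(ranges))  # 13
print(ranges[0])    # ('20100205143000', '20100205143100')
```

For February, April, and June 2010 this yields 13 narrow ranges, so the scans touch only a minute of data per Friday and the product filter handles the rest.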
Initial region loads in hbase..
Hi guys,

A couple of days back, I had posted a problem on regions taking too much time to load when I restart HBase. I have a table that has around 80 K regions on 650 nodes (!). I was checking the logs in the master and I notice that the time it takes from assigning a region to a region server to the point when it recognizes that the region is open in that server takes around 20-30 minutes!

Apart from master being the bottleneck here, can you guys let me know what the other possible cases are as to why this may happen?

Thank you
Vidhya

Below is an example for region with start key 003404803994 where the assignment takes place at 22:59 while the confirmation that it got open came at 23:19...

2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is not valid; serverAddress=, startCode=0 unknown.
2010-08-27 22:59:02,643 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeCreated with state SyncConnected with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeCreated state:SyncConnected path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMasterCreated ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created/updated UNASSIGNED zNode DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeChildrenChanged with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to assign=1, regions to give other servers than this=0, isMetaAssign=false
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created UNASSIGNED zNode DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state M2ZK_REGION_OFFLINE

AND THEN,

2010-08-27 23:18:53,591 INFO org.apache.hadoop.hbase.master.RegionServerOperation: DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. open on b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 23:18:53,591 INFO org.apache.hadoop.hbase.master.RegionServerOperation: Updated row DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in region .META.,,1 with startcode=1282940735627, server=b3130020.yst.yahoo.net:60020
2010-08-27 23:18:53,677 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Deleting ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper as region is open...
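The gap between the "Assigning region" entry and the "open on" entry can be measured directly from the two timestamps. The snippet below is a small sketch that assumes every log line starts with a 23-character timestamp of the form shown in the log above; log_time is an illustrative name, and the two line strings are abbreviated copies of the master log entries.

```python
from datetime import datetime, timedelta


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


assigned = "2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region ..."
opened = "2010-08-27 23:18:53,591 INFO org.apache.hadoop.hbase.master.RegionServerOperation: ... open on ..."

delay = log_time(opened) - log_time(assigned)
print(delay)  # 0:19:50.944000
```

For this region the master waited just under 20 minutes between handing out the assignment and recording the open.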
Re: Initial region loads in hbase..
Hi,

No one has run HBase at that scale that I know of. Perhaps you can poke at the problem some more and come up with a working theory and possibly a code fix?

Good luck!
-ryan

On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote:

Hi guys, A couple of days back, I had posted a problem on regions taking too much time to load when I restart Hbase.. I have a table that has around 80 K regions on 650 nodes (!) ..
Re: Initial region loads in hbase..
In 0.20, open on a regionserver is single-threaded. Could that be it? The server has lots of regions to open and it's taking time?

Is the meta table being beat up? Could this be holding up region opens?

Good luck V,
St.Ack

On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote:

Hi guys, A couple of days back, I had posted a problem on regions taking too much time to load when I restart Hbase.. I have a table that has around 80 K regions on 650 nodes (!) ..
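If opens really are serialized on each regionserver, a rough model of the restart time is the number of regions per server multiplied by the time per open. The sketch below is only that model: the 10 seconds per open is a made-up figure for illustration, not a measurement from this cluster, and the threads parameter is hypothetical (the thread only says that 0.20 uses a single thread).

```python
import math


def time_to_open_all(regions_per_server, seconds_per_open, threads=1):
    """Seconds until the last region on one server is open, assuming equal-cost opens."""
    return math.ceil(regions_per_server / threads) * seconds_per_open


# About 80,000 regions over 650 nodes is roughly 123 regions per server.
regions_per_server = round(80000 / 650)
print(regions_per_server)                        # 123
print(time_to_open_all(regions_per_server, 10))  # 1230 seconds, about 20 minutes
```

Under that assumed open time, a single-threaded server would need about 20 minutes to work through its queue, which is in the same range as the delay reported in the thread.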
Re: Initial region loads in hbase..
Is the master working hard? If so, maybe make this number bigger: property namehbase.regionserver.msginterval/name value1000/value descriptionInterval between messages from the RegionServer to HMaster in milliseconds. Use a high value like 3000 for clusters with more than 10 nodes. Default is 1 second so that HBase seems more 'live'. /description /property All regionservers check in every second by default. When they check in during startup, they'll report status on the regions that are currently being opened... maybe this is flooding the master? For your bigger cluster you should probably up the above number anyways? Are the regions being assigned out slowly? If you tail regionserver log, it'll report OPENING messages... are regions taking a long time to move to OPEN? St.Ack On Fri, Aug 27, 2010 at 5:35 PM, Stack st...@duboce.net wrote: In 0.20, open on a regionserver is single-threaded. Could that be it? The server has lots of regions to open and its taking time? Is the meta table being beat up? Could this be holding up region opens? Good luck V, St.Ack On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: Hi guys, A couple of days back, I had posted a problem on regions taking too much time to load when I restart Hbase.. I have a table that has around 80 K regions on 650 nodes (!) .. I was checking the logs in the master and I notice that the time it takes from assigning a region to a region server to the point when it recognizes that the region is open in that server takes around 20-30 minutes! Apart from master being the bottleneck here, can you guys let me know what the other possible cases are as to why this may happen? Thank you Vidhya Below is an example for region with start key 003404803994 where the assignment takes place at 22:59 while the confirmation that it got open came at 23:19... 
2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is not valid; serverAddress=, startCode=0 unknown.
2010-08-27 22:59:02,643 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeCreated with state SyncConnected with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeCreated state:SyncConnected path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMasterCreated ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created/updated UNASSIGNED zNode DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeChildrenChanged with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to assign=1, regions to give other servers than this=0, isMetaAssign=false
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it
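
For a rough sense of how hbase.regionserver.msginterval changes the check-in traffic arriving at the master, the arithmetic can be sketched as below. This is only a back-of-the-envelope estimate: it assumes every regionserver sends exactly one message per interval, the function name is made up for illustration, and the 650-node figure is taken from Vidhya's post.

```python
def checkins_per_second(num_regionservers: int, msginterval_ms: int) -> float:
    """Approximate check-in messages per second reaching the master.

    Assumes (for illustration only) that each regionserver sends exactly
    one message every msginterval_ms milliseconds.
    """
    if msginterval_ms <= 0:
        raise ValueError("msginterval_ms must be positive")
    return num_regionservers * 1000.0 / msginterval_ms


if __name__ == "__main__":
    # 650 nodes as in the post; 1000 ms is the default, 3000 ms the
    # value suggested in the property description.
    for interval_ms in (1000, 3000):
        rate = checkins_per_second(650, interval_ms)
        print(f"msginterval={interval_ms} ms: about {rate:.0f} check-ins per second")
```

Under that assumption, raising the interval from 1000 to 3000 cuts the check-in rate to a third. Whether the master is actually the bottleneck still has to be confirmed from its logs and load.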
RE: Initial region loads in hbase..
Vidhya,

Could you post a snippet of an RS log during this time? You should be able to see what's happening between when the OPEN message gets there and when the OPEN completes.

Like Stack said, it's probably that it's single-threaded in the version you're using, and with all the file opening, your NN and DNs are under heavy load. Do you see io-wait or anything else jump up across the cluster at this time? You have Ganglia set up on this bad boy?

JG

-Original Message-
From: saint@gmail.com [mailto:saint@gmail.com] On Behalf Of Stack
Sent: Friday, August 27, 2010 5:36 PM
To: user@hbase.apache.org
Subject: Re: Initial region loads in hbase..

In 0.20, open on a regionserver is single-threaded. Could that be it? The server has lots of regions to open and it's taking time? Is the meta table being beat up? Could this be holding up region opens?

Good luck V,
St.Ack

On Fri, Aug 27, 2010 at 5:01 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote:

Hi guys,

A couple of days back, I posted a problem about regions taking too much time to load when I restart HBase. I have a table that has around 80K regions on 650 nodes (!). I was checking the logs in the master and I notice that the time from assigning a region to a region server to the point when the master recognizes that the region is open on that server is around 20-30 minutes! Apart from the master being the bottleneck here, can you guys let me know what other possible causes there are for this?

Thank you
Vidhya

Below is an example for the region with start key 003404803994, where the assignment takes place at 22:59 while the confirmation that it got open came at 23:19...

2010-08-27 22:59:02,642 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. is not valid; serverAddress=, startCode=0 unknown.
2010-08-27 22:59:02,643 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: Creating UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeCreated with state SyncConnected with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeCreated with path /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeCreated state:SyncConnected path:/hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,645 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: b3130520.yst.yahoo.net,b3130560.yst.yahoo.net,b3130600.yst.yahoo.net,b3130640.yst.yahoo.net,b3130680.yst.yahoo.net:/hbase,org.apache.hadoop.hbase.master.HMasterCreated ZNode /hbase/UNASSIGNED/73c0f8fdb8ffbc20b9a239d325932ff1 in ZooKeeper
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created/updated UNASSIGNED zNode DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. in state M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: Got event type [ M2ZK_REGION_OFFLINE ] for region 73c0f8fdb8ffbc20b9a239d325932ff1
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.HMaster: Event NodeChildrenChanged with state SyncConnected with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKMasterAddressWatcher: Got event NodeChildrenChanged with path /hbase/UNASSIGNED
2010-08-27 22:59:02,646 DEBUG org.apache.hadoop.hbase.master.ZKUnassignedWatcher: ZK-EVENT-PROCESS: Got zkEvent NodeChildrenChanged state:SyncConnected path:/hbase/UNASSIGNED
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning for serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993): total nregions to assign=1, regions to give other servers than this=0, isMetaAssign=false
2010-08-27 22:59:02,647 DEBUG org.apache.hadoop.hbase.master.RegionManager: Assigning serverName=b3130020.yst.yahoo.net,60020,1282940735627, load=(requests=0, regions=76, usedHeap=80, maxHeap=7993) 1 regions
2010-08-27 22:59:02,647 INFO org.apache.hadoop.hbase.master.RegionManager: Assigning region DocDB,003404803994,1282947439133.73c0f8fdb8ffbc20b9a239d325932ff1. to b3130020.yst.yahoo.net,60020,1282940735627
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: No need to update UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already exists in state = M2ZK_REGION_OFFLINE
2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.master.RegionManager: Created UNASSIGNED zNode
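
To see how long each region lingers in a master log like the one quoted above, something along these lines might help. It is a minimal sketch, not an HBase tool: it assumes the timestamp layout and the 32-character hex encoded region name seen in the excerpt, and the function name is made up. It only reports the first and last time any log entry mentions a region, which is a rough proxy for the assign-to-open delay rather than an exact measurement.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the quoted master log (assumption).
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) ")
# Encoded region names in the excerpt are 32 lowercase hex characters.
REGION_RE = re.compile(r"\b[0-9a-f]{32}\b")


def region_time_spans(lines):
    """Map each encoded region name to (first_seen, last_seen) timestamps."""
    spans = {}
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if not match:
            continue  # skip wrapped or partial lines without a timestamp
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        for region in set(REGION_RE.findall(line)):
            first, last = spans.get(region, (ts, ts))
            spans[region] = (min(first, ts), max(last, ts))
    return spans


if __name__ == "__main__":
    # Two entries copied from the log excerpt in this thread.
    sample = [
        "2010-08-27 22:59:02,643 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: "
        "Creating UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 in state = M2ZK_REGION_OFFLINE",
        "2010-08-27 22:59:02,653 DEBUG org.apache.hadoop.hbase.zookeeper.ZooKeeperWrapper: "
        "No need to update UNASSIGNED region 73c0f8fdb8ffbc20b9a239d325932ff1 as it already "
        "exists in state = M2ZK_REGION_OFFLINE",
    ]
    for region, (first, last) in region_time_spans(sample).items():
        print(region, last - first)
```

Run over the full master log, the regions with the largest spans would be the ones worth looking up in the corresponding regionserver logs.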
regionserver skew
I have a few questions related to reading from HBase:

1. How can I detect regionserver skew, in other words, one regionserver being hit more than the others? When I look at the master log, it states

   org.apache.hadoop.hbase.master.ServerManager: 3 region servers, 0 dead, average load 23.668

   Does that mean that the load is balanced? And in case it is not, do I need to redesign or reload my HBase table? Any other options?

2. Is it okay to have Stargate running on more than one node in the cluster? I am using Stargate and libcurl to read from HBase, and to speed this up, maybe hitting different Stargate servers could help? Any cons to this?

3. Is there a way I can get more than one version of a row via Stargate? I tried the URL with ?v=2 at the end, but it did not work.

Thanks,
Avani Sharma
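
On question 1, an average on its own cannot show whether one server is busier than the others, so one option is to compare per-regionserver request counts against their mean. The sketch below assumes you have already collected such counts somehow; the server names, the function name, and the 1.5x threshold are all made up for illustration and are not HBase defaults.

```python
from statistics import mean


def find_hot_servers(requests_by_server, threshold=1.5):
    """Return servers whose request count exceeds threshold * mean.

    requests_by_server: dict mapping a server name to its request count.
    threshold: illustrative multiplier, not an HBase setting.
    """
    if not requests_by_server:
        return []
    average = mean(requests_by_server.values())
    if average == 0:
        return []  # an idle cluster has no skew to report
    return sorted(
        server
        for server, requests in requests_by_server.items()
        if requests > threshold * average
    )


if __name__ == "__main__":
    # Hypothetical counts for a three-regionserver cluster.
    counts = {"rs1": 900, "rs2": 100, "rs3": 200}
    print(find_hot_servers(counts))
```

If one server keeps showing up as hot, that usually points at the row key design or at a few busy regions sitting on the same server, which is a separate question from whether the region count is balanced.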