[jira] [Updated] (HBASE-5671) hbase.metrics.showTableName should be true by default
[ https://issues.apache.org/jira/browse/HBASE-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5671: - Fix Version/s: (was: 0.94.1) 0.94.0 hbase.metrics.showTableName should be true by default - Key: HBASE-5671 URL: https://issues.apache.org/jira/browse/HBASE-5671 Project: HBase Issue Type: Improvement Components: metrics Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Critical Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5671_v1.patch HBASE-4768 added per-cf metrics and a new configuration option hbase.metrics.showTableName. We should switch the conf option to true by default, since it is not intuitive (at least to me) to aggregate per-cf across tables by default, and it seems confusing to report on cf's without table names. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5638) Backport to 0.90 and 0.92 - NPE reading ZK config in HBase
[ https://issues.apache.org/jira/browse/HBASE-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5638: - Fix Version/s: (was: 0.94.1) 0.96.0 Backport to 0.90 and 0.92 - NPE reading ZK config in HBase -- Key: HBASE-5638 URL: https://issues.apache.org/jira/browse/HBASE-5638 Project: HBase Issue Type: Sub-task Components: zookeeper Affects Versions: 0.90.6, 0.92.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5633-0.90.patch, HBASE-5633-0.92.patch, HBASE-5638-0.90-v1.patch, HBASE-5638-0.90-v2.patch, HBASE-5638-0.92-v1.patch, HBASE-5638-0.92-v2.patch, HBASE-5638-trunk-v1.patch, HBASE-5638-trunk-v2.patch
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243668#comment-13243668 ] xufeng commented on HBASE-5677: --- We can reproduce this issue on 0.90 with the following steps:
step 1: start a cluster and create a table that has many regions.
step 2: disable the table created in step 1 from the shell.
step 3: kill the active master.
step 4: the backup master becomes the active one; while the region servers are checking in with the new master, enable the table from the shell.
result: the duplicate open-handling problem occurs.
I think the master should not provide service until it has completed its initialization. We can add a method to HMasterInterface like:
{noformat}
public boolean isMasterAvailable();

// the master is running and it can provide service
public boolean isMasterAvailable() {
  return !isStopped() && isActiveMaster() && isInitialized();
}
{noformat}
When the client calls getMaster, we can check it. Please give me your suggestions, thanks.
The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng If a region is assigned while the master is still initializing (before processFailover runs), the region will be open-handled twice, because the unassigned node in ZooKeeper is handled again in AssignmentManager#processFailover(). This leaves the region in RIT, so the master never runs the balancer.
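The availability check proposed in the HBASE-5677 comment can be tried out in isolation. The following is only a minimal sketch, not HBase code: the MasterState class and its three flags are hypothetical stand-ins for HMaster's isStopped(), isActiveMaster() and isInitialized(), and the setter names are invented for the illustration.

```java
/** Hypothetical stand-in for the master's lifecycle flags; not the real HMaster. */
final class MasterState {
    private volatile boolean stopped;
    private volatile boolean activeMaster;
    private volatile boolean initialized;

    /** This process won the election and is now the active master. */
    void becomeActive() { activeMaster = true; }

    /** Failover processing and the rest of start-up have completed. */
    void finishInitialization() { initialized = true; }

    /** The master has been asked to stop. */
    void stop() { stopped = true; }

    /** True only when the master is running, active, and fully initialized. */
    boolean isMasterAvailable() {
        return !stopped && activeMaster && initialized;
    }
}
```

With a check like this, a request such as "enable table" that arrives between becomeActive() and finishInitialization() would be refused instead of racing with failover processing.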
[jira] [Updated] (HBASE-5671) hbase.metrics.showTableName should be true by default
[ https://issues.apache.org/jira/browse/HBASE-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5671: - Fix Version/s: (was: 0.96.0) hbase.metrics.showTableName should be true by default - Key: HBASE-5671 URL: https://issues.apache.org/jira/browse/HBASE-5671 Project: HBase Issue Type: Improvement Components: metrics Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Critical Fix For: 0.94.0 Attachments: HBASE-5671_v1.patch
[jira] [Updated] (HBASE-5671) hbase.metrics.showTableName should be true by default
[ https://issues.apache.org/jira/browse/HBASE-5671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5671: - Fix Version/s: 0.96.0 hbase.metrics.showTableName should be true by default - Key: HBASE-5671 URL: https://issues.apache.org/jira/browse/HBASE-5671 Project: HBase Issue Type: Improvement Components: metrics Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Critical Fix For: 0.94.0, 0.96.0 Attachments: HBASE-5671_v1.patch
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: HBASE-5666-v0.patch Patch attached to retry only on HRegionServer, using hbase.basenode.avail.timeout as the conf key. RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v0.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log, zk-exists-refactor-v0.patch
I have a script that starts HBase and a couple of region servers in distributed mode (hbase.cluster.distributed = true):
{code}
$HBASE_HOME/bin/start-hbase.sh
$HBASE_HOME/bin/local-regionservers.sh start 1 2 3
{code}
but the region servers are not able to start. It seems that during RS startup the znode is not yet available, and HRegionServer.initializeZooKeeper() checks only once whether the base node is available.
{code}
2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master.
2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS.
java.io.IOException: Received the shutdown message while waiting.
	at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
	at java.lang.Thread.run(Thread.java:662)
{code}
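The retry behaviour asked for in HBASE-5666 amounts to polling an existence check until it succeeds or a timeout expires. Below is a rough, self-contained sketch of that idea. It is not the attached patch: BaseNodeWaiter, waitFor and the poll interval are hypothetical names, the check is passed in as a plain BooleanSupplier rather than a real ZooKeeper call, and the timeout would presumably come from a setting such as hbase.basenode.avail.timeout.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper; the real patch works against ZooKeeper directly. */
final class BaseNodeWaiter {
    /**
     * Polls the check until it returns true or timeoutMs has elapsed.
     * Returns true if the check succeeded in time, false on timeout or interrupt.
     */
    static boolean waitFor(BooleanSupplier check, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            long remainingNs = deadline - System.nanoTime();
            if (remainingNs <= 0) {
                return false;
            }
            try {
                // Sleep for the poll interval, but never past the deadline.
                Thread.sleep(Math.min(pollMs, Math.max(1L, remainingNs / 1_000_000L)));
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A region server using something like this would abort only when waitFor returns false, instead of aborting on the first failed check.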
[jira] [Commented] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243673#comment-13243673 ] Hudson commented on HBASE-5213: --- Integrated in HBase-0.92 #348 (See [https://builds.apache.org/job/HBase-0.92/348/]) HBASE-5213 hbase master stop does not bring down backup masters (Gregory) (Revision 1308012) Result = SUCCESS jmhsieh : Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/master/TestMasterShutdown.java
hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5213-v0-trunk.patch, HBASE-5213-v1-trunk.patch, HBASE-5213-v2-90.patch, HBASE-5213-v2-92.patch, HBASE-5213-v2-trunk.patch Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 addressed this; perhaps there was a regression?
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Status: Patch Available (was: Open) RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v0.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log, zk-exists-refactor-v0.patch
[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243675#comment-13243675 ] chunhui shen commented on HBASE-5689: -
bq. Is a TreeMap needed above? We're just remembering the mapping, right?
I first used a ConcurrentHashMap, but keys for the same region name mapped to different values because the keys are byte[] (arrays are compared by identity, not by content).
Skipping RecoveredEdits may cause data loss --- Key: HBASE-5689 URL: https://issues.apache.org/jira/browse/HBASE-5689 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.94.0 Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch
Let's look at the following scenario:
1. The region is on server A.
2. Put KV(r1-v1) to the region.
3. Move the region from server A to server B.
4. Put KV(r2-v2) to the region.
5. Move the region from server B to server A.
6. Put KV(r3-v3) to the region.
7. kill -9 server B and start it.
8. kill -9 server A and start it.
9. Scan the region: we get only two KVs (r1-v1, r2-v2); the third KV (r3-v3) is lost.
Let's analyse the scenario above from the code:
1. The edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same hlog file on server A.
2. When we split server B's hlog file in ServerShutdownHandler, we create one RecoveredEdits file f1 for the region.
3. When we split server A's hlog file in ServerShutdownHandler, we create another RecoveredEdits file f2 for the region.
4. However, RecoveredEdits file f2 will be skipped when the region is initialized, in HRegion#replayRecoveredEditsIfAny:
{code}
for (Path edits: files) {
  if (edits == null || !this.fs.exists(edits)) {
    LOG.warn("Null or non-existent edits file: " + edits);
    continue;
  }
  if (isZeroLengthThenDelete(this.fs, edits)) continue;

  if (checkSafeToSkip) {
    Path higher = files.higher(edits);
    long maxSeqId = Long.MAX_VALUE;
    if (higher != null) {
      // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
      String fileName = higher.getName();
      maxSeqId = Math.abs(Long.parseLong(fileName));
    }
    if (maxSeqId <= minSeqId) {
      String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
      LOG.debug(msg);
      continue;
    } else {
      checkSafeToSkip = false;
    }
  }
  // ... (rest of the loop body is not shown in the report)
}
{code}
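The byte[] remark in the HBASE-5689 comment is easy to reproduce outside HBase. The sketch below is illustrative only: ByteArrayKeys and its methods are hypothetical names, and the hand-written comparator is an assumption standing in for a content-based one such as HBase's Bytes.BYTES_COMPARATOR. It shows a hash-based map missing an equal-content key while a TreeMap with a content comparator finds it.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo class; not part of HBase. */
final class ByteArrayKeys {
    /** Unsigned lexicographic order over array contents (illustrative comparator). */
    static final Comparator<byte[]> LEXICOGRAPHIC = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    };

    /** Arrays inherit identity-based equals/hashCode, so only the same instance matches. */
    static boolean hashMapFinds(byte[] stored, byte[] probe) {
        Map<byte[], Long> m = new ConcurrentHashMap<>();
        m.put(stored, 1L);
        return m.containsKey(probe);
    }

    /** A TreeMap consults only the comparator, so equal contents match. */
    static boolean treeMapFinds(byte[] stored, byte[] probe) {
        Map<byte[], Long> m = new TreeMap<>(LEXICOGRAPHIC);
        m.put(stored, 1L);
        return m.containsKey(probe);
    }
}
```

This is presumably why a sorted map keyed by region name is the usual choice in this code base, as opposed to wrapping every key in a String or a ByteBuffer.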
[jira] [Created] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5693: --- Attachment: 5693.v1.patch When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch
[jira] [Updated] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nkeywal updated HBASE-5693: --- Status: Patch Available (was: Open) When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: (was: zk-exists-refactor-v0.patch) RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: (was: HBASE-5666-v0.patch) RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: HBASE-5666-v1.patch RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243743#comment-13243743 ] Ted Yu commented on HBASE-5689: --- Using a TreeMap is common practice. Please attach test suite result - Hadoop QA is not working. Skipping RecoveredEdits may cause data loss --- Key: HBASE-5689 URL: https://issues.apache.org/jira/browse/HBASE-5689 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.0 Reporter: chunhui shen Assignee: chunhui shen Priority: Critical Fix For: 0.94.0 Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch
[jira] [Commented] (HBASE-4348) Add metrics for regions in transition
[ https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243745#comment-13243745 ] jirapos...@reviews.apache.org commented on HBASE-4348: -- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4402/#review6604
src/main/java/org/apache/hadoop/hbase/HConstants.java https://reviews.apache.org/r/4402/#comment14269
I think the trailing '.time' isn't needed. Take a look at existing config parameter names involving threshold:
{code}
this.thresholdIdleConnections = conf.getInt("ipc.client.idlethreshold", 4000);
src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
"putsortreducer.row.threshold", 2L * (1<<30));
src/main/java/org/apache/hadoop/hbase/mapreduce/PutSortReducer.java
{code}
src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/4402/#comment14270
Please add curly braces around the following line.
src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java https://reviews.apache.org/r/4402/#comment14271
Lift this line to line 2733.
src/main/java/org/apache/hadoop/hbase/master/HMaster.java https://reviews.apache.org/r/4402/#comment14272
'out' isn't needed here. It would be nice to combine this sentence into the comment for this method.
- Ted
On 2012-03-30 05:21:12, Himanshu Vashishtha wrote:
bq. This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4402/
bq. (Updated 2012-03-30 05:21:12)
bq. Review request for hbase.
bq. Summary
bq. This patch is for adding Region in transition metrics to the HMaster metrics system. It also adds these metrics in the master ui, in the Region in transition section. I have attached the proposed new format in the jira 4348.
bq. This addresses bug HBase-4348. https://issues.apache.org/jira/browse/HBase-4348
bq. Diffs
bq. src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691
bq. src/main/java/org/apache/hadoop/hbase/HConstants.java 21ac4ba
bq. src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15
bq. src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9bd4ace
bq. src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52
bq. src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java 91dce36
bq. Diff: https://reviews.apache.org/r/4402/diff
bq. Testing
bq. Ran on a 5 node cluster and kill region servers randomly to observe the changes in the RIT metrics as emitted out by the Master's mxbean;
bq. mvn test passes without any failure.
bq. Thanks,
bq. Himanshu
Add metrics for regions in transition - Key: HBASE-4348 URL: https://issues.apache.org/jira/browse/HBASE-4348 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Himanshu Vashishtha Priority: Minor Labels: noob Fix For: 0.96.0 Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, RITs.png, RegionInTransitions2.png, metrics-v2.patch
The following metrics would be useful for monitoring the master:
- the number of regions in transition
- the number of regions in transition that have been in transition for more than a minute
- how many seconds has the oldest region-in-transition been in transition
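The three metrics requested in HBASE-4348 can all be derived from one pass over the regions currently in transition. The following is a minimal sketch under assumptions, not the code under review: RitMetrics is a hypothetical class, and it assumes the master can supply a map from region name to the time (in milliseconds) at which that region entered transition.

```java
import java.util.Map;

/** Hypothetical snapshot of regions-in-transition metrics; not the reviewed patch. */
final class RitMetrics {
    final int total;          // regions currently in transition
    final int overThreshold;  // regions in transition for longer than the threshold
    final long oldestAgeMs;   // age of the oldest region in transition, 0 if none

    private RitMetrics(int total, int overThreshold, long oldestAgeMs) {
        this.total = total;
        this.overThreshold = overThreshold;
        this.oldestAgeMs = oldestAgeMs;
    }

    /** Computes all three values in a single pass over the start timestamps. */
    static RitMetrics compute(Map<String, Long> ritStartMs, long nowMs, long thresholdMs) {
        int over = 0;
        long oldest = 0L;
        for (long start : ritStartMs.values()) {
            long age = nowMs - start;
            if (age > thresholdMs) {
                over++;
            }
            if (age > oldest) {
                oldest = age;
            }
        }
        return new RitMetrics(ritStartMs.size(), over, oldest);
    }
}
```

For the one-minute case in the description, thresholdMs would be 60000; the review comment about config naming concerns how that threshold is exposed, which this sketch leaves out.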
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243746#comment-13243746 ] Ted Yu commented on HBASE-5693: --- CreateTableHandler isn't initializing the regions. Who will initialize them? When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5677) The master never does balance because duplicate openhandled the one region
[ https://issues.apache.org/jira/browse/HBASE-5677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243747#comment-13243747 ] Ted Yu commented on HBASE-5677: --- Interesting. Chunhui proposed safe mode for Master in HBASE-5270. See https://issues.apache.org/jira/browse/HBASE-5270?focusedCommentId=13214394&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13214394 Can you verify that this issue has been fixed in 0.92.2? Thanks The master never does balance because duplicate openhandled the one region -- Key: HBASE-5677 URL: https://issues.apache.org/jira/browse/HBASE-5677 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90 Reporter: xufeng Assignee: xufeng If a region is assigned while the master is initializing (before processFailover runs), the open of that region is handled twice, because the unassigned node in ZooKeeper is handled again in AssignmentManager#processFailover(). This leaves the region in RIT, so the master never runs the balancer.
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243751#comment-13243751 ] nkeywal commented on HBASE-5693: I didn't look very far in the code. CreateTableHandler is executed on the master. It does not need to initialize the memstore and so on. The underlying method is called from the region server as well, and there the initialization code is called. Maybe there is something more complex I didn't see, but at least all the unit tests went well. On Sun, Apr 1, 2012 at 5:28 PM, Ted Yu (Commented) (JIRA) When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243762#comment-13243762 ] Ted Yu commented on HBASE-5693: --- It is called from OpenRegionHandler.openRegion(). I once made some threads daemon threads; the change passed unit tests but resulted in master and region server failing to start. Testing on a real cluster is desirable. When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243765#comment-13243765 ] Ted Yu commented on HBASE-5693: --- @N: Can you rebase the patch for trunk? {code} Hunk #3 FAILED at 3613. 1 out of 3 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java.rej {code} When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243769#comment-13243769 ] jirapos...@reviews.apache.org commented on HBASE-5688: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4600/#review6605 --- Looks good to me. - Jimmy On 2012-04-01 00:18:54, Michael Stack wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4600/ bq. --- bq. bq. (Updated 2012-04-01 00:18:54) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Changes the content of the root location znode, root-region-server, to be bq. four magic bytes ('PBUF') followed by a protobuf message that holds the bq. ServerName of the server currently hosting root. bq. bq. D src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java bq.Removed. Had two methods, one to add root-region-server znode and another bq.to removed it. Rather, put these methods in RootRegionTracker. It bq.tracks root-region-server znode. Having all to do w/ root-region-server bq.is more cohesive. Also makes it so can encapsulate in one class bq.all to do w/ create, delete, and reading of root-region-server. bq.We also want to purge the catalog package (See note at head of bq.CatalogTracker). bq. M src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java bq. M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java bq. M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java bq.Get root region location from RootRegionTracker rather than from RootLocationEditor. bq. A src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java bq.Utility to do w/ protobuf handling. Has methods to help prefixing bq.and stripping from serialized protobuf messages some 'magic'. bq. A src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java bq.PB generated. bq. 
M src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java bq.Use new RootRegionTracker method for getting content of znode rather bq.than do it all here (going via RootRegionTracker, we can keep how bq.the znode content is serialized private to the RootRegionTracker class. bq. M src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java bq.Has the methods that used to be in RootLocationEditor plus a new bq. bq. bq. This addresses bug hbase-5688. bq. https://issues.apache.org/jira/browse/hbase-5688 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java c90864a bq.src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463 bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15 bq.src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java PRE-CREATION bq. src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 9c215b4 bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 2f05005 bq.src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 33e4e71 bq.src/main/protobuf/ZooKeeper.proto PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 533b2bf bq. src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java fe37156 bq.src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java 2132036 bq. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4600/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Michael bq. bq. 
Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.96.0 Attachments: 5688.txt, 5688v4.txt Move the root-region-server znode content from the versioned bytes that ServerName.getVersionedBytes outputs to instead be pb. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on
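The review summary above describes the new znode content as four magic bytes ('PBUF') followed by a serialized protobuf message. As a rough illustration of that framing only, here is a minimal sketch. The class and method names (PbMagicSketch, prependMagic, hasMagic, stripMagic) are hypothetical and are not taken from the patch's ProtobufUtil, and the payload is treated as opaque bytes rather than a real protobuf message.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration; NOT the ProtobufUtil class from the patch. */
final class PbMagicSketch {
    // The four magic bytes named in the review summary.
    static final byte[] PB_MAGIC = "PBUF".getBytes(StandardCharsets.US_ASCII);

    /** Returns a new array holding the magic bytes followed by the payload. */
    static byte[] prependMagic(byte[] payload) {
        byte[] out = new byte[PB_MAGIC.length + payload.length];
        System.arraycopy(PB_MAGIC, 0, out, 0, PB_MAGIC.length);
        System.arraycopy(payload, 0, out, PB_MAGIC.length, payload.length);
        return out;
    }

    /** True if the data is long enough and starts with the magic bytes. */
    static boolean hasMagic(byte[] data) {
        if (data == null || data.length < PB_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PB_MAGIC.length; i++) {
            if (data[i] != PB_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the payload without the magic prefix, or throws if the prefix is absent. */
    static byte[] stripMagic(byte[] data) {
        if (!hasMagic(data)) {
            throw new IllegalArgumentException("missing PBUF magic prefix");
        }
        return Arrays.copyOfRange(data, PB_MAGIC.length, data.length);
    }

    public static void main(String[] args) {
        // Stand-in payload; a real znode would carry a serialized protobuf message here.
        byte[] payload = "example-payload".getBytes(StandardCharsets.UTF_8);
        byte[] znodeContent = prependMagic(payload);
        System.out.println(hasMagic(znodeContent));
        System.out.println(new String(stripMagic(znodeContent), StandardCharsets.UTF_8));
    }
}
```

A reader of the znode can use a check like hasMagic to tell the new format from older content before attempting to parse it, which is presumably why the prefix exists; that motivation is an inference, not something the review states.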
[jira] [Commented] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243771#comment-13243771 ] Matteo Bertozzi commented on HBASE-5665: Can we also add a couple of methods to the region like isSplittable() and isAvailable() {code} boolean isAvailable() { return !isClosed() && !isClosing(); } boolean isSplittable() { return isAvailable() && !hasReferences(); } {code} just to avoid similar problems in future... For example, in HRegionServer both getMostLoadedRegions() and closeUserRegions() do the same isAvailable() check... Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Attachments: HBASE-5665-0.92.patch Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster), unrecoverably. The regionserver doing the split dies and the master will get into an infinite loop trying to assign regions that seem to have the files missing from HDFS. The table can be disabled once. Upon trying to re-enable it, it will remain in an intermediary state forever. I was able to reproduce this on a smaller table consistently. {code} hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'} hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"} {code} Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) will reproduce the issue almost instantly and consistently. {code} 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META 2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..
compaction_queue=(0:1), split_queue=10 2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.init(DFSClient.java:1813) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456) at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1008) at org.apache.hadoop.hbase.io.HalfStoreFileReader.init(HalfStoreFileReader.java:65) at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467) at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548) at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284) at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:221) at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229) at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504) at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484) ... 1 more 2012-03-28 10:57:16,345 FATAL
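The isAvailable() and isSplittable() helpers suggested in the HBASE-5665 comment above can be sketched as follows. This is only an illustration: RegionStateSketch is a hypothetical stand-in, not the real HRegion class, and its three flags simply mirror the isClosed(), isClosing() and hasReferences() calls named in the comment.

```java
/** Hypothetical stand-in for a region's state; NOT the real HRegion class. */
final class RegionStateSketch {
    private final boolean closed;
    private final boolean closing;
    private final boolean references;

    RegionStateSketch(boolean closed, boolean closing, boolean references) {
        this.closed = closed;
        this.closing = closing;
        this.references = references;
    }

    boolean isClosed() { return closed; }
    boolean isClosing() { return closing; }
    boolean hasReferences() { return references; }

    /** A region is available when it is neither closed nor in the process of closing. */
    boolean isAvailable() {
        return !isClosed() && !isClosing();
    }

    /** A region is splittable when it is available and holds no reference files. */
    boolean isSplittable() {
        return isAvailable() && !hasReferences();
    }

    public static void main(String[] args) {
        // An open region that still holds references: available, but not splittable.
        RegionStateSketch daughter = new RegionStateSketch(false, false, true);
        System.out.println(daughter.isAvailable());
        System.out.println(daughter.isSplittable());
    }
}
```

Putting both predicates in one place means callers such as the split request path and the region server's close logic cannot drift apart on what "available" means, which appears to be the point of the suggestion.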
[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243774#comment-13243774 ] jirapos...@reviews.apache.org commented on HBASE-5688: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4600/#review6606 --- src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java https://reviews.apache.org/r/4600/#comment14273 I think prefixedWithPBMagic would be a better name for this method. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java https://reviews.apache.org/r/4600/#comment14274 Javadoc would be desirable. src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java https://reviews.apache.org/r/4600/#comment14275 White space. - Ted On 2012-04-01 00:18:54, Michael Stack wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4600/ bq. --- bq. bq. (Updated 2012-04-01 00:18:54) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. Changes the content of the root location znode, root-region-server, to be bq. four magic bytes ('PBUF') followed by a protobuf message that holds the bq. ServerName of the server currently hosting root. bq. bq. D src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java bq.Removed. Had two methods, one to add root-region-server znode and another bq.to removed it. Rather, put these methods in RootRegionTracker. It bq.tracks root-region-server znode. Having all to do w/ root-region-server bq.is more cohesive. Also makes it so can encapsulate in one class bq.all to do w/ create, delete, and reading of root-region-server. bq.We also want to purge the catalog package (See note at head of bq.CatalogTracker). bq. M src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java bq. M src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java bq. 
M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java bq.Get root region location from RootRegionTracker rather than from RootLocationEditor. bq. A src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java bq.Utility to do w/ protobuf handling. Has methods to help prefixing bq.and stripping from serialized protobuf messages some 'magic'. bq. A src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java bq.PB generated. bq. M src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java bq.Use new RootRegionTracker method for getting content of znode rather bq.than do it all here (going via RootRegionTracker, we can keep how bq.the znode content is serialized private to the RootRegionTracker class. bq. M src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java bq.Has the methods that used to be in RootLocationEditor plus a new bq. bq. bq. This addresses bug hbase-5688. bq. https://issues.apache.org/jira/browse/hbase-5688 bq. bq. bq. Diffs bq. - bq. bq.src/main/java/org/apache/hadoop/hbase/catalog/RootLocationEditor.java c90864a bq.src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java b2a5463 bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15 bq.src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java PRE-CREATION bq. src/main/java/org/apache/hadoop/hbase/protobuf/generated/ZooKeeperProtos.java PRE-CREATION bq.src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 9c215b4 bq.src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 2f05005 bq.src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java 33e4e71 bq.src/main/protobuf/ZooKeeper.proto PRE-CREATION bq.src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 533b2bf bq. src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java fe37156 bq.src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java 2132036 bq. 
src/test/java/org/apache/hadoop/hbase/zookeeper/TestRootRegionTracker.java PRE-CREATION bq. bq. Diff: https://reviews.apache.org/r/4600/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Michael bq. bq. Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter:
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243775#comment-13243775 ] nkeywal commented on HBASE-5693: Ok, I will do that + a test on a real cluster. On Sun, Apr 1, 2012 at 6:12 PM, Ted Yu (Commented) (JIRA) When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4348) Add metrics for regions in transition
[ https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243798#comment-13243798 ] jirapos...@reviews.apache.org commented on HBASE-4348: -- bq. On 2012-04-01 15:13:59, Ted Yu wrote: bq. src/main/java/org/apache/hadoop/hbase/HConstants.java, line 655 bq. https://reviews.apache.org/r/4402/diff/6/?file=97548#file97548line655 bq. bq. I think the trailing '.time' isn't needed. Take a look at existing config parameter names involving threshold: bq. {code} bq. this.thresholdIdleConnections = conf.getInt("ipc.client.idlethreshold", 4000); bq. src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java bq. "putsortreducer.row.threshold", 2L * (1<<30)); bq. src/main/java/org/apache/hadoop/hbase/mapreduce/PutSortReducer.java bq. {code} bq. Actually, looking at the metric name without context, hbase.metrics.rit.threshold makes me think this is the maximum number of regions in transition. With the .time suffix, it makes me think it is the max time for an RIT, which also isn't quite right. If all things are in millis then we probably don't need units, but it doesn't hurt IMO. What do you think of something like: hbase.metrics.rit.refresh.millis, hbase.metrics.rit.refresh.threshold.millis, or hbase.metrics.rit.refresh.threshold? - jmhsieh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4402/#review6604 --- On 2012-03-30 05:21:12, Himanshu Vashishtha wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4402/ bq. --- bq. bq. (Updated 2012-03-30 05:21:12) bq. bq. bq. Review request for hbase. bq. bq. bq. Summary bq. --- bq. bq. This patch is for adding Region in transition metrics to the HMaster metrics system. It also adds these metrics in the master ui, in the Region in transition section. I have attached the proposed new format in the jira 4348. bq. bq. bq. This addresses bug HBase-4348. bq.
https://issues.apache.org/jira/browse/HBase-4348 bq. bq. bq. Diffs bq. - bq. bq. src/main/jamon/org/apache/hadoop/hbase/tmpl/master/AssignmentManagerStatusTmpl.jamon 0dc0691 bq.src/main/java/org/apache/hadoop/hbase/HConstants.java 21ac4ba bq.src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java 64def15 bq.src/main/java/org/apache/hadoop/hbase/master/HMaster.java 9bd4ace bq.src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetrics.java 83abc52 bq.src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java 91dce36 bq. bq. Diff: https://reviews.apache.org/r/4402/diff bq. bq. bq. Testing bq. --- bq. bq. Ran on a 5 node cluster and kill region servers randomly to observe the changes in the RIT metrics as emitted out by the Master's mxbean; bq. bq. mvn test passes without any failure. bq. bq. bq. Thanks, bq. bq. Himanshu bq. bq. Add metrics for regions in transition - Key: HBASE-4348 URL: https://issues.apache.org/jira/browse/HBASE-4348 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Himanshu Vashishtha Priority: Minor Labels: noob Fix For: 0.96.0 Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, RITs.png, RegionInTransitions2.png, metrics-v2.patch The following metrics would be useful for monitoring the master: - the number of regions in transition - the number of regions in transition that have been in transition for more than a minute - how many seconds has the oldest region-in-transition been in transition -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
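The three metrics requested in the HBASE-4348 description (the number of regions in transition, the number that have been in transition longer than a threshold, and the age of the oldest one) can be computed in a single pass. The sketch below is an assumption-laden illustration, not the patch's code: RitMetricsSketch and its fields are hypothetical names, the input is just a collection of transition start timestamps, and the threshold is passed in as a parameter because the thread has not settled on a configuration key.

```java
import java.util.Arrays;
import java.util.Collection;

/** Hypothetical holder for the three RIT metrics; not the patch's MasterMetrics. */
final class RitMetricsSketch {
    final int ritCount;               // regions currently in transition
    final int ritCountOverThreshold;  // regions in transition longer than the threshold
    final long oldestRitAgeMillis;    // age of the oldest region in transition, 0 if none

    private RitMetricsSketch(int ritCount, int ritCountOverThreshold, long oldestRitAgeMillis) {
        this.ritCount = ritCount;
        this.ritCountOverThreshold = ritCountOverThreshold;
        this.oldestRitAgeMillis = oldestRitAgeMillis;
    }

    /**
     * @param ritStartTimesMillis time at which each region entered transition
     * @param nowMillis           current time, same clock as the start times
     * @param thresholdMillis     age above which a region counts as "over threshold"
     */
    static RitMetricsSketch compute(Collection<Long> ritStartTimesMillis,
                                    long nowMillis, long thresholdMillis) {
        int count = 0;
        int overThreshold = 0;
        long oldestAge = 0L;
        for (long start : ritStartTimesMillis) {
            long age = nowMillis - start;
            count++;
            if (age > thresholdMillis) {
                overThreshold++;
            }
            oldestAge = Math.max(oldestAge, age);
        }
        return new RitMetricsSketch(count, overThreshold, oldestAge);
    }

    public static void main(String[] args) {
        // Three regions entered transition 99 s, 50 s and 1 s before "now"; threshold is 60 s.
        RitMetricsSketch m = compute(Arrays.asList(1_000L, 50_000L, 99_000L), 100_000L, 60_000L);
        System.out.println(m.ritCount + " " + m.ritCountOverThreshold + " " + m.oldestRitAgeMillis);
    }
}
```

Keeping the threshold as a plain millisecond parameter sidesteps the naming question raised in the review; whichever key is chosen would only affect where the value is read from.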
[jira] [Commented] (HBASE-4348) Add metrics for regions in transition
[ https://issues.apache.org/jira/browse/HBASE-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243801#comment-13243801 ] Jonathan Hsieh commented on HBASE-4348: --- @Otis I generally like the policy of setting versions on commit, or having a release manager set it if they decide it is necessary for a release. Add metrics for regions in transition - Key: HBASE-4348 URL: https://issues.apache.org/jira/browse/HBASE-4348 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Himanshu Vashishtha Priority: Minor Labels: noob Fix For: 0.96.0 Attachments: 4348-metrics-v3.patch, 4348-v1.patch, 4348-v2.patch, RITs.png, RegionInTransitions2.png, metrics-v2.patch The following metrics would be useful for monitoring the master: - the number of regions in transition - the number of regions in transition that have been in transition for more than a minute - how many seconds has the oldest region-in-transition been in transition -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243836#comment-13243836 ] Hadoop QA commented on HBASE-5693: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520822/5693.v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1358//console This message is automatically generated. When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5665: --- Attachment: HBASE-5665-trunk.patch Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Attachments: HBASE-5665-0.92.patch, HBASE-5665-trunk.patch Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster), unrecoverably. The regionserver doing the split dies and the master will get into an infinite loop trying to assign regions that seem to have the files missing from HDFS. The table can be disabled once. Upon trying to re-enable it, it will remain in an intermediary state forever. I was able to reproduce this on a smaller table consistently. {code} hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'} hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"} {code} Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) will reproduce the issue almost instantly and consistently. {code} 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META 2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1..
compaction_queue=(0:1), split_queue=10 2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.init(DFSClient.java:1813) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456) at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1008) at org.apache.hadoop.hbase.io.HalfStoreFileReader.init(HalfStoreFileReader.java:65) at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467) at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548) at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284) at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:221) at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229) at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504) at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484) ... 1 more 2012-03-28 10:57:16,345 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ld2,60020,1332957343833: Abort; we got an error after point-of-no-return {code} http://hastebin.com/diqinibajo.avrasm later edit: (I'm using the last 4 characters from each string) Region 94e3 has storefile 7237. Region 94e3 gets split into daughters a: ffa1 and b: eee1. Daughter region ffa1 gets split into daughters a: 3124 and b: dc77
[jira] [Updated] (HBASE-4393) Implement a canary monitoring program
[ https://issues.apache.org/jira/browse/HBASE-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-4393: --- Status: Patch Available (was: Open) Implement a canary monitoring program - Key: HBASE-4393 URL: https://issues.apache.org/jira/browse/HBASE-4393 Project: HBase Issue Type: New Feature Components: monitoring Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Matteo Bertozzi Attachments: Canary-v0.java, HBaseCanary.java This JIRA is to implement a standalone program that can be used to do canary monitoring of a running HBase cluster. This program would gather a list of the regions in the cluster, then iterate over them doing lightweight operations (eg short scans) to provide metrics about latency as well as alert on availability issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
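The canary loop described above (list the regions, run a lightweight operation against each, record latency and availability) could be sketched roughly as follows. This is only an illustration under stated assumptions: `CanarySketch`, `ProbeResult`, and `sweep` are hypothetical names that do not come from the attached Canary-v0.java or HBaseCanary.java, and the probe is a plain callback standing in for a short scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical canary sketch; none of these names are taken from the attached files. */
final class CanarySketch {

    /** Outcome of probing a single region. */
    static final class ProbeResult {
        final String region;
        final long latencyNanos;
        final boolean available;

        ProbeResult(String region, long latencyNanos, boolean available) {
            this.region = region;
            this.latencyNanos = latencyNanos;
            this.available = available;
        }
    }

    /**
     * Runs the probe once per region, timing each call. A probe that throws
     * marks the region as unavailable instead of stopping the sweep.
     */
    static List<ProbeResult> sweep(List<String> regions, Consumer<String> probe) {
        List<ProbeResult> results = new ArrayList<>();
        for (String region : regions) {
            long start = System.nanoTime();
            boolean ok = true;
            try {
                probe.accept(region); // assumption: a short scan in a real deployment
            } catch (RuntimeException e) {
                ok = false;
            }
            results.add(new ProbeResult(region, System.nanoTime() - start, ok));
        }
        return results;
    }

    public static void main(String[] args) {
        // Region names here are made up for the demonstration.
        List<String> regions = Arrays.asList("region-a", "region-b");
        for (ProbeResult r : sweep(regions, name -> { /* no-op probe */ })) {
            System.out.println(r.region + " available=" + r.available
                + " latencyNanos=" + r.latencyNanos);
        }
    }
}
```

Catching the failure per region keeps one unavailable region from hiding the state of the rest, which matters if the tool is meant to alert on availability.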
[jira] [Commented] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243842#comment-13243842 ] Ted Yu commented on HBASE-5665: --- HBASE-5665-trunk.patch looks good. Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Attachments: HBASE-5665-0.92.patch, HBASE-5665-trunk.patch Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster) unrecoverably. The regionserver doing the split dies and the master gets into an infinite loop trying to assign regions whose files seem to be missing from HDFS. The table can be disabled once. Upon trying to re-enable it, it will remain in an intermediate state forever. I was able to reproduce this on a smaller table consistently. {code} hbase(main):030:0> (0..1).each{|x| put 't1', "#{x}", 'f1:t', 'dd'} hbase(main):030:0> (0..1000).each{|x| split 't1', "#{x*10}"} {code} Running overlapping splits in parallel (e.g. "#{x*10+1}", "#{x*10+2}"... ) will reproduce the issue almost instantly and consistently. {code} 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META 2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1.. 
compaction_queue=(0:1), split_queue=10 2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.init(DFSClient.java:1813) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544) at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456) at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1008) at org.apache.hadoop.hbase.io.HalfStoreFileReader.init(HalfStoreFileReader.java:65) at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467) at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548) at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:284) at org.apache.hadoop.hbase.regionserver.Store.init(Store.java:221) at 
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2511) at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:450) at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3229) at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughterRegion(SplitTransaction.java:504) at org.apache.hadoop.hbase.regionserver.SplitTransaction$DaughterOpener.run(SplitTransaction.java:484) ... 1 more 2012-03-28 10:57:16,345 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server ld2,60020,1332957343833: Abort; we got an error after point-of-no-return {code} http://hastebin.com/diqinibajo.avrasm later edit: (I'm using the last 4 characters from each string) Region 94e3 has storefile 7237 Region 94e3 gets splited in daughters a: ffa1 and b: eee1 Daughter region ffa1
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243845#comment-13243845 ] Ted Yu commented on HBASE-5666: --- {code} +if (keeperEx != null) + throw keeperEx; {code} Please either lift the throw to the same line as the if or add curly braces. {code} +checkExists(zk, parentZNode, maxTimeMs); +LOG.info("Parent znode exists: " + parentZNode); {code} If checkExists() returns -1, would the log statement still be true? RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I have a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start. It seems that during RS start the znode is not yet available, and HRegionServer.initializeZooKeeper() checks only once whether the base node is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. 
at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code}
[jira] [Resolved] (HBASE-5681) Split Region crash if region is still offline after a previous split
[ https://issues.apache.org/jira/browse/HBASE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi resolved HBASE-5681. Resolution: Duplicate Duplicate of HBASE-5665, trying to split the parent region, hasReferences() is true and split shouldn't be done. Split Region crash if region is still offline after a previous split Key: HBASE-5681 URL: https://issues.apache.org/jira/browse/HBASE-5681 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Affects Versions: 0.92.1, 0.96.0, 0.94.1 Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Priority: Minor Attachments: logs0-HBASE-5681.tar.bz2, logs1-HBASE-5681.tar.bz2 I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) due to HBASE-5666 I need a sleep to ensure that rs are up. {code} $HBASE_HOME/bin/start-hbase.sh sleep 5 # bug HBASE-5666 rs doesn't retry if znode is not available. $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} Once hbase is started I run an hbase shell script file (see below) everything is fine till last split operation. 
{code} # $HBASE_HOME/bin/hbase shell test.hbase # test.hbase create 'bugtb-t1', 'tcf11', 'tcf12' create 'bugtb-t2', 'tcf11', 'tcf12' put 'bugtb-t1', '10', 'tcf11:c1', 'a' put 'bugtb-t1', '15', 'tcf11:c2', 'b' put 'bugtb-t1', '20', 'tcf11:c1', 'c' put 'bugtb-t1', '30', 'tcf11:c2', 'd' put 'bugtb-t1', '35', 'tcf11:c1', 'e' put 'bugtb-t1', '40', 'tcf11:c2', 'f' put 'bugtb-t2', '10', 'tcf11:c1', 'a' put 'bugtb-t2', '15', 'tcf11:c2', 'b' put 'bugtb-t2', '20', 'tcf11:c1', 'c' put 'bugtb-t2', '30', 'tcf11:c2', 'd' put 'bugtb-t2', '35', 'tcf11:c1', 'e' put 'bugtb-t2', '40', 'tcf11:c2', 'f' split 'bugtb-t1', '20' split 'bugtb-t2', '20' split 'bugtb-t1', '40' {code} During the last split the region is still offline, and you get an exception (If you sleep a bit before executing the last split, everything is fine) {code} ERROR: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.NotServingRegionException: Region is not online: bugtb-t1,,1333134892936.4e14c2cf4293156d5b099dc3d5c44890. at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3123) at org.apache.hadoop.hbase.regionserver.HRegionServer.splitRegion(HRegionServer.java:2926) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1383) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
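The resolution note for HBASE-5681 says a split should not be attempted while hasReferences() is true for the region. A minimal sketch of such a guard, assuming a hypothetical `RegionState` holder rather than any real HBase class, might look like this (the online check reflects the NotServingRegionException above and is likewise an assumption about where the check would live):

```java
/** Hypothetical split guard; RegionState is a stand-in, not an HBase type. */
final class SplitGuardSketch {

    /** Snapshot of the two facts the guard needs about a region. */
    static final class RegionState {
        final boolean online;
        final boolean hasReferences;

        RegionState(boolean online, boolean hasReferences) {
            this.online = online;
            this.hasReferences = hasReferences;
        }
    }

    /**
     * A region is treated as splittable only when it is online and no longer
     * holds reference files left over from an earlier split.
     */
    static boolean canSplit(RegionState region) {
        return region.online && !region.hasReferences;
    }

    public static void main(String[] args) {
        System.out.println(canSplit(new RegionState(true, false)));
        System.out.println(canSplit(new RegionState(true, true)));
    }
}
```

Rejecting the request up front is one option; a caller could equally retry after a pause, which is what the sleep workaround in the script amounts to.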
[jira] [Commented] (HBASE-5693) When creating a region, the master initializes it and creates a memstore within the master server
[ https://issues.apache.org/jira/browse/HBASE-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243853#comment-13243853 ] stack commented on HBASE-5693: -- +1 on patch. It's silly that we initialize the region over on the master on creation. bq. CreateTableHandler isn't initializing the regions. Who will initialize them ? The regionserver, when it's assigned a region. bq. May be there is some thing more complex I didn't see, but at least all the unit tests went well. Nothing complex here. As for testing on a real cluster, that's not necessary for a patch this small. Unit tests run clusters anyway. Please make a patch that applies N and run it by hadoopqa. Thanks. When creating a region, the master initializes it and creates a memstore within the master server - Key: HBASE-5693 URL: https://issues.apache.org/jira/browse/HBASE-5693 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.96.0 Reporter: nkeywal Assignee: nkeywal Priority: Minor Attachments: 5693.v1.patch I didn't do a complete analysis, but the attached patch saves more than 0.25s for each region creation and locally all the unit tests work.
[jira] [Commented] (HBASE-5688) Convert zk root-region-server znode content to pb
[ https://issues.apache.org/jira/browse/HBASE-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243856#comment-13243856 ] Hadoop QA commented on HBASE-5688: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520807/5688v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.mapreduce.TestImportTsv org.apache.hadoop.hbase.mapred.TestTableMapReduce org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1359//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1359//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1359//console This message is automatically generated. Convert zk root-region-server znode content to pb - Key: HBASE-5688 URL: https://issues.apache.org/jira/browse/HBASE-5688 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Fix For: 0.96.0 Attachments: 5688.txt, 5688v4.txt Move the root-region-server znode content from the versioned bytes that ServerName.getVersionedBytes outputs to instead be pb. -- This message is automatically generated by JIRA. 
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243855#comment-13243855 ] Matteo Bertozzi commented on HBASE-5666: @Ted woo, good catch. I just translated the method without thinking, and this simplified version exposes a problem that was already present in the previous version. In the original version the LOG.info is under the if, fine, but what happens if the method returns and the znode is not available? No exception is raised. I think a caller of waitForXyz() expects an exception on timeout; otherwise the value it is waiting for must be present. (This function is only called by one test.) RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
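The contract discussed in this comment (either the awaited condition holds when the wait method returns, or the caller gets an exception on timeout) can be sketched as a small polling helper. This is an assumption-laden illustration, not the code in HBASE-5666-v2.patch: `WaitSketch`, `waitFor`, and `WaitTimeoutException` are hypothetical names, and the condition is a plain callback standing in for a znode existence check.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not taken from the HBase patch. */
final class WaitSketch {

    /** Thrown when the condition is still false at the deadline. */
    static final class WaitTimeoutException extends RuntimeException {
        WaitTimeoutException(String message) {
            super(message);
        }
    }

    /**
     * Polls the condition until it is true or maxTimeMs has elapsed.
     * Returning normally always means the condition was observed to be true,
     * so a caller may safely log success right after the call.
     */
    static void waitFor(BooleanSupplier condition, long maxTimeMs, long pollMs) {
        long deadline = System.nanoTime() + maxTimeMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return;
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new WaitTimeoutException(
                    "condition not met within " + maxTimeMs + " ms");
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Restore the flag and give up; the condition was never seen.
                Thread.currentThread().interrupt();
                throw new WaitTimeoutException("interrupted while waiting");
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in for "parent znode exists": becomes true after about 50 ms.
        waitFor(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("condition met");
    }
}
```

With this shape, a log statement placed after `waitFor(...)` is true by construction, which addresses the earlier review question about logging when the check returns -1.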
[jira] [Updated] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-5666: --- Attachment: HBASE-5666-v2.patch RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243863#comment-13243863 ] Ted Yu commented on HBASE-5666: --- Patch v2 looks good. RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Created] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 The getRowsWithColumnsTs() method in the Thrift interface only applies the timestamp if columns are explicitly specified. However, this method also allows columns to be left unspecified (this is even used internally to implement e.g. getRows()). The cause of the bug is a minor scoping issue: the time range is set inside the wrong if statement.
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243871#comment-13243871 ] Hadoop QA commented on HBASE-5666: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520825/HBASE-5666-v1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.util.TestHBaseFsck org.apache.hadoop.hbase.TestZooKeeper Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1360//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1360//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1360//console This message is automatically generated. 
RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wouter Bolsterlee updated HBASE-5694: - Status: Patch Available (was: Open) Trivial patch to fix the reported issue. getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2
[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243873#comment-13243873 ] Wouter Bolsterlee commented on HBASE-5694: -- For some reason, JIRA doesn't accept my patch file in the upload dialog. Here it is:
--- ThriftServer.java.orig 2012-04-01 23:41:16.881172406 +0200
+++ ThriftServer.java 2012-04-01 23:41:30.177238337 +0200
@@ -477,8 +477,8 @@
 get.addColumn(famAndQf[0], famAndQf[1]);
 }
 }
-get.setTimeRange(Long.MIN_VALUE, timestamp);
 }
+ get.setTimeRange(Long.MIN_VALUE, timestamp);
 gets.add(get);
 }
 Result[] result = table.get(gets);
getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2
[jira] [Commented] (HBASE-5682) Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only)
[ https://issues.apache.org/jira/browse/HBASE-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243874#comment-13243874 ] stack commented on HBASE-5682: -- On commit, change this '+ LOG.debug("Abort", t);' to include the passed-in msg? Else, +1 on the patch. Let me ask N if he thinks TRUNK can pick up anything from this patch (maybe his keepalive should do this auto-reconnect, but maybe it doesn't need it). What were you doing when it was taking a long time to recover? Allow HConnectionImplementation to recover from ZK connection loss (for 0.94 only) -- Key: HBASE-5682 URL: https://issues.apache.org/jira/browse/HBASE-5682 Project: HBase Issue Type: Improvement Components: client Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Critical Fix For: 0.94.0 Attachments: 5682-all-v2.txt, 5682-all.txt, 5682-v2.txt, 5682.txt Just realized that without this HBASE-4805 is broken. I.e. there's no point keeping a persistent HConnection around if it can be rendered permanently unusable when the ZK connection is lost temporarily. Note that this is fixed in 0.96 with HBASE-5399 (but that seems too big to backport).
[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs function Thrift service incorrectly handles time range
[ https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wouter Bolsterlee updated HBASE-5694: - Attachment: HBASE-5694.patch Okay, here's the patch. For some reason it works in Firefox, but not in Epiphany. Explanation for the patch: set time range regardless of column specification, making the time range actually work when no columns are specified. getRowsWithColumnsTs function Thrift service incorrectly handles time range --- Key: HBASE-5694 URL: https://issues.apache.org/jira/browse/HBASE-5694 Project: HBase Issue Type: Bug Components: thrift Affects Versions: 0.92.1 Reporter: Wouter Bolsterlee Fix For: 0.92.2 Attachments: HBASE-5694.patch
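To make the scoping fix concrete: the time range has to be applied outside the branch that handles explicitly listed columns, so it also takes effect when no columns are given. The sketch below illustrates that shape only. `FakeGet` is a stand-in written for this example, not the HBase client `Get` class, and `buildGet` is a hypothetical helper rather than the ThriftServer method; the `setTimeRange(Long.MIN_VALUE, timestamp)` call mirrors the patch line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the scoping fix; FakeGet is NOT the HBase Get class. */
final class TimeRangeSketch {

    /** Minimal stand-in for a read request with optional columns and a time range. */
    static final class FakeGet {
        final List<String> columns = new ArrayList<>();
        long minStamp = 0L;
        long maxStamp = Long.MAX_VALUE;

        void addColumn(String column) {
            columns.add(column);
        }

        void setTimeRange(long min, long max) {
            this.minStamp = min;
            this.maxStamp = max;
        }
    }

    /** Builds a request; columns may be null, meaning "all columns". */
    static FakeGet buildGet(List<String> columns, long timestamp) {
        FakeGet get = new FakeGet();
        if (columns != null) {
            for (String column : columns) {
                get.addColumn(column);
            }
        }
        // Outside the column branch, so the limit applies in both cases.
        get.setTimeRange(Long.MIN_VALUE, timestamp);
        return get;
    }

    public static void main(String[] args) {
        FakeGet allColumns = buildGet(null, 42L);
        FakeGet oneColumn = buildGet(Arrays.asList("f1:c1"), 42L);
        System.out.println(allColumns.maxStamp + " " + oneColumn.maxStamp);
    }
}
```

Moving the call into the `if (columns != null)` block would reproduce the reported behaviour, where a request without columns silently ignores the timestamp.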
[jira] [Commented] (HBASE-5663) MultithreadedTableMapper doesn't work.
[ https://issues.apache.org/jira/browse/HBASE-5663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243882#comment-13243882 ] Hadoop QA commented on HBASE-5663: -- +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520786/5663%2B5636.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 12 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1361//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1361//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1361//console This message is automatically generated. MultithreadedTableMapper doesn't work. 
-- Key: HBASE-5663 URL: https://issues.apache.org/jira/browse/HBASE-5663 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.94.0 Reporter: Takuya Ueshin Assignee: Takuya Ueshin Fix For: 0.94.0, 0.96.0 Attachments: 5663+5636.txt, HBASE-5663.patch MapReduce job using MultithreadedTableMapper goes down throwing the following Exception: {noformat} java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.init(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit) at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.init(MultithreadedTableMapper.java:260) at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper.run(MultithreadedTableMapper.java:133) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.mapreduce.Mapper$Context.init(org.apache.hadoop.conf.Configuration, org.apache.hadoop.mapred.TaskAttemptID, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordReader, org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapRecordWriter, org.apache.hadoop.hbase.mapreduce.TableOutputCommitter, 
org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$SubMapStatusReporter, org.apache.hadoop.hbase.mapreduce.TableSplit) at java.lang.Class.getConstructor0(Class.java:2706) at java.lang.Class.getConstructor(Class.java:1657) at org.apache.hadoop.hbase.mapreduce.MultithreadedTableMapper$MapRunner.init(MultithreadedTableMapper.java:241) ... 8 more {noformat} This occurs when the tasks are creating MapRunner threads.
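The trace above ends in `Class.getConstructor`, which looks up a public constructor by its exact declared parameter types. The message does not state the root cause of HBASE-5663, so the following is only a generic illustration of how such a lookup can fail when it is given the runtime class of an argument instead of the declared parameter type. `ContextLike` and `ConstructorLookupSketch` are hypothetical names with no relation to the Hadoop or HBase classes.

```java
import java.lang.reflect.Constructor;

/** Hypothetical target type; its constructor is declared against Number. */
class ContextLike {
    final Number value;

    public ContextLike(Number value) {
        this.value = value;
    }
}

class ConstructorLookupSketch {
    /** True if a public constructor with exactly these declared types exists. */
    static boolean hasConstructor(Class<?> type, Class<?>... parameterTypes) {
        try {
            Constructor<?> constructor = type.getConstructor(parameterTypes);
            return constructor != null;
        } catch (NoSuchMethodException e) {
            // Thrown when no constructor declares exactly these parameter types.
            return false;
        }
    }

    public static void main(String[] args) {
        Object argument = Integer.valueOf(42);
        // Looking up by the runtime class (Integer) does not match Number.
        System.out.println(hasConstructor(ContextLike.class, argument.getClass()));
        // Looking up by the declared parameter type succeeds.
        System.out.println(hasConstructor(ContextLike.class, Number.class));
    }
}
```

An `Integer` could be passed to the constructor in ordinary code, but the reflective lookup does not perform that widening, so the first lookup fails.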
[jira] [Commented] (HBASE-5665) Repeated split causes HRegionServer failures and breaks table
[ https://issues.apache.org/jira/browse/HBASE-5665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243898#comment-13243898 ] Hadoop QA commented on HBASE-5665: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520847/HBASE-5665-trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1362//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1362//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1362//console This message is automatically generated. Repeated split causes HRegionServer failures and breaks table -- Key: HBASE-5665 URL: https://issues.apache.org/jira/browse/HBASE-5665 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.92.0, 0.92.1 Reporter: Cosmin Lehene Assignee: Cosmin Lehene Priority: Blocker Attachments: HBASE-5665-0.92.patch, HBASE-5665-trunk.patch Repeated splits on large tables (2 consecutive would suffice) will essentially break the table (and the cluster), unrecoverable. The regionserver doing the split dies and the master will get into an infinite loop trying to assign regions that seem to have the files missing from HDFS. The table can be disabled once. upon trying to re-enable it, it will remain in an intermediary state forever. 
I was able to reproduce this on a smaller table consistently. {code} hbase(main):030:0 (0..1).each{|x| put 't1', #{x}, 'f1:t', 'dd'} hbase(main):030:0 (0..1000).each{|x| split 't1', #{x*10}} {code} Running overlapping splits in parallel (e.g. #{x*10+1}, #{x*10+2}... ) will reproduce the issue almost instantly and consistently. {code} 2012-03-28 10:57:16,320 INFO org.apache.hadoop.hbase.catalog.MetaEditor: Offlined parent region t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1. in META 2012-03-28 10:57:16,321 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Split requested for t1,5,1332957435767.648d30de55a5cec6fc2f56dcb3c7eee1.. compaction_queue=(0:1), split_queue=10 2012-03-28 10:57:16,343 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of t1,,1332957435767.2fb0473f4e71339e88dab0ee0d4dffa1.; Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 java.io.IOException: Failed ld2,60020,1332957343833-daughterOpener=2469c5650ea2aeed631eb85d3cdc3124 at org.apache.hadoop.hbase.regionserver.SplitTransaction.openDaughters(SplitTransaction.java:363) at org.apache.hadoop.hbase.regionserver.SplitTransaction.execute(SplitTransaction.java:451) at org.apache.hadoop.hbase.regionserver.SplitRequest.run(SplitRequest.java:67) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.FileNotFoundException: File does not exist: /hbase/t1/589c44cabba419c6ad8c9b427e5894e3.2fb0473f4e71339e88dab0ee0d4dffa1/f1/d62a852c25ad44e09518e102ca557237 at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1822) at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.init(DFSClient.java:1813) at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:544) at 
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:187) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:456) at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:341) at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.init(StoreFile.java:1008) at org.apache.hadoop.hbase.io.HalfStoreFileReader.init(HalfStoreFileReader.java:65) at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:467) at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:548) at
[jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
[ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243902#comment-13243902 ] Hadoop QA commented on HBASE-5666: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12520850/HBASE-5666-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.TestZooKeeper Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1363//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1363//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1363//console This message is automatically generated. 
RegionServer doesn't retry to check if base node is available - Key: HBASE-5666 URL: https://issues.apache.org/jira/browse/HBASE-5666 Project: HBase Issue Type: Bug Components: regionserver, zookeeper Reporter: Matteo Bertozzi Assignee: Matteo Bertozzi Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, hbase-1-regionserver.log, hbase-2-regionserver.log, hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log I have a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed = true) {code} $HBASE_HOME/bin/start-hbase.sh $HBASE_HOME/bin/local-regionservers.sh start 1 2 3 {code} but the region servers are not able to start... It seems that during RS start the znode is still not available, and HRegionServer.initializeZooKeeper() checks just once whether the base node is available. {code} 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the one configured in the master. 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server localhost,60202,133296824: Initialization of RS failed. Hence aborting RS. java.io.IOException: Received the shutdown message while waiting. at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626) at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596) at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558) at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672) at java.lang.Thread.run(Thread.java:662) {code}
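The behaviour asked for in HBASE-5666 (retry the base-node check instead of checking once) can be sketched as a bounded polling loop. This is a minimal sketch under stated assumptions, not the attached patch: `waitForBaseNode`, the `BooleanSupplier` standing in for the ZooKeeper existence check, and the attempt and sleep parameters are all hypothetical.

```java
import java.util.function.BooleanSupplier;

class BaseNodeRetrySketch {
    /**
     * Polls the supplied check until it succeeds or the attempts run out.
     * The supplier is a stand-in for "does the base znode exist yet?".
     */
    static boolean waitForBaseNode(BooleanSupplier baseNodeExists,
                                   int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (baseNodeExists.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and give up waiting.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulate a base node that only becomes visible on the third check.
        int[] checks = {0};
        boolean available = waitForBaseNode(() -> ++checks[0] >= 3, 5, 10L);
        System.out.println(available + " after " + checks[0] + " checks");
    }
}
```

A bounded loop keeps a genuinely misconfigured `zookeeper.znode.parent` from hanging the region server forever, while tolerating a master that is slightly slower to create the node.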
[jira] [Commented] (HBASE-5696) Use Hadoop's DataOutputOutputStream instead of have a copy local
[ https://issues.apache.org/jira/browse/HBASE-5696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243917#comment-13243917 ] stack commented on HBASE-5696: -- We have a DOOS and so does Hadoop. If I diff them, the Hadoop one is public where ours is not (but an incoming patch, the protobuf one in HBASE-5451, also makes ours public). Use Hadoop's DataOutputOutputStream instead of have a copy local Key: HBASE-5696 URL: https://issues.apache.org/jira/browse/HBASE-5696 Project: HBase Issue Type: Improvement Reporter: stack
[jira] [Created] (HBASE-5696) Use Hadoop's DataOutputOutputStream instead of have a copy local
Use Hadoop's DataOutputOutputStream instead of have a copy local Key: HBASE-5696 URL: https://issues.apache.org/jira/browse/HBASE-5696 Project: HBase Issue Type: Improvement Reporter: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-5695) Use Hadoop's DataOutputOutputStream instead of have a copy local
Use Hadoop's DataOutputOutputStream instead of have a copy local Key: HBASE-5695 URL: https://issues.apache.org/jira/browse/HBASE-5695 Project: HBase Issue Type: Improvement Reporter: stack -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5451) Switch RPC call envelope/headers to PBs
[ https://issues.apache.org/jira/browse/HBASE-5451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243919#comment-13243919 ] jirapos...@reviews.apache.org commented on HBASE-5451: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/4096/#review6613 --- Some more questions. Just being careful DD. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java https://reviews.apache.org/r/4096/#comment14285 We should just be using the hadoop DOOS... looks like no diff (when I diff them). I'll make an issue to remove. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java https://reviews.apache.org/r/4096/#comment14278 Is this written up anywhere? That its hrpc, then version, then a length, then a protobuf? I see it in the proto definition. That'll do. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java https://reviews.apache.org/r/4096/#comment14279 We have an issue for removing this Invocation stuff? http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto https://reviews.apache.org/r/4096/#comment14280 Should we just remove them in the next iteration on rpc since 0.96 is to be a singularity? Why even bother trying to keep compatibility w/ older clients? What is 'failure compatibility'? We are telling the client to go away, nicely (smile). What you think we should replace hrpc0x0005 with? this - these http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto https://reviews.apache.org/r/4096/#comment14281 How does RpcRequestWithHeaderProto relate to ConnectionHeaderProto? This text should say? Would be nice to have illustration on how the back and forth work. 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto https://reviews.apache.org/r/4096/#comment14282 We'll send this String each time? http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto https://reviews.apache.org/r/4096/#comment14283 Which part in here is the 'header'? How does it relate to ConnectionHeaderProto? request can be an Invocation/Writable? Or a protobuf? Do we need a length in here? http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto https://reviews.apache.org/r/4096/#comment14284 Should this precede the response? So if false, a response follows else an exception? Do we need a length here? Where is the header that the message name refers too? - Michael On 2012-03-30 23:29:32, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/4096/ bq. --- bq. bq. (Updated 2012-03-30 23:29:32) bq. bq. bq. Review request for Michael Stack and Benoit Sigoure. bq. bq. bq. Summary bq. --- bq. bq. Switch RPC call envelope/headers to PBs bq. bq. bq. This addresses bug HBASE-5451. bq. https://issues.apache.org/jira/browse/HBASE-5451 bq. bq. bq. Diffs bq. - bq. bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/DataOutputOutputStream.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/protobuf/generated/RPCMessageProtos.java PRE-CREATION bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/security/User.java 1307644 bq. http://svn.apache.org/repos/asf/hbase/trunk/src/main/protobuf/RPCMessageProto.proto PRE-CREATION bq. bq. 
Diff: https://reviews.apache.org/r/4096/diff bq. bq. bq. Testing bq. --- bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. Switch RPC call envelope/headers to PBs --- Key: HBASE-5451 URL: https://issues.apache.org/jira/browse/HBASE-5451 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Affects Versions: 0.94.0 Reporter: Todd Lipcon Assignee: Devaraj Das Fix For: 0.96.0 Attachments: rpc-proto.2.txt, rpc-proto.3.txt,
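The review comment on HBaseClient.java describes the connection preamble as "hrpc, then version, then a length, then a protobuf". A rough Java sketch of that ordering follows. The field widths are assumptions made for illustration (a one-byte version and a four-byte big-endian length); the real layout lives in the patch under review and in RPCMessageProto.proto, which this message does not reproduce. `RpcPreambleSketch` and `frame` are hypothetical names, and the payload is treated as opaque bytes rather than a real protobuf message.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class RpcPreambleSketch {
    static final byte[] MAGIC = "hrpc".getBytes(StandardCharsets.US_ASCII);

    /**
     * Writes magic, version, payload length, then the payload itself.
     * serializedMessage stands in for an already-serialized protobuf.
     */
    static byte[] frame(byte version, byte[] serializedMessage) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.write(MAGIC);                        // "hrpc"
            out.writeByte(version);                  // assumed: one byte
            out.writeInt(serializedMessage.length);  // assumed: 4-byte length
            out.write(serializedMessage);            // the protobuf bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] framed = frame((byte) 5, new byte[] {9, 8});
        System.out.println(framed.length);
    }
}
```

A length prefix of this kind is what lets the receiver know how many bytes to read before handing them to the protobuf parser, which is the concern behind the "Do we need a length in here?" questions in the review.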
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243942#comment-13243942 ] Jonathan Hsieh commented on HBASE-5680: --- Tried it again, and actually -- if you recompile using -Dhadoop.profile=23 without the security profile the Master comes up and does not encounter the problem. (I probably had the wrong hadoop jars in my hbase classpath). So it boils down to needing to recompile hbase against hadoop 23. Maybe we should catch this exception and warn the user to recompile HBase, or possibly put out yet another package that is compiled against 23. Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1 -- Key: HBASE-5680 URL: https://issues.apache.org/jira/browse/HBASE-5680 Project: HBase Issue Type: Bug Components: master Reporter: Kristam Subba Swathi Hmaster is not able to start because of the following error Please find the following error 2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. 
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524) at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324) at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127) at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:112) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) ... 7 more There is a change in the FSConstants -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
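Jonathan's suggestion to "catch this exception and warn the user to recompile HBase" could look roughly like the sketch below. It is not code from HBase: `LinkageHintSketch` and `callWithRecompileHint` are hypothetical names, the `Runnable` stands in for the HDFS-dependent startup step, and the wording of the hint is made up for illustration.

```java
class LinkageHintSketch {
    /**
     * Runs a step that depends on Hadoop classes and converts a linkage
     * failure into a message pointing at a likely build mismatch.
     */
    static String callWithRecompileHint(Runnable hdfsDependentStep) {
        try {
            hdfsDependentStep.run();
            return "ok";
        } catch (NoClassDefFoundError e) {
            return "Class not found at runtime: " + e.getMessage()
                + ". HBase may have been built against a different Hadoop version;"
                + " consider rebuilding with the matching profile"
                + " (for example -Dhadoop.profile=23).";
        }
    }

    public static void main(String[] args) {
        // Simulate the failure from the report without needing Hadoop jars.
        String result = callWithRecompileHint(() -> {
            throw new NoClassDefFoundError(
                "org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction");
        });
        System.out.println(result);
    }
}
```

Catching an `Error` is normally discouraged; here it is limited to one call site and used only to produce a clearer diagnostic.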
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243948#comment-13243948 ] binlijin commented on HBASE-5443: - Hi guys, I have a question: why choose PB? Why not Avro or Thrift? Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf
[jira] [Updated] (HBASE-5644) [findbugs] Fix null pointer warnings.
[ https://issues.apache.org/jira/browse/HBASE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HBASE-5644: --- Attachment: NullPointerFindBugs_Analysis.xlsx [findbugs] Fix null pointer warnings. - Key: HBASE-5644 URL: https://issues.apache.org/jira/browse/HBASE-5644 Project: HBase Issue Type: Sub-task Components: scripts Reporter: Jonathan Hsieh Assignee: Uma Maheswara Rao G Attachments: NullPointerFindBugs_Analysis.xlsx See https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Fix the NP category -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5644) [findbugs] Fix null pointer warnings.
[ https://issues.apache.org/jira/browse/HBASE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HBASE-5644: --- Attachment: HBASE-5644.patch Attached the patch and analysis sheet. [findbugs] Fix null pointer warnings. - Key: HBASE-5644 URL: https://issues.apache.org/jira/browse/HBASE-5644 Project: HBase Issue Type: Sub-task Components: scripts Reporter: Jonathan Hsieh Assignee: Uma Maheswara Rao G Attachments: HBASE-5644.patch, NullPointerFindBugs_Analysis.xlsx See https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Fix the NP category -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.
[ https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243952#comment-13243952 ] Benoit Sigoure commented on HBASE-5436: --- Ping? Can we get this trivial change in the next point release of 0.92.x? Right-size the map when reading attributes. --- Key: HBASE-5436 URL: https://issues.apache.org/jira/browse/HBASE-5436 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Trivial Labels: performance Fix For: 0.94.0 Attachments: 0001-Right-size-the-map-when-reading-attributes.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
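The patch for HBASE-5436 is not included in this thread, so the following is only a sketch of the general idea in the issue title: when the number of attributes is known before they are read, the map can be allocated with a matching capacity instead of the default. `RightSizedMapSketch`, the wire layout (count, then name, length and value per entry) and the capacity formula based on `HashMap`'s default load factor of 0.75 are all assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class RightSizedMapSketch {
    /** Serializes attributes as: count, then (name, length, bytes) per entry. */
    static byte[] writeAttributes(Map<String, byte[]> attributes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(attributes.size());
            for (Map.Entry<String, byte[]> entry : attributes.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue().length);
                out.write(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Reads attributes into a map sized from the leading count. */
    static Map<String, byte[]> readAttributes(byte[] serialized) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(serialized))) {
            int count = in.readInt();
            // Capacity chosen so that 'count' entries fit without a resize
            // at the default load factor of 0.75.
            Map<String, byte[]> attributes =
                new HashMap<>((int) (count / 0.75f) + 1);
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                byte[] value = new byte[in.readInt()];
                in.readFully(value);
                attributes.put(name, value);
            }
            return attributes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For the small attribute counts typical of a single operation the saving is modest per call, which fits the issue's "Trivial" priority and "performance" label.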
[jira] [Commented] (HBASE-5443) Add PB-based calls to HRegionInterface
[ https://issues.apache.org/jira/browse/HBASE-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243954#comment-13243954 ] Jimmy Xiang commented on HBASE-5443: The main reason is that the HBase writable RPC already supports pb. Hadoop uses pb too. Add PB-based calls to HRegionInterface -- Key: HBASE-5443 URL: https://issues.apache.org/jira/browse/HBASE-5443 Project: HBase Issue Type: Task Components: ipc, master, migration, regionserver Reporter: Todd Lipcon Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: region_java-proto-mapping.pdf -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5436) Right-size the map when reading attributes.
[ https://issues.apache.org/jira/browse/HBASE-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243958#comment-13243958 ] Ted Yu commented on HBASE-5436: --- Integrated to 0.92 branch. Right-size the map when reading attributes. --- Key: HBASE-5436 URL: https://issues.apache.org/jira/browse/HBASE-5436 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Benoit Sigoure Assignee: Benoit Sigoure Priority: Trivial Labels: performance Fix For: 0.94.0 Attachments: 0001-Right-size-the-map-when-reading-attributes.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243967#comment-13243967 ] ramkrishna.s.vasudevan commented on HBASE-5680: --- @Jon Let me try once again. I tried compiling using the 23 profile. Maybe I missed something. Will try again and post an update. Thanks Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1 -- Key: HBASE-5680 URL: https://issues.apache.org/jira/browse/HBASE-5680 Project: HBase Issue Type: Bug Components: master Reporter: Kristam Subba Swathi Hmaster is not able to start because of the following error Please find the following error 2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown. java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524) at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324) at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127) at org.apache.hadoop.hbase.master.MasterFileSystem.init(MasterFileSystem.java:112) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction at java.net.URLClassLoader$1.run(URLClassLoader.java:202) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:190) at java.lang.ClassLoader.loadClass(ClassLoader.java:306) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301) at java.lang.ClassLoader.loadClass(ClassLoader.java:247) ... 7 more There is a change in the FSConstants
[jira] [Updated] (HBASE-5644) [findbugs] Fix null pointer warnings.
[ https://issues.apache.org/jira/browse/HBASE-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HBASE-5644: --- Status: Patch Available (was: Open) [findbugs] Fix null pointer warnings. - Key: HBASE-5644 URL: https://issues.apache.org/jira/browse/HBASE-5644 Project: HBase Issue Type: Sub-task Components: scripts Reporter: Jonathan Hsieh Assignee: Uma Maheswara Rao G Attachments: HBASE-5644.patch, NullPointerFindBugs_Analysis.xlsx See https://builds.apache.org/job/PreCommit-HBASE-Build/1313//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html Fix the NP category -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243970#comment-13243970 ]

Jonathan Hsieh edited comment on HBASE-5680 at 4/2/12 4:33 AM:
---

@Ram -- when I did it I used 'mvn package -DskipTests -Dhadoop.profile=23' and then ran from a copy of the dir generated in target/hbase-xxx/hbase-xxx.
If you run from the directory you ran the mvn command in, I think the hbase scripts will pick up hbase from that dir, or possibly the 1.0.0 version from the ~/.m2 dir. I think this is what caught me the first time I tried this.

was (Author: jmhsieh):
@Ram -- when I did it i used 'mvn package -DskipTests -Dhadoop.profile=23' and then ran from a copy of the dir generated in target/hbase-xxx/hbase-xxx.
If you run from the directory you ran the mvn command in, I think hbase will scripts will picked up hbase from that dir, or possibly the 1.0.0 version from the ~/.m2 dir. I think this is what caught me the first time I tried this.

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
---
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

HMaster is not able to start because of the following error.
Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants.
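One way to confirm which Hadoop jars a given classpath actually resolves is to probe for the class named in the trace. The following is only a minimal sketch: `ClassProbe` and `firstLoadable` are hypothetical names, and the second candidate (`HdfsConstants$SafeModeAction`) is an assumption about what replaced `FSConstants` in Hadoop 0.23, which this thread does not state.

```java
public class ClassProbe {

    /**
     * Returns the first class name that the current classpath can load,
     * or null if none of the candidates can be loaded.
     */
    static String firstLoadable(String... classNames) {
        ClassLoader loader = ClassProbe.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only check that the class resolves,
                // do not run its static initializers.
                Class.forName(name, false, loader);
                return name;
            } catch (ClassNotFoundException | LinkageError e) {
                // Not available on this classpath; try the next candidate.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String found = firstLoadable(
            // The class the stack trace reports as missing.
            "org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction",
            // Assumed 0.23 name; verify against the Hadoop jars in use.
            "org.apache.hadoop.hdfs.protocol.HdfsConstants$SafeModeAction");
        System.out.println(found == null
            ? "neither class is on the classpath"
            : "found " + found);
    }
}
```

Run with the same classpath the master uses (for example the output of `hbase classpath`); if only the second name resolves, the HBase classes in use were probably compiled against the older Hadoop API.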
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243970#comment-13243970 ]

Jonathan Hsieh commented on HBASE-5680:
---

@Ram -- when I did it I used 'mvn package -DskipTests -Dhadoop.profile=23' and then ran from a copy of the dir generated in target/hbase-xxx/hbase-xxx.
If you run from the directory you ran the mvn command in, I think the hbase scripts will pick up hbase from that dir, or possibly the 1.0.0 version from the ~/.m2 dir. I think this is what caught me the first time I tried this.

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
---
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

HMaster is not able to start because of the following error.
Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants.
[jira] [Created] (HBASE-5697) Audit HBase for usage of deprecated hadoop 0.20.x property names.
Audit HBase for usage of deprecated hadoop 0.20.x property names.
---
Key: HBASE-5697
URL: https://issues.apache.org/jira/browse/HBASE-5697
Project: HBase
Issue Type: Task
Reporter: Jonathan Hsieh

Many xml config properties in Hadoop have changed in 0.23. We should audit hbase to insulate it from hadoop property name changes.
Here is a list of the hadoop property name changes:
http://hadoop.apache.org/common/docs/r0.23.1/hadoop-project-dist/hadoop-common/DeprecatedProperties.html
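As a rough illustration of what such an audit could look like, the sketch below checks a list of property names against a small rename table. `DeprecatedKeyAudit` and `audit` are hypothetical names, and the four renames shown are examples recalled from the Hadoop deprecation list; the linked page remains the authoritative source and should be used to build the real table.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class DeprecatedKeyAudit {

    // Old name -> new name. A small illustrative subset only; check each
    // entry against the DeprecatedProperties page before relying on it.
    static final Map<String, String> RENAMED = new HashMap<>();
    static {
        RENAMED.put("fs.default.name", "fs.defaultFS");
        RENAMED.put("dfs.data.dir", "dfs.datanode.data.dir");
        RENAMED.put("dfs.name.dir", "dfs.namenode.name.dir");
        RENAMED.put("mapred.job.tracker", "mapreduce.jobtracker.address");
    }

    /** Returns the deprecated keys found in usedKeys, mapped to their replacements. */
    static Map<String, String> audit(Collection<String> usedKeys) {
        Map<String, String> hits = new TreeMap<>(); // sorted for stable output
        for (String key : usedKeys) {
            String replacement = RENAMED.get(key);
            if (replacement != null) {
                hits.put(key, replacement);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical list of property names referenced somewhere in a code base.
        Map<String, String> hits =
            audit(Arrays.asList("fs.default.name", "hbase.rootdir", "dfs.data.dir"));
        System.out.println(hits);
        // prints {dfs.data.dir=dfs.datanode.data.dir, fs.default.name=fs.defaultFS}
    }
}
```

The input list would in practice come from grepping the source tree for string literals passed to configuration getters and setters; that extraction step is left out here.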
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243974#comment-13243974 ]

ramkrishna.s.vasudevan commented on HBASE-5680:
---

@Jon
I used 'mvn package -DskipTests -Dhadoop.profile=23' to generate a jar of 0.94.
We have a 0.94 installation, and it already has only the 0.23.1 hadoop jars in the classpath.
We replaced the jar with the one created in step 1 and tried again, but ended up getting the same exception.
Are we missing something here?

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
---
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

HMaster is not able to start because of the following error.
Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants.
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243976#comment-13243976 ]

Jonathan Hsieh commented on HBASE-5680:
---

Hm.. I didn't add all the hadoop 23 jars -- I added:
hadoop-auth-*.jar
hadoop-common-*.jar
hadoop-hdfs-*.jar

I didn't add any of the mapreduce or yarn jars; maybe that has something to do with it. A lot of the recompile was due to changes in MR2 (classes turned into interfaces, and shims to allow compilation). Can you try moving those jars out of the way?

By just including those jars, I've run some recompiled mr jobs using a command line like this:
HADOOP_HOME=`hbase classpath` hadoop jar xxx.jar Class args..

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
---
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

HMaster is not able to start because of the following error.
Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants.
[jira] [Commented] (HBASE-5689) Skipping RecoveredEdits may cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243983#comment-13243983 ]

ramkrishna.s.vasudevan commented on HBASE-5689:
---

@Chunhui
+1 on the patch, Chunhui. I was a bit hesitant to change that main logic, but after checking different scenarios, this change is necessary both to keep performance and to address this JIRA.
Good one. Thanks a lot. :)

Skipping RecoveredEdits may cause data loss
---
Key: HBASE-5689
URL: https://issues.apache.org/jira/browse/HBASE-5689
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.94.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
Fix For: 0.94.0
Attachments: 5689-simplified.txt, 5689-testcase.patch, HBASE-5689.patch, HBASE-5689.patch

Let's see the following scenario:
1. Region is on server A
2. put KV(r1-v1) to the region
3. move region from server A to server B
4. put KV(r2-v2) to the region
5. move region from server B to server A
6. put KV(r3-v3) to the region
7. kill -9 server B and start it
8. kill -9 server A and start it
9. scan the region; we only get two KVs (r1-v1, r2-v2), and the third KV (r3-v3) is lost.

Let's analyse the above scenario from the code:
1. the edit logs of KV(r1-v1) and KV(r3-v3) are both recorded in the same hlog file on server A.
2. when we split server B's hlog file in the process of ServerShutdownHandler, we create one RecoveredEdits file f1 for the region.
3. when we split server A's hlog file in the process of ServerShutdownHandler, we create another RecoveredEdits file f2 for the region.
4. however, RecoveredEdits file f2 will be skipped when initializing the region.

HRegion#replayRecoveredEditsIfAny
{code}
for (Path edits: files) {
  if (edits == null || !this.fs.exists(edits)) {
    LOG.warn("Null or non-existent edits file: " + edits);
    continue;
  }
  if (isZeroLengthThenDelete(this.fs, edits)) continue;
  if (checkSafeToSkip) {
    Path higher = files.higher(edits);
    long maxSeqId = Long.MAX_VALUE;
    if (higher != null) {
      // Edit file name pattern, HLog.EDITFILES_NAME_PATTERN: "-?[0-9]+"
      String fileName = higher.getName();
      maxSeqId = Math.abs(Long.parseLong(fileName));
    }
    if (maxSeqId <= minSeqId) {
      String msg = "Maximum possible sequenceid for this log is " + maxSeqId
          + ", skipped the whole file, path=" + edits;
      LOG.debug(msg);
      continue;
    } else {
      checkSafeToSkip = false;
    }
  }
{code}
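The skip rule in the excerpt can be replayed on toy numbers to see how f2 gets dropped. This is only a sketch under stated assumptions: the class and method names are hypothetical, files are reduced to the sequence id in their name, and the ids (f2 named 1 but also holding the edit with id 3, f1 named 2, region already flushed up to id 2) are made up for illustration; they are not taken from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class RecoveredEditsSkipSketch {

    /**
     * Applies the same skip rule as the excerpt: a file is skipped when the
     * name of the next higher file (taken as an upper bound on this file's
     * sequence ids) is <= the sequence id the region has already persisted.
     * Returns the names of the files that would be skipped.
     */
    static List<Long> skippedFiles(NavigableSet<Long> fileNames, long minSeqId) {
        List<Long> skipped = new ArrayList<>();
        boolean checkSafeToSkip = true;
        for (Long name : fileNames) {
            if (checkSafeToSkip) {
                Long higher = fileNames.higher(name);
                long maxSeqId = (higher == null) ? Long.MAX_VALUE : Math.abs(higher);
                if (maxSeqId <= minSeqId) {
                    skipped.add(name);   // whole file bypassed
                    continue;
                } else {
                    checkSafeToSkip = false;
                }
            }
            // In the real code the file's edits would be replayed here.
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Assumed ids: f2 (split from server A's hlog) is named 1 but holds
        // edits 1 and 3; f1 (split from server B's hlog) is named 2.
        NavigableSet<Long> files = new TreeSet<>(Arrays.asList(1L, 2L));
        long minSeqId = 2L; // region store files already cover ids up to 2
        System.out.println(skippedFiles(files, minSeqId)); // prints [1]
    }
}
```

With these numbers the file named 1 is skipped because the next file's name (2) is treated as its upper bound, even though it also contains the later edit; that mirrors the loss of KV(r3-v3) described in the scenario.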
[jira] [Commented] (HBASE-5680) Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
[ https://issues.apache.org/jira/browse/HBASE-5680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243985#comment-13243985 ]

ramkrishna.s.vasudevan commented on HBASE-5680:
---

@Jon
It's working now. I cleared all the classes inside target and regenerated.
Thanks Jon.

Hbase94 and Hbase 92.2 is not compatible with the Hadoop 23.1
---
Key: HBASE-5680
URL: https://issues.apache.org/jira/browse/HBASE-5680
Project: HBase
Issue Type: Bug
Components: master
Reporter: Kristam Subba Swathi

HMaster is not able to start because of the following error.
Please find the following error:

2012-03-30 11:12:19,487 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/protocol/FSConstants$SafeModeAction
    at org.apache.hadoop.hbase.util.FSUtils.waitOnSafeMode(FSUtils.java:524)
    at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:324)
    at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:127)
    at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:112)
    at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:496)
    at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:363)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.protocol.FSConstants$SafeModeAction
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 7 more

There is a change in the FSConstants.