[jira] [Reopened] (HBASE-5809) Avoid move api to take the destination server same as the source server.
[ https://issues.apache.org/jira/browse/HBASE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5809: --- The test failure was visible on Hadoop QA as well: https://builds.apache.org/job/PreCommit-HBASE-Build/1594//testReport/org.apache.hadoop.hbase.coprocessor/TestMasterObserver/testRegionTransitionOperations/ Avoid move api to take the destination server same as the source server. Key: HBASE-5809 URL: https://issues.apache.org/jira/browse/HBASE-5809 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1 Reporter: ramkrishna.s.vasudevan Assignee: rajeshbabu Priority: Minor Labels: client Fix For: 0.96.0 Attachments: HBASE-5809.patch, HBASE-5809.patch Currently, move accepts any destination that is specified, and if the destination is the same as the source we still unassign and reassign the region. This can cause a RegionAlreadyInTransitionException and leave the region hanging in RIT (region in transition) for a long time. We can avoid this by not allowing the move when the destination is the same as the source. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
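The guard proposed in HBASE-5809 might look roughly like the sketch below. It is only an illustration of the idea, not the attached patch: `MoveGuard`, `shouldMove`, and the plain-string server names are hypothetical, and treating a null destination as "let the master choose" is an assumption of this sketch.

```java
import java.util.Objects;

/** Hypothetical helper for illustration; not part of HBase. */
final class MoveGuard {
    private MoveGuard() {}

    /**
     * Returns true only when a move would actually relocate the region.
     * Assumption: a null destination means "no specific server requested",
     * so the move is allowed to proceed.
     */
    static boolean shouldMove(String sourceServer, String destinationServer) {
        if (destinationServer == null) {
            return true;
        }
        // Skip the unassign/assign cycle when the region is already on the destination.
        return !Objects.equals(sourceServer, destinationServer);
    }
}
```

A caller would check `shouldMove(...)` before starting the unassign, so a no-op move never puts the region in transition.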
[jira] [Reopened] (HBASE-5824) HRegion.incrementColumnValue is not used in trunk
[ https://issues.apache.org/jira/browse/HBASE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5824: --- I can reproduce one of the test failures reported by Hadoop QA:
{code}
testConstraintFails(org.apache.hadoop.hbase.constraint.TestConstraint) Time elapsed: 3.174 sec ERROR!
org.apache.hadoop.hbase.constraint.ConstraintException: org.apache.hadoop.hbase.constraint.ConstraintException: AllFailConstraint fails for all puts
...
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.constraint.ConstraintException: AllFailConstraint fails for all puts
at org.apache.hadoop.hbase.constraint.AllFailConstraint.check(AllFailConstraint.java:29)
at org.apache.hadoop.hbase.constraint.ConstraintProcessor.prePut(ConstraintProcessor.java:87)
at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.prePut(RegionCoprocessorHost.java:656)
at org.apache.hadoop.hbase.regionserver.HRegion.internalPut(HRegion.java:2434)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1891)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1857)
at org.apache.hadoop.hbase.regionserver.RegionServer.mutate(RegionServer.java:523)
{code}
HRegion.incrementColumnValue is not used in trunk - Key: HBASE-5824 URL: https://issues.apache.org/jira/browse/HBASE-5824 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: hbase-5824.patch, hbase-5824_v2.patch On 0.94, a call to client.HTable#incrementColumnValue invokes HRegion#incrementColumnValue. On trunk, all calls to HTable.incrementColumnValue go to HRegion#increment. My guess is that HTable#incrementColumnValue and HTable#increment serialize to the same thing over the wire, so the remote HRegionServer no longer knows which HTable method was called. To reproduce, I checked out trunk, put a breakpoint in HRegion#incrementColumnValue, and ran TestFromClientSide. The breakpoint wasn't hit.
[jira] [Reopened] (HBASE-5620) Convert the client protocol of HRegionInterface to PB
[ https://issues.apache.org/jira/browse/HBASE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5620: --- As you can see in https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK-security/171/, there were 12 more test failures. To narrow down the check-in that caused the additional test failures, I locally backed out HBASE-5684 and HBASE-5747. TestAccessControlFilter#testQualifierAccess still failed with an access permission error. After backing out this JIRA, the test passed. Convert the client protocol of HRegionInterface to PB - Key: HBASE-5620 URL: https://issues.apache.org/jira/browse/HBASE-5620 Project: HBase Issue Type: Sub-task Components: ipc, master, migration, regionserver Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.96.0 Attachments: hbase-5620_v3.patch, hbase-5620_v4.patch, hbase-5620_v4.patch
[jira] [Reopened] (HBASE-5747) Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test
[ https://issues.apache.org/jira/browse/HBASE-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5747: --- Trunk builds #2757 to #2760 all failed due to a hanging test. Forward port hbase-5708 [89-fb] Make MiniMapRedCluster directory a subdirectory of target/test Key: HBASE-5747 URL: https://issues.apache.org/jira/browse/HBASE-5747 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Priority: Blocker Fix For: 0.96.0 Attachments: 5474.txt, 5474v2.txt, 5474v3 (1).txt, 5474v3.txt, 5708v4.txt, 5708v4.txt Forward port as much as we can of Mikhail's hard-won test cleanups over on the 0.89 branch. This will improve our ability to run unit tests in parallel. He also found a few bugs.
[jira] [Reopened] (HBASE-5727) secure hbase build broke because of 'HBASE-5451 Switch RPC call envelope/headers to PBs'
[ https://issues.apache.org/jira/browse/HBASE-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5727: --- Compilation failed because of an unresolved conflict in security/src/main/java/org/apache/hadoop/hbase/ipc/SecureClient.java: {code} .mine final DataOutputBuffer d = new DataOutputBuffer(); {code} secure hbase build broke because of 'HBASE-5451 Switch RPC call envelope/headers to PBs' Key: HBASE-5727 URL: https://issues.apache.org/jira/browse/HBASE-5727 Project: HBase Issue Type: Bug Reporter: stack Assignee: Devaraj Das Priority: Blocker Fix For: 0.96.0 Attachments: 5727.1.patch, 5727.2.patch, 5727.patch If you build with the security profile -- i.e. add '-P security' on the command line -- you'll see that the secure build is broken since we messed with rpc. Assigning Devaraj to take a look. If you can't work on this now DD, just give it back to me and I'll have a go at it. Thanks.
[jira] [Reopened] (HBASE-5727) secure hbase build broke because of 'HBASE-5451 Switch RPC call envelope/headers to PBs'
[ https://issues.apache.org/jira/browse/HBASE-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5727: --- The compilation issue is real. secure hbase build broke because of 'HBASE-5451 Switch RPC call envelope/headers to PBs' Key: HBASE-5727 URL: https://issues.apache.org/jira/browse/HBASE-5727 Project: HBase Issue Type: Bug Reporter: stack Assignee: Devaraj Das Priority: Blocker Fix For: 0.96.0 Attachments: 5727.1.patch, 5727.2.patch, 5727.patch If you build with the security profile -- i.e. add '-P security' on the command line -- you'll see that the secure build is broken since we messed with rpc. Assigning Devaraj to take a look. If you can't work on this now DD, just give it back to me and I'll have a go at it. Thanks.
[jira] [Reopened] (HBASE-5213) hbase master stop does not bring down backup masters
[ https://issues.apache.org/jira/browse/HBASE-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5213: --- TestLogRolling hangs in 0.90 hbase master stop does not bring down backup masters -- Key: HBASE-5213 URL: https://issues.apache.org/jira/browse/HBASE-5213 Project: HBase Issue Type: Bug Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0 Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: 5213.jstack, HBASE-5213-v0-trunk.patch, HBASE-5213-v1-trunk.patch, HBASE-5213-v2-90.patch, HBASE-5213-v2-92.patch, HBASE-5213-v2-trunk.patch Typing hbase master stop produces the following message: stop Start cluster shutdown; Master signals RegionServer shutdown It seems like backup masters should be considered part of the cluster, but they are not brought down by hbase master stop. stop-hbase.sh does correctly bring down the backup masters. The same behavior is observed when a client app makes use of the client API HBaseAdmin.shutdown() http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#shutdown() -- this isn't too surprising since I think hbase master stop just calls this API. It seems like HBASE-1448 addressed this; perhaps there was a regression?
[jira] [Reopened] (HBASE-5724) Row cache of KeyValue should be cleared in readFields().
[ https://issues.apache.org/jira/browse/HBASE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5724: --- 0.90 branch doesn't compile now: {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.0.2:testCompile (default-testCompile) on project hbase: Compilation failure [ERROR] /x1/jenkins/jenkins-slave/workspace/hbase-0.90/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java:[404,64] cannot find symbol [ERROR] symbol : variable WritableUtils [ERROR] location: class org.apache.hadoop.hbase.TestKeyValue {code} Row cache of KeyValue should be cleared in readFields(). Key: HBASE-5724 URL: https://issues.apache.org/jira/browse/HBASE-5724 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: Teruyoshi Zenmyo Assignee: Teruyoshi Zenmyo Fix For: 0.90.7, 0.92.2, 0.94.0 Attachments: 5724.092.txt, HBASE-5724.txt, HBASE-5724v2.txt KeyValue does not clear its row cache when reading new values (readFields()). Therefore, if a KeyValue (kv) that has cached its row bytes reads another KeyValue instance, kv.getRow() returns the wrong value.
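The stale-cache problem described in HBASE-5724 can be reproduced in miniature with the sketch below. `CachedRowRecord` is a hypothetical stand-in, not the real KeyValue class: it only assumes a length-prefixed byte payload, a lazily cached `getRow()`, and a `readFields(DataInput)` that reuses the instance. The single line that resets the cache is the kind of change the issue title asks for.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for a reusable record that lazily caches a derived field. */
final class CachedRowRecord {
    private byte[] bytes = new byte[0];
    private byte[] rowCache; // filled on the first getRow() call

    byte[] getRow() {
        if (rowCache == null) {
            rowCache = Arrays.copyOf(bytes, bytes.length);
        }
        return rowCache;
    }

    /** Overwrites this instance from a length-prefixed payload. */
    void readFields(DataInput in) throws IOException {
        byte[] fresh = new byte[in.readInt()];
        in.readFully(fresh);
        bytes = fresh;
        rowCache = null; // without this reset, getRow() keeps returning the previous row
    }

    /** Demo convenience: frames the row as [int length][bytes] and feeds it to readFields(). */
    void load(String row) {
        byte[] payload = row.getBytes(StandardCharsets.UTF_8);
        byte[] framed = ByteBuffer.allocate(4 + payload.length)
            .putInt(payload.length)
            .put(payload)
            .array();
        try {
            readFields(new DataInputStream(new ByteArrayInputStream(framed)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String rowAsString() {
        return new String(getRow(), StandardCharsets.UTF_8);
    }
}
```

Loading one row, calling `getRow()`, and then loading a second row into the same instance is enough to expose the bug if the reset line is removed.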
[jira] [Reopened] (HBASE-5711) Tests are failing with incorrect data directory permissions.
[ https://issues.apache.org/jira/browse/HBASE-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5711: --- @ 04/Apr/12 18:30, TestRowProcessorEndpoint was reported by Hadoop QA to fail. After I reverted the patch, TestRowProcessorEndpoint passed on trunk. Tests are failing with incorrect data directory permissions. Key: HBASE-5711 URL: https://issues.apache.org/jira/browse/HBASE-5711 Project: HBase Issue Type: Bug Components: test Reporter: Uma Maheswara Rao G Assignee: Uma Maheswara Rao G Fix For: 0.92.2, 0.94.0 Attachments: HBASE-5711.patch When we run some tests in Hbase (TestAdmin), it is failing with following error. {quote} Starting DataNode 0 with dfs.data.dir: E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data1,E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb\dfs\data\data2 2012-04-04 18:04:51,036 WARN [main] impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-datanode.properties, hadoop-metrics2.properties 2012-04-04 18:04:51,255 WARN [main] datanode.DataNode(1548): Invalid directory in dfs.data.dir: Incorrect permission for E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data1, expected: rwxr-xr-x, while actual: rwx-- 2012-04-04 18:04:51,411 WARN [main] datanode.DataNode(1548): Invalid directory in dfs.data.dir: Incorrect permission for E:/Repositories/Hbase/target/test-data/5ff23198-892e-4f1c-8022-b3d9969fcf0b/dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb/dfs/data/data2, expected: rwxr-xr-x, while actual: rwx-- 2012-04-04 18:04:51,411 ERROR [main] datanode.DataNode(1554): All directories in dfs.data.dir are invalid. 
2012-04-04 18:04:51,411 INFO [main] hbase.HBaseTestingUtility(684): Shutting down minicluster 2012-04-04 18:04:51,646 WARN [main] hbase.HBaseTestingUtility(696): Failed delete of E:\Repositories\Hbase\target\test-data\5ff23198-892e-4f1c-8022-b3d9969fcf0b\dfscluster_0ecc6984-1925-4870-ac7c-439fceede4cb 2012-04-04 18:04:51,646 INFO [main] hbase.HBaseTestingUtility(700): Minicluster is down {quote}
[jira] [Reopened] (HBASE-5673) The OOM problem of IPC client call cause all handle block
[ https://issues.apache.org/jira/browse/HBASE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5673: --- The patch applies cleanly to TRUNK. It was imprudent to integrate the patch without going through a QA cycle. Now all branches are broken. The OOM problem of IPC client call cause all handle block -- Key: HBASE-5673 URL: https://issues.apache.org/jira/browse/HBASE-5673 Project: HBase Issue Type: Bug Affects Versions: 0.90.6 Environment: 0.90.6 Reporter: xufeng Assignee: xufeng Fix For: 0.90.7, 0.92.2, 0.94.1 Attachments: HBASE-5673-90-V2.patch, HBASE-5673-90.patch If HBaseClient hits an "unable to create new native thread" exception, the call never completes because it is lost in the calls queue.
[jira] [Reopened] (HBASE-5564) Bulkload is discarding duplicate records
[ https://issues.apache.org/jira/browse/HBASE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5564: --- By reverting the patch applied to trunk, TestImportTsv#testMROnTableWithCustomMapper passes. Bulkload is discarding duplicate records Key: HBASE-5564 URL: https://issues.apache.org/jira/browse/HBASE-5564 Project: HBase Issue Type: Bug Components: mapreduce Affects Versions: 0.96.0 Environment: HBase 0.92 Reporter: Laxman Assignee: Laxman Labels: bulkloader Fix For: 0.96.0 Attachments: 5564.lint, 5564v5.txt, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.1.patch, HBASE-5564_trunk.2.patch, HBASE-5564_trunk.3.patch, HBASE-5564_trunk.4_final.patch, HBASE-5564_trunk.patch Duplicate records are discarded when they exist in the same input file, and more specifically when they exist in the same split. Duplicate records are considered if the records are from different splits. Version under test: HBase 0.92
[jira] [Reopened] (HBASE-5592) Make it easier to get a table from shell
[ https://issues.apache.org/jira/browse/HBASE-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5592: --- The reversion didn't happen, right? This would be in conflict with Jesse's current work. Make it easier to get a table from shell Key: HBASE-5592 URL: https://issues.apache.org/jira/browse/HBASE-5592 Project: HBase Issue Type: Improvement Components: shell Affects Versions: 0.94.0 Reporter: Ben West Assignee: Ben West Priority: Trivial Labels: shell Fix For: 0.96.0 Attachments: publicTable.patch The one-argument constructor of HTable was removed at some point, which means that you now have to pass in a Configuration to instantiate an HTable. This is annoying for me when I create quick scripts. This JIRA is a tiny patch which lets you get an HTable instance in the shell by doing {code}foo_table = @shell.hbase_table('foo').table{code} Basically, it changes table to be a public member rather than a private one.
[jira] [Reopened] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss
[ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5568: --- Patch for 0.92 hasn't been integrated. Multi concurrent flushcache() for one region could cause data loss -- Key: HBASE-5568 URL: https://issues.apache.org/jira/browse/HBASE-5568 Project: HBase Issue Type: Bug Components: regionserver Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0 Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch HRegion#flushcache() can now be called concurrently, through HRegionServer#splitRegion or through HRegionServer#flushRegion invoked by HBaseAdmin. However, we found that if HRegion#internalFlushcache() is called concurrently by multiple threads, HRegion.memstoreSize is calculated incorrectly. At the end of HRegion#internalFlushcache() we call this.addAndGetGlobalMemstoreSize(-flushsize), but flushsize may not be the actual memstore size that was flushed to HDFS. This can make HRegion.memstoreSize negative and prevent the next flush when we close the region.
Logs in RS for region e9d827913a056e696c39bc569ea3
2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 128.0m
2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 134.8m
2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f., current region memstore size 6.8m
2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569ea3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a056e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true
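The accounting problem reported in HBASE-5568 can be replayed deterministically with the sketch below. It is a simplified model, not HBase code: `FlushAccountingDemo` and its methods are hypothetical, sizes are plain megabyte counts loosely based on the 128.0m and 134.8m values in the log, and the interleaving is written out sequentially instead of using real threads.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of the memstore size counter; not HBase code. */
final class FlushAccountingDemo {
    private FlushAccountingDemo() {}

    /** Flush B reads the counter before flush A has subtracted its share. */
    static long overlappingFlushes(long initialSize, long writesDuringFlushA) {
        AtomicLong memstoreSize = new AtomicLong(initialSize);
        long flushSizeA = memstoreSize.get();          // flush A starts
        memstoreSize.addAndGet(writesDuringFlushA);    // new writes arrive
        long flushSizeB = memstoreSize.get();          // flush B starts while A is still running
        memstoreSize.addAndGet(-flushSizeA);           // flush A finishes
        return memstoreSize.addAndGet(-flushSizeB);    // flush B finishes and subtracts too much
    }

    /** Same workload, but flush B starts only after flush A has finished. */
    static long serializedFlushes(long initialSize, long writesDuringFlushA) {
        AtomicLong memstoreSize = new AtomicLong(initialSize);
        long flushSizeA = memstoreSize.get();          // flush A starts
        memstoreSize.addAndGet(writesDuringFlushA);    // new writes arrive
        memstoreSize.addAndGet(-flushSizeA);           // flush A finishes
        long flushSizeB = memstoreSize.get();          // flush B sees only the remaining data
        return memstoreSize.addAndGet(-flushSizeB);    // flush B finishes
    }
}
```

With an initial size of 128 and 7 units written during the first flush, the overlapping order ends at -128 while the serialized order ends at 0, which matches the negative memstoreSize described in the report.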
[jira] [Reopened] (HBASE-5399) Cut the link between the client and the zookeeper ensemble
[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5399: --- TestZooKeeper.testClientSessionExpired fails on Hadoop QA after this patch went in. Cut the link between the client and the zookeeper ensemble -- Key: HBASE-5399 URL: https://issues.apache.org/jira/browse/HBASE-5399 Project: HBase Issue Type: Improvement Components: client Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Priority: Minor Fix For: 0.96.0 Attachments: 5399.v27.patch, 5399.v38.patch, 5399.v39.patch, 5399.v40.patch, 5399.v41.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399.v42.patch, 5399_inprogress.patch, 5399_inprogress.v14.patch, 5399_inprogress.v16.patch, 5399_inprogress.v18.patch, 5399_inprogress.v20.patch, 5399_inprogress.v21.patch, 5399_inprogress.v23.patch, 5399_inprogress.v3.patch, 5399_inprogress.v32.patch, 5399_inprogress.v9.patch, nochange.patch The link is often considered as an issue, for various reasons. One of them being that there is a limit on the number of connection that ZK can manage. Stack was suggesting as well to remove the link to master from HConnection. There are choices to be made considering the existing API (that we don't want to break). The first patches I will submit on hadoop-qa should not be committed: they are here to show the progress on the direction taken. ZooKeeper is used for: - public getter, to let the client do whatever he wants, and close ZooKeeper when closing the connection = we have to deprecate this but keep it. - read get master address to create a master = now done with a temporary zookeeper connection - read root location = now done with a temporary zookeeper connection, but questionable. Used in public function locateRegion. To be reworked. - read cluster id = now done once with a temporary zookeeper connection. 
- check if base node is available = now done once with a zookeeper connection given as a parameter - isTableDisabled/isTableAvailable = public functions, now done with a temporary zookeeper connection. - Called internally from HBaseAdmin and HTable - getCurrentNrHRS(): public function to get the number of region servers and create a pool of threads = now done with a temporary zookeeper connection - Master is used for: - getMaster public getter, as for ZooKeeper = we have to deprecate this but keep it. - isMasterRunning(): public function, used internally by HMerge and HBaseAdmin - getHTableDescriptor*: public functions offering access to the master. = we could make them use a temporary master connection as well. Main points are: - hbase class for ZooKeeper; ZooKeeperWatcher is really designed for a strongly coupled architecture ;-). This can be changed, but requires a lot of modifications in these classes (likely adding a class in the middle of the hierarchy, something like that). Anyway, a non-connected client will always be much slower, because it's a tcp connection, and establishing a tcp connection is slow. - having a link between ZK and all the clients seems to make sense for some use cases. However, it won't scale if a TCP connection is required for every client - if we move the table descriptor part away from the client, we need to find a new place for it. - we will have the same issue in HBaseAdmin (for both ZK and Master); maybe we can put a timeout on the connection. That would make the whole system less deterministic, however.
[jira] [Reopened] (HBASE-5473) Metrics does not push pread time
[ https://issues.apache.org/jira/browse/HBASE-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5473: --- 0.92 build #305 failed due to compilation error: {code} [ERROR] https://builds.apache.org/job/HBase-0.92/ws/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java:[313,10] cannot find symbol [ERROR] symbol : variable fsPreadLatency [ERROR] location: class org.apache.hadoop.hbase.regionserver.metrics.RegionServerMetrics {code} Metrics does not push pread time Key: HBASE-5473 URL: https://issues.apache.org/jira/browse/HBASE-5473 Project: HBase Issue Type: Bug Components: metrics Reporter: dhruba borthakur Assignee: dhruba borthakur Priority: Minor Fix For: 0.92.1, 0.94.0 Attachments: D1947.1.patch, D1947.1.patch, D1947.1.patch, D1947.patch The RegionServerMetrics is not pushing the pread times to the MetricsRecord
[jira] [Reopened] (HBASE-5318) Support Eclipse Indigo
[ https://issues.apache.org/jira/browse/HBASE-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5318: --- After reverting this patch, TestInfoServers passed. Support Eclipse Indigo --- Key: HBASE-5318 URL: https://issues.apache.org/jira/browse/HBASE-5318 Project: HBase Issue Type: Improvement Components: build Affects Versions: 0.94.0 Environment: Eclipse Indigo (1.4.1) which includes m2eclipse (1.0 SR1). Reporter: Jesse Yates Assignee: Jesse Yates Priority: Minor Labels: maven Attachments: mvn_HBASE-5318_r0.patch The current 'standard' release of Eclipse (Indigo) comes with m2eclipse installed. However, as of m2e v1.0, interesting lifecycle phases are handled via a 'connector', and several of the plugins we use don't support connectors. This means that Eclipse bails out and won't build the project or view it as 'working' even though it builds just fine from the command line. Since Eclipse is one of the major Java IDEs and Indigo has been around for a while, we should make it easy for new devs to pick up the code and for older devs to upgrade painlessly. The original build should not be modified in any significant way.
[jira] [Reopened] (HBASE-5235) HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions.
[ https://issues.apache.org/jira/browse/HBASE-5235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5235: --- Patch should be integrated to the 0.92 branch as well. HLogSplitter writer thread's streams not getting closed when any of the writer threads has exceptions. -- Key: HBASE-5235 URL: https://issues.apache.org/jira/browse/HBASE-5235 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.90.5 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.92.1, 0.90.6 Attachments: HBASE-5235_0.90.patch, HBASE-5235_0.90_1.patch, HBASE-5235_0.90_2.patch, HBASE-5235_trunk.patch
Please find the analysis below. Correct me if I am wrong.
{code}
2012-01-15 05:14:02,374 FATAL org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: WriterThread-9 Got while writing log entry to log
java.io.IOException: All datanodes 10.18.40.200:50010 are bad. Aborting...
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3373)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2811)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3026)
{code}
Here we have an exception in one of the writer threads. If there is any exception, we try to hold it in an atomic variable:
{code}
private void writerThreadError(Throwable t) {
  thrown.compareAndSet(null, t);
}
{code}
In the finally block of splitLog we try to close the streams.
{code}
for (WriterThread t: writerThreads) {
  try {
    t.join();
  } catch (InterruptedException ie) {
    throw new IOException(ie);
  }
  checkForErrors();
}
LOG.info("Split writers finished");
return closeStreams();
{code}
Inside checkForErrors:
{code}
private void checkForErrors() throws IOException {
  Throwable thrown = this.thrown.get();
  if (thrown == null) return;
  if (thrown instanceof IOException) {
    throw (IOException)thrown;
  } else {
    throw new RuntimeException(thrown);
  }
}
{code}
So once we throw the exception, closeStreams() is never reached and the DFSStreamer threads are not getting closed.
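One way to make sure the streams are closed even when a writer error is rethrown is to do the closing in a finally block around the error check. The sketch below shows that shape only; it is not the attached patch. `SplitCloseSketch`, `finish`, and `TrackingStream` are hypothetical names, and swallowing exceptions from `close()` is a simplification that real code would probably replace with logging.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for an output stream; records whether close() ran. */
final class TrackingStream implements Closeable {
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical sketch of the error-check-then-close sequence; not HLogSplitter itself. */
final class SplitCloseSketch {
    private final AtomicReference<Throwable> thrown = new AtomicReference<>();

    /** Remembers the first error reported by any writer thread. */
    void writerThreadError(Throwable t) {
        thrown.compareAndSet(null, t);
    }

    private void checkForErrors() throws IOException {
        Throwable t = thrown.get();
        if (t == null) {
            return;
        }
        if (t instanceof IOException) {
            throw (IOException) t;
        }
        throw new RuntimeException(t);
    }

    /** Rethrows the first writer error, but only after every stream has been closed. */
    void finish(List<? extends Closeable> streams) throws IOException {
        try {
            checkForErrors();
        } finally {
            for (Closeable stream : streams) {
                try {
                    stream.close();
                } catch (IOException e) {
                    // Simplification: keep closing the remaining streams and ignore this failure.
                }
            }
        }
    }
}
```

The design choice here is that the writer error stays the primary failure seen by the caller, while cleanup no longer depends on the error check succeeding.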
[jira] [Reopened] (HBASE-5099) ZK event thread waiting for root region assignment may block server shutdown handler for the region sever the root region was on
[ https://issues.apache.org/jira/browse/HBASE-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihong Yu reopened HBASE-5099: --- 0.92 Jenkins builds have failed 4 times in a row. TestReplication#queueFailover failed in builds 217 and 218. It failed consistently on MacBook as well. Rolling back the patches. ZK event thread waiting for root region assignment may block server shutdown handler for the region sever the root region was on Key: HBASE-5099 URL: https://issues.apache.org/jira/browse/HBASE-5099 Project: HBase Issue Type: Bug Affects Versions: 0.92.0, 0.94.0 Reporter: Jimmy Xiang Assignee: Jimmy Xiang Fix For: 0.92.0, 0.94.0 Attachments: 5099.92, ZK-event-thread-waiting-for-root.png, distributed-log-splitting-hangs.png, hbase-5099-v2.patch, hbase-5099-v3.patch, hbase-5099-v4.patch, hbase-5099-v5.patch, hbase-5099-v6.patch, hbase-5099.patch An RS died. The ServerShutdownHandler kicked in and started the log splitting. SplitLogManager installed the tasks asynchronously, then started to wait for them to complete. The task znodes were not actually created; the requests were just queued. At this time, the zookeeper connection expired. HMaster tried to recover the expired ZK session. During the recovery, a new zookeeper connection was created. However, this master became the new master again. It tried to assign root and meta. Because the dead RS had the old root region, the master needs to wait for the log splitting to complete. This waiting holds the zookeeper event thread. So the async create split task is never retried, since there is only one event thread and it is waiting for the root region to be assigned.
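The hang described in HBASE-5099 is an instance of a general pattern: a task running on a single event thread waits for work that can only run on that same thread. The sketch below reproduces the pattern with a plain `ExecutorService`; it does not use ZooKeeper or HBase classes, and `EventThreadDemo` and `waiterCompletes` are hypothetical names. The timeout exists only so the demo terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of a waiter blocking the only event thread. */
final class EventThreadDemo {
    private EventThreadDemo() {}

    /**
     * Queues a task that waits for "log splitting" and then the callback that
     * would finish it. Returns whether the waiter saw the callback complete
     * within waitMillis.
     */
    static boolean waiterCompletes(int eventThreads, long waitMillis) {
        ExecutorService events = Executors.newFixedThreadPool(eventThreads);
        CountDownLatch splitDone = new CountDownLatch(1);
        try {
            // The waiter occupies an event thread until the latch opens or the timeout expires.
            Future<Boolean> waiter =
                events.submit(() -> splitDone.await(waitMillis, TimeUnit.MILLISECONDS));
            // The callback is queued on the same pool; with one thread it sits behind the waiter.
            events.execute(splitDone::countDown);
            return waiter.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            events.shutdownNow();
        }
    }
}
```

With one event thread the waiter times out because the callback never gets to run; with two threads the callback runs concurrently and the waiter completes.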
[jira] [Reopened] (HBASE-5021) Enforce upper bound on timestamp
[ https://issues.apache.org/jira/browse/HBASE-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu reopened HBASE-5021:
---
TestHeapSize fails on Jenkins.

Enforce upper bound on timestamp

Key: HBASE-5021
URL: https://issues.apache.org/jira/browse/HBASE-5021
Project: HBase
Issue Type: Improvement
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
Priority: Critical
Fix For: 0.94.0
Attachments: D849.1.patch, D849.2.patch, D849.3.patch, HBASE-5021-trunk.patch

We have been getting hit with performance problems on our time-series database due to invalid timestamps being inserted. We are working on adding proper checks to the app server, but production performance could be severely impacted, with significant recovery time, if something slips past. Since timestamps are considered a fundamental part of the HBase schema and multiple optimizations use timestamp information, we should allow the option to sanity-check the upper bound on the server side in HBase.
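As a rough illustration of the kind of server-side sanity check proposed here, the sketch below rejects a cell timestamp that lies further in the future than an allowed slop. It is not the attached patch: the class name `TimestampBound`, the method `withinUpperBound`, and the `maxSlopMs` parameter are hypothetical, and both `nowMs` and `maxSlopMs` are assumed to be non-negative.

```java
public class TimestampBound {

    // Returns true if cellTs is no later than nowMs + maxSlopMs.
    // Assumes nowMs >= 0 and maxSlopMs >= 0 (illustrative helper, not HBase API).
    static boolean withinUpperBound(long cellTs, long nowMs, long maxSlopMs) {
        // Guard against overflow when the slop is very large, e.g. Long.MAX_VALUE
        // used to mean "no upper bound".
        long limit = (maxSlopMs > Long.MAX_VALUE - nowMs)
                ? Long.MAX_VALUE
                : nowMs + maxSlopMs;
        return cellTs <= limit;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long oneHourMs = 60L * 60L * 1000L;
        // A timestamp one day ahead fails a one-hour slop check.
        System.out.println(withinUpperBound(now + 24 * oneHourMs, now, oneHourMs));
        // A timestamp in the past passes.
        System.out.println(withinUpperBound(now - 1, now, oneHourMs));
    }
}
```

Making the slop configurable, with a very large default that disables the check, would keep the behavior optional, as the description asks.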
[jira] [Reopened] (HBASE-5029) TestDistributedLogSplitting fails on occasion
[ https://issues.apache.org/jira/browse/HBASE-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu reopened HBASE-5029:
---
The test failed again in build #193.

TestDistributedLogSplitting fails on occasion

Key: HBASE-5029
URL: https://issues.apache.org/jira/browse/HBASE-5029
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: Prakash Khemani
Fix For: 0.92.0
Attachments: 0001-HBASE-5029-jira-TestDistributedLogSplitting-fails-on.patch, HBASE-5029.D891.1.patch, HBASE-5029.D891.2.patch

This is how it usually fails: https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/lastCompletedBuild/testReport/org.apache.hadoop.hbase.master/TestDistributedLogSplitting/testWorkerAbort/

Assigning mighty Prakash since he offered to take a looksee.