[jira] [Commented] (HBASE-9874) Append and Increment operation drops Tags
[ https://issues.apache.org/jira/browse/HBASE-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812277#comment-13812277 ]

Hadoop QA commented on HBASE-9874:
----------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12611806/HBASE-9874_V3.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified tests.

    {color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 1.0 profile.

    {color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 2.0 profile.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 findbugs{color}.  The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100.

    {color:red}-1 site{color}.  The patch appears to cause the mvn site goal to fail.

    {color:green}+1 core tests{color}.  The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7719//console

This message is automatically generated.
Append and Increment operation drops Tags
-----------------------------------------

Key: HBASE-9874
URL: https://issues.apache.org/jira/browse/HBASE-9874
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Fix For: 0.98.0
Attachments: AccessController.postMutationBeforeWAL.txt, HBASE-9874.patch, HBASE-9874_V2.patch, HBASE-9874_V3.patch

We should consider tags in the existing cells as well as tags coming in the cells within Increment/Append.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812292#comment-13812292 ]

Hudson commented on HBASE-8942:
-------------------------------

FAILURE: Integrated in hbase-0.96 #178 (See [https://builds.apache.org/job/hbase-0.96/178/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (stack: rev 1538318)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


DFS errors during a read operation (get/scan), may cause write outliers
-----------------------------------------------------------------------

Key: HBASE-8942
URL: https://issues.apache.org/jira/browse/HBASE-8942
Project: HBase
Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
Fix For: 0.89-fb, 0.98.0, 0.96.1
Attachments: 8942.096.txt, HBase-8942.txt

This is a similar issue to the one discussed in HBASE-8228:

1) A scanner holds the Store.ReadLock() while opening the store files and encounters errors, so it takes a long time to finish.
2) A flush completes in the meantime. It needs the write lock to commit() and update the scanners, and hence ends up waiting.
3+) All Puts (and also Gets) to the CF, which need a read lock, have to wait for 1) and 2) to complete, blocking updates to the system for the duration of the DFS timeout.

Fix: Open store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner(), which calls this function through the StoreScanner constructor, redundantly grabs the readLock, causing the readLock to be held while the store files are being opened and seeked. We should get rid of the readLock() in Store.getScanner(); it is not required. The StoreScanner constructor calls getScanners(xxx, xxx, xxx), which already has the required locking.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
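The locking pattern behind the fix can be sketched as follows. This is an illustrative toy, not the actual HStore code: the class, the `String` stand-ins for store files, and the `"scanner:"` prefix are all hypothetical. The point it demonstrates is taking the read lock only to snapshot the file list, then doing the slow, failure-prone open/seek work with no lock held:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class StoreSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> storeFiles =
            new ArrayList<>(Arrays.asList("hfile-1", "hfile-2"));

    // Snapshot the current file list under the read lock (cheap), then open
    // and seek the files (slow; may hit DFS errors) with no lock held, so a
    // stalled DFS read cannot block a flush that is waiting on the write lock.
    public List<String> getScanners() {
        List<String> snapshot;
        lock.readLock().lock();
        try {
            snapshot = new ArrayList<>(storeFiles);
        } finally {
            lock.readLock().unlock();
        }
        List<String> scanners = new ArrayList<>();
        for (String f : snapshot) {
            scanners.add("scanner:" + f); // stands in for opening the store file
        }
        return scanners;
    }

    public static void main(String[] args) {
        System.out.println(new StoreSketch().getScanners());
    }
}
```

A flush taking the write lock can now only be delayed by the brief snapshot copy, never by the file opens themselves.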
[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache
[ https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812291#comment-13812291 ]

Hudson commented on HBASE-9855:
-------------------------------

FAILURE: Integrated in hbase-0.96 #178 (See [https://builds.apache.org/job/hbase-0.96/178/])
HBASE-9855 evictBlocksByHfileName improvement for bucket cache (stack: rev 1538319)
* /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ConcurrentIndex.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java


evictBlocksByHfileName improvement for bucket cache
---------------------------------------------------

Key: HBASE-9855
URL: https://issues.apache.org/jira/browse/HBASE-9855
Project: HBase
Issue Type: Improvement
Components: regionserver
Affects Versions: 0.98.0
Reporter: Liang Xie
Assignee: Liang Xie
Fix For: 0.98.0, 0.96.1
Attachments: HBase-9855-v4.txt

Indeed, this comes from FB's L2 cache, [~avf]'s nice work; I just did a simple backport here. It turns a linear-time search through the whole cache map into a log-access-time map search. I ran a small benchmark, which showed it brings a bit of GC overhead, but considering that evict-on-close is triggered by frequent compaction activity, that seems reasonable. I also thought about bringing an evictOnClose config into the BucketCache constructor and only updating the new index map while evictOnClose is true; this value could be set per family schema, but BucketCache is a global instance, not per-family, so I just ignore it right now.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
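The idea of the improvement can be sketched with a minimal secondary index. This is an assumption-laden toy, not the actual ConcurrentIndex/BucketCache code: the class name, the `String` file names, and the `Long` block offsets are hypothetical stand-ins. It shows why eviction by file name no longer needs a linear scan of the whole cache map:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class FileBlockIndex {
    // Secondary index: hfile name -> offsets of the blocks cached for that file.
    private final ConcurrentHashMap<String, Set<Long>> blocksByFile =
            new ConcurrentHashMap<>();

    // Record a cached block under its file's entry (created on first use).
    public void add(String hfileName, long offset) {
        blocksByFile.computeIfAbsent(hfileName, k -> ConcurrentHashMap.newKeySet())
                    .add(offset);
    }

    // Eviction looks up only this file's blocks instead of scanning every
    // entry in the cache; the real code would also drop each block from the
    // backing cache map, which is omitted here.
    public int evictBlocksByHfileName(String hfileName) {
        Set<Long> blocks = blocksByFile.remove(hfileName);
        return blocks == null ? 0 : blocks.size();
    }

    public static void main(String[] args) {
        FileBlockIndex idx = new FileBlockIndex();
        idx.add("hfile-a", 0L);
        idx.add("hfile-a", 65536L);
        idx.add("hfile-b", 0L);
        System.out.println(idx.evictBlocksByHfileName("hfile-a")); // 2
        System.out.println(idx.evictBlocksByHfileName("hfile-a")); // 0
    }
}
```

The GC overhead mentioned in the description comes from maintaining this extra map and its per-file sets alongside the cache itself.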
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812319#comment-13812319 ]

Hudson commented on HBASE-8942:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK #4665 (See [https://builds.apache.org/job/HBase-TRUNK/4665/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (stack: rev 1538317)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache
[ https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812318#comment-13812318 ]

Hudson commented on HBASE-9855:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK #4665 (See [https://builds.apache.org/job/HBase-TRUNK/4665/])
HBASE-9855 evictBlocksByHfileName improvement for bucket cache (stack: rev 1538320)
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ConcurrentIndex.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache
[ https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812339#comment-13812339 ]

Hudson commented on HBASE-9855:
-------------------------------

FAILURE: Integrated in hbase-0.96-hadoop2 #112 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/112/])
HBASE-9855 evictBlocksByHfileName improvement for bucket cache (stack: rev 1538319)
* /hbase/branches/0.96/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ConcurrentIndex.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812340#comment-13812340 ]

Hudson commented on HBASE-8942:
-------------------------------

FAILURE: Integrated in hbase-0.96-hadoop2 #112 (See [https://builds.apache.org/job/hbase-0.96-hadoop2/112/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (stack: rev 1538318)
* /hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-9855) evictBlocksByHfileName improvement for bucket cache
[ https://issues.apache.org/jira/browse/HBASE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812354#comment-13812354 ]

Hudson commented on HBASE-9855:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #824 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/824/])
HBASE-9855 evictBlocksByHfileName improvement for bucket cache (stack: rev 1538320)
* /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/util/ConcurrentIndex.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCacheKey.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812355#comment-13812355 ]

Hudson commented on HBASE-8942:
-------------------------------

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #824 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/824/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (stack: rev 1538317)
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0
[ https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812388#comment-13812388 ]

Eric Charles commented on HBASE-6581:
-------------------------------------

Any further progress on this one? The patch risks becoming obsolete over time. Thx.


Build with hadoop.profile=3.0
-----------------------------

Key: HBASE-6581
URL: https://issues.apache.org/jira/browse/HBASE-6581
Project: HBase
Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
Priority: Critical
Fix For: 0.98.0
Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, HBASE-6581-5.patch, HBASE-6581.diff, HBASE-6581.diff

Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to a change in the naming of the hadoop maven modules (and also the usage of 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common). I can provide a patch that moves most of the hadoop dependencies into their respective profiles and defines the correct hadoop deps in the 3.0 profile. Please tell me if it's ok to go this way. Thx, Eric

[1]
$ mvn clean install -Dhadoop.profile=3.0
[INFO] Scanning for projects...
[ERROR] The build could not read 3 projects - [Help 1]
[ERROR]
[ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
[ERROR]
[ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
[ERROR]
[ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
[ERROR]     'dependencies.dependency.version' for org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
[ERROR]

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Created] (HBASE-9880) client.TestAsyncProcess.testWithNoClearOnFail broke on 0.96 by HBASE-9867
stack created HBASE-9880:
-----------------------------

Summary: client.TestAsyncProcess.testWithNoClearOnFail broke on 0.96 by HBASE-9867
Key: HBASE-9880
URL: https://issues.apache.org/jira/browse/HBASE-9880
Project: HBase
Issue Type: Test
Reporter: stack
Assignee: stack

It looks like the backport of HBASE-9867 broke the 0.96 build (it is fine on trunk). This was my patch. Let me fix it.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-9870) HFileDataBlockEncoderImpl#diskToCacheFormat uses wrong format
[ https://issues.apache.org/jira/browse/HBASE-9870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812421#comment-13812421 ]

Jimmy Xiang commented on HBASE-9870:
------------------------------------

I tried to make sure the onDisk encoding is always the same as the inCache one, but that still gave me data loss. Perhaps I missed something. I will get rid of the inCache one and give it another try.


HFileDataBlockEncoderImpl#diskToCacheFormat uses wrong format
-------------------------------------------------------------

Key: HBASE-9870
URL: https://issues.apache.org/jira/browse/HBASE-9870
Project: HBase
Issue Type: Bug
Reporter: Jimmy Xiang

In this method, we have
{code}
    if (block.getBlockType() == BlockType.ENCODED_DATA) {
      if (block.getDataBlockEncodingId() == onDisk.getId()) {
        // The block is already in the desired in-cache encoding.
        return block;
      }
{code}
This assumes the onDisk encoding is the same as that of inCache, which is not true when we change the encoding of a CF. This could be one of the reasons I got data loss with online encoding change? If I make sure onDisk == inCache all the time, my ITBLL with online encoding change worked once for me.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
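The flaw in the snippet above can be reduced to a two-enum sketch. Everything here is hypothetical (the enum values and method are illustrative, not HBase's DataBlockEncoding API); it only shows why the "already converted" check must compare against the in-cache encoding rather than the on-disk one:

```java
public class EncodingCheck {
    // Illustrative stand-in for HBase's data-block encodings.
    enum BlockEncoding { NONE, PREFIX, DIFF, FAST_DIFF }

    // The quoted code compares the block's encoding id to the *on-disk*
    // setting. After an online encoding change, on-disk and in-cache can
    // differ, so the decision must be made against the in-cache encoding.
    static boolean alreadyInCacheFormat(BlockEncoding blockEncoding,
                                        BlockEncoding inCache) {
        return blockEncoding == inCache;
    }

    public static void main(String[] args) {
        // A CF written with FAST_DIFF whose schema was changed to PREFIX:
        BlockEncoding onDisk = BlockEncoding.FAST_DIFF;
        BlockEncoding inCache = BlockEncoding.PREFIX;
        // A FAST_DIFF block is NOT yet in the desired cache format, but a
        // check against onDisk would wrongly say it is and skip conversion.
        System.out.println(alreadyInCacheFormat(BlockEncoding.FAST_DIFF, inCache));
        System.out.println(BlockEncoding.FAST_DIFF == onDisk); // the buggy comparison
    }
}
```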
[jira] [Commented] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812425#comment-13812425 ]

Jimmy Xiang commented on HBASE-9863:
------------------------------------

For those private methods such as isTableAvailableAndInitialized() and getNamespaceTable(), can we remove synchronized and instead make sure the callers have proper synchronization?

For this one:
{code}
   public synchronized NamespaceDescriptor get(String name) throws IOException {
-    return get(getNamespaceTable(), name);
+    return zkNamespaceManager.get(name);
   }
{code}
The change is good. We still need to check if the manager is initialized, right?


Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
---------------------------------------------------------------------------------------

Key: HBASE-9863
URL: https://issues.apache.org/jira/browse/HBASE-9863
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Attachments: 9863-v1.txt, 9863-v2.txt, 9863-v3.txt

TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry sometimes hung. Here were two recent occurrences:
https://builds.apache.org/job/PreCommit-HBASE-Build/7676/console
https://builds.apache.org/job/PreCommit-HBASE-Build/7671/console

There were 9 occurrences of the following in both stack traces:
{code}
FifoRpcScheduler.handler1-thread-5 daemon prio=10 tid=0x09df8800 nid=0xc17 waiting for monitor entry [0x6fdf8000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:250)
        - waiting to lock 0x7f69b5f0 (a org.apache.hadoop.hbase.master.TableNamespaceManager)
        at org.apache.hadoop.hbase.master.HMaster.isTableNamespaceManagerReady(HMaster.java:3146)
        at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3105)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1743)
        at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1782)
        at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38221)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1983)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92)
{code}

The test hung here:
{code}
pool-1-thread-1 prio=10 tid=0x74f7b800 nid=0x5aa5 in Object.wait() [0x74efe000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
        at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1436)
        - locked 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call)
        at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654)
        at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712)
        at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.createTable(MasterProtos.java:40372)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.createTable(HConnectionManager.java:1931)
        at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:598)
        at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:594)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116)
        - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
        at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94)
        - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller)
        at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3124)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:594)
        at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:485)
        at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:486)
{code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812433#comment-13812433 ]

Ted Yu commented on HBASE-9863:
-------------------------------

bq. can we remove synchronized, instead make sure the callers have proper synchronization?

isTableAvailableAndInitialized() is called by start(), which is not synchronized. Doing the above would equate to adding synchronized(this) in start() before calling isTableAvailableAndInitialized(). Is that what you meant?

For 'NamespaceDescriptor get(String name)', the semantics are the same as the original method.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
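The point discussed in the review, that get() may delegate to the zk-backed cache but must still guard on initialization, can be sketched as follows. This is an illustrative toy, not the actual TableNamespaceManager code: the class name, the String-valued cache standing in for zkNamespaceManager, and the `initialized` flag are all hypothetical:

```java
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

public class NamespaceManagerSketch {
    // Stand-in for the zk-backed namespace cache (zkNamespaceManager).
    private final ConcurrentHashMap<String, String> zkCache = new ConcurrentHashMap<>();
    private volatile boolean initialized;

    // Stand-in for start(): populate the cache, then mark the manager ready.
    public void start() {
        zkCache.put("default", "NamespaceDescriptor{default}");
        initialized = true;
    }

    // Even after delegating to the zk-backed cache, get() still checks that
    // the manager is initialized, preserving the original method's semantics.
    public String get(String name) throws IOException {
        if (!initialized) {
            throw new IOException("namespace manager not initialized");
        }
        return zkCache.get(name);
    }

    public static void main(String[] args) throws IOException {
        NamespaceManagerSketch m = new NamespaceManagerSketch();
        m.start();
        System.out.println(m.get("default"));
    }
}
```

Without the `initialized` guard, a caller racing ahead of start() would read from an empty cache instead of failing fast.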
[jira] [Resolved] (HBASE-9420) Math.max() on syncedTillHere lacks synchronization
[ https://issues.apache.org/jira/browse/HBASE-9420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved HBASE-9420.
---------------------------
    Resolution: Later

It is better to pursue the solution in HBASE-8755.


Math.max() on syncedTillHere lacks synchronization
--------------------------------------------------

Key: HBASE-9420
URL: https://issues.apache.org/jira/browse/HBASE-9420
Project: HBase
Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
Fix For: 0.98.0
Attachments: 9420-v1.txt, 9420-v2.txt

In FSHLog#syncer(), around line 1080:
{code}
    this.syncedTillHere = Math.max(this.syncedTillHere, doneUpto);
{code}
The assignment to syncedTillHere after computing the max value is not protected by proper synchronization.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
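The unsynchronized read-compute-write above is the classic lost-update race: two threads can both read the old value, and the smaller result can overwrite the larger one. One common lock-free fix (a sketch of the general technique, not what HBASE-8755 actually does; class and method names are hypothetical) is an AtomicLong advanced with a compare-and-set retry loop:

```java
import java.util.concurrent.atomic.AtomicLong;

public class SyncPoint {
    private final AtomicLong syncedTillHere = new AtomicLong();

    // Lock-free monotonic max: retry the CAS until either our value is
    // published or another thread has already advanced past it.
    public void advance(long doneUpto) {
        long cur = syncedTillHere.get();
        while (doneUpto > cur && !syncedTillHere.compareAndSet(cur, doneUpto)) {
            cur = syncedTillHere.get();
        }
    }

    public long get() {
        return syncedTillHere.get();
    }

    public static void main(String[] args) {
        SyncPoint p = new SyncPoint();
        p.advance(5);
        p.advance(3); // a stale writer loses; the value never goes backwards
        p.advance(7);
        System.out.println(p.get()); // 7
    }
}
```

The CAS loop guarantees the stored value only ever increases, which is exactly the invariant `Math.max` was meant to preserve.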
[jira] [Commented] (HBASE-9188) TestHBaseFsck#testNotInMetaOrDeployedHole occasionally fails
[ https://issues.apache.org/jira/browse/HBASE-9188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812451#comment-13812451 ] Ted Yu commented on HBASE-9188: --- There has been some fixes w.r.t. TestHBaseFsck. This test hasn't failed for a while. TestHBaseFsck#testNotInMetaOrDeployedHole occasionally fails Key: HBASE-9188 URL: https://issues.apache.org/jira/browse/HBASE-9188 Project: HBase Issue Type: Bug Reporter: Ted Yu From https://builds.apache.org/job/hbase-0.95-on-hadoop2/231/testReport/org.apache.hadoop.hbase.util/TestHBaseFsck/testNotInMetaOrDeployedHole/ (region tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153): expected:[NOT_IN_META_OR_DEPLOYED, HOLE_IN_REGION_CHAIN] but was:[NOT_IN_META_OR_DEPLOYED, NOT_DEPLOYED, HOLE_IN_REGION_CHAIN] Here is snippet of test output: {code} 2013-08-10 11:53:16,941 DEBUG [RS_CLOSE_REGION-vesta:38578-1] handler.CloseRegionHandler(168): set region closed state in zk successfully for region tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153. sn name: vesta.apache.org,38578,1376135290018 2013-08-10 11:53:16,941 DEBUG [RS_CLOSE_REGION-vesta:38578-1] handler.CloseRegionHandler(177): Closed region tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153. 2013-08-10 11:53:16,942 DEBUG [AM.ZK.Worker-pool-2-thread-13] master.AssignmentManager(782): Handling transition=RS_ZK_REGION_CLOSED, server=vesta.apache.org,38578,1376135290018, region=3ec6178a369a899c007fd89807b37153, current state from region state map ={3ec6178a369a899c007fd89807b37153 state=PENDING_CLOSE, ts=1376135596730, server=vesta.apache.org,38578,1376135290018} 2013-08-10 11:53:16,942 WARN [AM.ZK.Worker-pool-2-thread-13] master.RegionStates(245): Closed region 3ec6178a369a899c007fd89807b37153 still on vesta.apache.org,38578,1376135290018? 
Ignored, reset it to null 2013-08-10 11:53:16,942 INFO [AM.ZK.Worker-pool-2-thread-13] master.RegionStates(260): Transitioned from {3ec6178a369a899c007fd89807b37153 state=PENDING_CLOSE, ts=1376135596730, server=vesta.apache.org,38578,1376135290018} to {3ec6178a369a899c007fd89807b37153 state=CLOSED, ts=1376135596942, server=null} 2013-08-10 11:53:16,942 DEBUG [AM.ZK.Worker-pool-2-thread-13] handler.ClosedRegionHandler(92): Handling CLOSED event for 3ec6178a369a899c007fd89807b37153 2013-08-10 11:53:16,942 DEBUG [AM.ZK.Worker-pool-2-thread-13] master.AssignmentManager(1462): Table being disabled so deleting ZK node and removing from regions in transition, skipping assignment of region tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153. ... 2013-08-10 11:53:17,319 INFO [pool-1-thread-1] hbase.HBaseTestingUtility(1815): getMetaTableRows: row - tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153.{ENCODED = 3ec6178a369a899c007fd89807b37153, NAME = 'tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153.', STARTKEY = 'B', ENDKEY = 'C'} 2013-08-10 11:53:17,320 INFO [pool-1-thread-1] hbase.HBaseTestingUtility(1815): getMetaTableRows: row - tableNotInMetaOrDeployedHole,C,1376135595424.c2ae2bddbe9302c4344c13936248ac9d.{ENCODED = c2ae2bddbe9302c4344c13936248ac9d, NAME = 'tableNotInMetaOrDeployedHole,C,1376135595424.c2ae2bddbe9302c4344c13936248ac9d.', STARTKEY = 'C', ENDKEY = ''} 2013-08-10 11:53:17,320 INFO [pool-1-thread-1] util.TestHBaseFsck(231): tableNotInMetaOrDeployedHole,,1376135595423.9df585f7f666e1cd55d7b875aae22ece. 2013-08-10 11:53:17,320 INFO [pool-1-thread-1] util.TestHBaseFsck(231): tableNotInMetaOrDeployedHole,A,1376135595424.90a7d5f2211951d321c9f29f4059671f. 2013-08-10 11:53:17,320 INFO [pool-1-thread-1] util.TestHBaseFsck(231): tableNotInMetaOrDeployedHole,B,1376135595424.3ec6178a369a899c007fd89807b37153. 
2013-08-10 11:53:17,320 INFO [pool-1-thread-1] util.TestHBaseFsck(231): tableNotInMetaOrDeployedHole,C,1376135595424.c2ae2bddbe9302c4344c13936248ac9d. 2013-08-10 11:53:17,326 DEBUG [pool-1-thread-1] client.ClientScanner(218): Finished region={ENCODED = 1588230740, NAME = 'hbase:meta,,1', STARTKEY = '', ENDKEY = ''} 2013-08-10 11:53:17,327 INFO [pool-1-thread-1] util.TestHBaseFsck(319): {ENCODED = 9df585f7f666e1cd55d7b875aae22ece, NAME = 'tableNotInMetaOrDeployedHole,,1376135595423.9df585f7f666e1cd55d7b875aae22ece.', STARTKEY = '', ENDKEY = 'A'}vesta.apache.org,41438,1376135289941 2013-08-10 11:53:17,328 INFO [pool-1-thread-1] util.TestHBaseFsck(319): {ENCODED = 90a7d5f2211951d321c9f29f4059671f, NAME = 'tableNotInMetaOrDeployedHole,A,1376135595424.90a7d5f2211951d321c9f29f4059671f.', STARTKEY = 'A', ENDKEY = 'B'}vesta.apache.org,38578,1376135290018 2013-08-10 11:53:17,328 INFO
[jira] [Commented] (HBASE-8741) Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs)
[ https://issues.apache.org/jira/browse/HBASE-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812477#comment-13812477 ] Himanshu Vashishtha commented on HBASE-8741: Below are some results I got while testing HLogPE, with thread counts varying from 1 to 5 on a 5 node (1 NN, 4 DN) cluster. IMO, the results are mixed and the perf hit is almost negligible.
{code}
for i in `seq 1 5` ; do for j in 1 2 3; do /home/himanshu/hbase-0.97.0-SNAPSHOT/bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLogPerformanceEvaluation -verify -threads ${i} -iterations 100 -nocleanup -keySize 50 -valueSize 100; done; done
{code}
||Threads||w/o patch time||w/o patch ops||w/ patch time||w/ patch ops||
|1|530.334s|1885.604ops/s|519.382s|1925.365ops/s|
|1|531.314s|1882.126ops/s|524.750s|1905.669ops/s|
|1|529.636s|1888.089ops/s|537.218s|1861.442ops/s|
|2|796.771s|2510.132ops/s|786.245s|2543.736ops/s|
|2|811.930s|2463.267ops/s|818.789s|2442.632ops/s|
|2|805.139s|2484.043ops/s|792.434s|2523.869ops/s|
|3|948.641s|3162.419ops/s|938.286s|3197.319ops/s|
|3|968.503s|3097.564ops/s|955.333s|3140.266ops/s|
|3|970.692s|3090.579ops/s|949.411s|3159.854ops/s|
|4|648.943s|6163.870ops/s|646.279s|6189.277ops/s|
|4|658.654s|6072.991ops/s|656.277s|6094.987ops/s|
|4|634.568s|6303.501ops/s|669.986s|5970.274ops/s|
|5|722.867s|6916.902ops/s|730.954s|6840.376ops/s|
|5|731.401s|6836.195ops/s|725.907s|6887.935ops/s|
|5|723.812s|6907.871ops/s|718.261s|6961.258ops/s|
Scope sequenceid to the region rather than regionserver (WAS: Mutations on Regions in recovery mode might have same sequenceIDs) Key: HBASE-8741 URL: https://issues.apache.org/jira/browse/HBASE-8741 Project: HBase Issue Type: Bug Components: MTTR Affects Versions: 0.95.1 Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Fix For: 0.98.0 Attachments: HBASE-8741-trunk-v6.1-rebased.patch, HBASE-8741-trunk-v6.2.1.patch, HBASE-8741-trunk-v6.2.2.patch, HBASE-8741-trunk-v6.2.2.patch,
HBASE-8741-trunk-v6.3.patch, HBASE-8741-trunk-v6.patch, HBASE-8741-v0.patch, HBASE-8741-v2.patch, HBASE-8741-v3.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4-again.patch, HBASE-8741-v4.patch, HBASE-8741-v5-again.patch, HBASE-8741-v5.patch Currently, when opening a region, we find the maximum sequence ID from all its HFiles and then set the LogSequenceId of the log (in case the latter is at a smaller value). This works well in the recovered.edits case, as we are not writing to the region until we have replayed all of its previous edits. With distributed log replay, if we want to enable writes while a region is under recovery, we need to make sure that the logSequenceId is greater than the maximum logSequenceId of the old regionserver. Otherwise, we might have a situation where new edits have the same (or smaller) sequenceIds. If we store region level information in the WALTrailer, then this scenario could be avoided by: a) reading the trailer of the last completed file, i.e., the last wal file which has a trailer and, b) completely reading the last wal file (this file would not have the trailer, so it needs to be read completely). In future, if we switch to multiple WAL files, we could read the trailers of all completed WAL files, and read the remaining incomplete files. -- This message was sent by Atlassian JIRA (v6.1#6144)
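The per-region scoping in the issue title can be illustrated with a small sketch (class and method names are hypothetical, not the actual patch): each region keeps its own monotonically increasing counter, seeded with the highest sequence id recovered for that region, so new mutations accepted during recovery can never collide with replayed edits:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch only: one monotonically increasing sequence id per
// region rather than one per regionserver.
class RegionSequenceIds {
    private final ConcurrentHashMap<String, AtomicLong> byRegion = new ConcurrentHashMap<>();

    // Seed (or raise) a region's counter, e.g. from the max id found in its
    // HFiles or a WAL trailer during recovery. Never lowers the counter.
    void seed(String encodedRegionName, long maxSeenId) {
        byRegion.computeIfAbsent(encodedRegionName, k -> new AtomicLong())
                .accumulateAndGet(maxSeenId, Math::max);
    }

    // Next sequence id for a mutation on this region; strictly greater than
    // anything already seeded, so replayed and new edits cannot collide.
    long next(String encodedRegionName) {
        return byRegion.computeIfAbsent(encodedRegionName, k -> new AtomicLong())
                       .incrementAndGet();
    }
}
```

Scoping the counter to the region (instead of the whole regionserver) is what lets writes proceed on a recovering region while other regions replay independently.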
[jira] [Commented] (HBASE-9865) WALEdit.heapSize() is incorrect in certain replication scenarios which may cause RegionServers to go OOM
[ https://issues.apache.org/jira/browse/HBASE-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812509#comment-13812509 ] Lars Hofhansl commented on HBASE-9865: -- This is not quite right in the partial read failure case, yet (a log was partially read and is then found corrupted). WALEdit.heapSize() is incorrect in certain replication scenarios which may cause RegionServers to go OOM Key: HBASE-9865 URL: https://issues.apache.org/jira/browse/HBASE-9865 Project: HBase Issue Type: Bug Affects Versions: 0.94.5, 0.95.0 Reporter: churro morales Assignee: Lars Hofhansl Attachments: 9865-0.94-v2.txt, 9865-sample-1.txt, 9865-sample.txt, 9865-trunk-v2.txt, 9865-trunk.txt WALEdit.heapSize() is incorrect in certain replication scenarios which may cause RegionServers to go OOM. A little background on this issue. We noticed that our source replication regionservers would get into GC storms and sometimes even OOM. We noticed a case where there were around 25k WALEdits to replicate, each one with an ArrayList of KeyValues. The array list had a capacity of around 90k (using 350KB of heap memory) but had around 6 non-null entries. When ReplicationSource.readAllEntriesToReplicateOrNextFile() gets a WALEdit, it removes all KVs that are scoped other than local. But in doing so we don't account for the capacity of the ArrayList when determining heapSize for a WALEdit. The logic for shipping a batch is whether you have hit a size capacity or number of entries capacity. Therefore if we have a WALEdit with 25k entries and suppose all are removed: the size of the ArrayList is 0 (we don't even count the collection's heap size currently) but the capacity is ignored. This will yield a heapSize() of 0 bytes while in the best case it would be at least 10 bytes (provided you pass initialCapacity and you have a 32 bit JVM). I have some ideas on how to address this problem and want to know everyone's thoughts: 1.
We use a probabilistic counter such as HyperLogLog and create something like: * class CapacityEstimateArrayList extends ArrayList ** this class overrides all additive methods to update the probabilistic counts ** it includes one additional method called estimateCapacity (we would take estimateCapacity - size() and fill in sizes for all references) * Then we can do something like this in WALEdit.heapSize:
{code}
public long heapSize() {
  long ret = ClassSize.ARRAYLIST;
  for (KeyValue kv : kvs) {
    ret += kv.heapSize();
  }
  long nullEntriesEstimate = kvs.getCapacityEstimate() - kvs.size();
  ret += ClassSize.align(nullEntriesEstimate * ClassSize.REFERENCE);
  if (scopes != null) {
    ret += ClassSize.TREEMAP;
    ret += ClassSize.align(scopes.size() * ClassSize.MAP_ENTRY); // TODO this isn't quite right, need help here
  }
  return ret;
}
{code}
2. In ReplicationSource.removeNonReplicableEdits() we know the size of the array originally, and we provide some percentage threshold. When that threshold is met (50% of the entries have been removed) we can call kvs.trimToSize(). 3. In the heapSize() method for WALEdit we could use reflection (please don't shoot me for this) to grab the actual capacity of the list, doing something like this:
{code}
public int getArrayListCapacity() {
  try {
    Field f = ArrayList.class.getDeclaredField("elementData");
    f.setAccessible(true);
    return ((Object[]) f.get(kvs)).length;
  } catch (Exception e) {
    log.warn("Exception in trying to get capacity on ArrayList", e);
    return kvs.size();
  }
}
{code}
I am partial to (1), using HyperLogLog and creating a CapacityEstimateArrayList; this is reusable throughout the code for other classes implementing HeapSize that contain ArrayLists. The memory footprint is very small and it is very fast. The issue is that this is an estimate, although we can configure the precision; we will most likely always be conservative. The estimateCapacity will always be less than the actualCapacity, but it will be close.
I think that putting the logic in removeNonReplicableEdits will work, but this only solves the heapSize problem in this particular scenario. Solution 3 is slow and horrible, but it gives us the exact answer. I would love to hear whether anyone else has other ideas on how to remedy this problem. I have code for trunk and 0.94 for all 3 ideas and can provide a patch if the community thinks any of these approaches is a viable one. -- This message was sent by Atlassian JIRA (v6.1#6144)
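For reference, option 2 above (trim the backing array once a removal threshold is crossed) can be sketched as follows; the helper class and threshold constant are illustrative, not the actual patch:

```java
import java.util.ArrayList;
import java.util.function.Predicate;

// Sketch of option 2: after dropping non-replicable entries, release the
// ArrayList's slack capacity once enough entries are gone, so a heapSize()
// based on size() no longer badly underestimates retained memory.
class EditTrimmer {
    static final double TRIM_THRESHOLD = 0.5; // trim when >= 50% removed (illustrative)

    static <T> void removeAndMaybeTrim(ArrayList<T> entries, Predicate<T> shouldRemove) {
        int originalSize = entries.size();
        entries.removeIf(shouldRemove);
        int removed = originalSize - entries.size();
        if (originalSize > 0 && (double) removed / originalSize >= TRIM_THRESHOLD) {
            entries.trimToSize(); // backing array capacity now matches size()
        }
    }
}
```

The trade-off is an array copy on trim versus carrying a 90k-slot backing array for a 6-entry list, which is the pathological case described in the report.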
[jira] [Updated] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gustavo Anatoly updated HBASE-9808: --- Attachment: HBASE-9808-v1.patch org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808.patch Here is list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812531#comment-13812531 ] Hadoop QA commented on HBASE-9808: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611840/HBASE-9808-v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. 
The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestHRegion org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7720//console This message is automatically generated. 
org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808.patch Here is list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and
[jira] [Commented] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812539#comment-13812539 ] Jimmy Xiang commented on HBASE-9863: I think start() should be synchronized too. It's better to make sure it won't be called more than once also. For 'NamespaceDescriptor get(String name)', yes, the semantics is the same. After the change, we don't check if isTableAvailableAndInitialized. So zkNamespaceManager could be uninitialized. Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs --- Key: HBASE-9863 URL: https://issues.apache.org/jira/browse/HBASE-9863 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Attachments: 9863-v1.txt, 9863-v2.txt, 9863-v3.txt TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry sometimes hung. Here were two recent occurrences: https://builds.apache.org/job/PreCommit-HBASE-Build/7676/console https://builds.apache.org/job/PreCommit-HBASE-Build/7671/console There were 9 occurrences of the following in both stack traces: {code} FifoRpcScheduler.handler1-thread-5 daemon prio=10 tid=0x09df8800 nid=0xc17 waiting for monitor entry [0x6fdf8000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:250) - waiting to lock 0x7f69b5f0 (a org.apache.hadoop.hbase.master.TableNamespaceManager) at org.apache.hadoop.hbase.master.HMaster.isTableNamespaceManagerReady(HMaster.java:3146) at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3105) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1743) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1782) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38221) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1983) at 
org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92) {code} The test hung here: {code} pool-1-thread-1 prio=10 tid=0x74f7b800 nid=0x5aa5 in Object.wait() [0x74efe000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1436) - locked 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.createTable(MasterProtos.java:40372) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.createTable(HConnectionManager.java:1931) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:598) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3124) at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:485) at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:486) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
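Jimmy's suggestion above, a start() that is both synchronized and called at most once, can be sketched generically (the real TableNamespaceManager code differs; this only shows the idempotent-start pattern):

```java
// Illustrative sketch only: a synchronized start() that is harmless to call
// repeatedly, so concurrent or repeated callers cannot re-run initialization.
class OnceStartable {
    private boolean started = false;

    synchronized void start() {
        if (started) {
            return; // already initialized; make repeat calls no-ops
        }
        // ... expensive one-time initialization would go here ...
        started = true;
    }

    synchronized boolean isStarted() {
        return started;
    }
}
```

Because both start() and isStarted() synchronize on the same monitor, a second thread either sees the initialization fully done or waits for it, which avoids the half-initialized zkNamespaceManager state described in the comment.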
[jira] [Updated] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-9863: -- Attachment: 9863-v4.txt Patch v4 incorporates Jimmy's comments above. Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs --- Key: HBASE-9863 URL: https://issues.apache.org/jira/browse/HBASE-9863 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Attachments: 9863-v1.txt, 9863-v2.txt, 9863-v3.txt, 9863-v4.txt TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry sometimes hung. Here were two recent occurrences: https://builds.apache.org/job/PreCommit-HBASE-Build/7676/console https://builds.apache.org/job/PreCommit-HBASE-Build/7671/console There were 9 occurrences of the following in both stack traces: {code} FifoRpcScheduler.handler1-thread-5 daemon prio=10 tid=0x09df8800 nid=0xc17 waiting for monitor entry [0x6fdf8000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:250) - waiting to lock 0x7f69b5f0 (a org.apache.hadoop.hbase.master.TableNamespaceManager) at org.apache.hadoop.hbase.master.HMaster.isTableNamespaceManagerReady(HMaster.java:3146) at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3105) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1743) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1782) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38221) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1983) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92) {code} The test hung here: {code} pool-1-thread-1 prio=10 tid=0x74f7b800 nid=0x5aa5 in Object.wait() [0x74efe000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 
0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1436) - locked 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.createTable(MasterProtos.java:40372) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.createTable(HConnectionManager.java:1931) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:598) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3124) at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:485) at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:486) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812556#comment-13812556 ] Lars Hofhansl commented on HBASE-8942: -- Checked the 0.94 code. Should be safe there as well. Good find. DFS errors during a read operation (get/scan), may cause write outliers --- Key: HBASE-8942 URL: https://issues.apache.org/jira/browse/HBASE-8942 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb, 0.95.2 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb, 0.98.0, 0.96.1 Attachments: 8942.096.txt, HBase-8942.txt This is a similar issue as discussed in HBASE-8228 1) A scanner holds the Store.ReadLock() while opening the store files ... encounters errors. Thus, takes a long time to finish. 2) A flush is completed, in the mean while. It needs the write lock to commit(), and update scanners. Hence ends up waiting. 3+) All Puts (and also Gets) to the CF, which will need a read lock, will have to wait for 1) and 2) to complete. Thus blocking updates to the system for the DFS timeout. Fix: Open Store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner() which calls this functions through the StoreScanner constructor, redundantly tries to grab the readLock. Causing the readLock to be held while the storeFiles are being opened, and seeked. We should get rid of the readLock() in Store.getScanner(). This is not required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx). This has the required locking already. -- This message was sent by Atlassian JIRA (v6.1#6144)
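The fix described in the issue, opening store files outside the read lock, follows a general pattern: hold the lock only long enough to snapshot the shared state, then do the slow, failure-prone opens lock-free so a hung DFS read cannot block writers. A minimal sketch with hypothetical types (not the actual Store code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only: snapshot under a brief read lock, open lock-free.
class StoreLike {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> files = new ArrayList<>();

    void addFile(String f) {
        lock.writeLock().lock();
        try {
            files.add(f);
        } finally {
            lock.writeLock().unlock();
        }
    }

    private List<String> snapshotFiles() {
        lock.readLock().lock();
        try {
            return new ArrayList<>(files); // copy, then release immediately
        } finally {
            lock.readLock().unlock();
        }
    }

    // The slow per-file "open" happens with NO lock held, so a stalled DFS
    // read here cannot block a flush waiting on the write lock.
    List<String> openScanners() {
        List<String> scanners = new ArrayList<>();
        for (String f : snapshotFiles()) {
            scanners.add("scanner:" + f); // stand-in for the expensive open/seek
        }
        return scanners;
    }
}
```

This mirrors the reasoning in the issue: getScanners() already snapshots correctly, so the redundant readLock() in Store.getScanner() can simply be dropped.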
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812557#comment-13812557 ] Lars Hofhansl commented on HBASE-8942: -- Committed to 0.94 as well. DFS errors during a read operation (get/scan), may cause write outliers --- Key: HBASE-8942 URL: https://issues.apache.org/jira/browse/HBASE-8942 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb, 0.95.2 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14 Attachments: 8942.096.txt, HBase-8942.txt This is a similar issue as discussed in HBASE-8228 1) A scanner holds the Store.ReadLock() while opening the store files ... encounters errors. Thus, takes a long time to finish. 2) A flush is completed, in the mean while. It needs the write lock to commit(), and update scanners. Hence ends up waiting. 3+) All Puts (and also Gets) to the CF, which will need a read lock, will have to wait for 1) and 2) to complete. Thus blocking updates to the system for the DFS timeout. Fix: Open Store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner() which calls this functions through the StoreScanner constructor, redundantly tries to grab the readLock. Causing the readLock to be held while the storeFiles are being opened, and seeked. We should get rid of the readLock() in Store.getScanner(). This is not required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx). This has the required locking already. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-8942: - Fix Version/s: 0.94.14 DFS errors during a read operation (get/scan), may cause write outliers --- Key: HBASE-8942 URL: https://issues.apache.org/jira/browse/HBASE-8942 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb, 0.95.2 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14 Attachments: 8942.096.txt, HBase-8942.txt This is a similar issue as discussed in HBASE-8228 1) A scanner holds the Store.ReadLock() while opening the store files ... encounters errors. Thus, takes a long time to finish. 2) A flush is completed, in the mean while. It needs the write lock to commit(), and update scanners. Hence ends up waiting. 3+) All Puts (and also Gets) to the CF, which will need a read lock, will have to wait for 1) and 2) to complete. Thus blocking updates to the system for the DFS timeout. Fix: Open Store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner() which calls this functions through the StoreScanner constructor, redundantly tries to grab the readLock. Causing the readLock to be held while the storeFiles are being opened, and seeked. We should get rid of the readLock() in Store.getScanner(). This is not required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx). This has the required locking already. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9808) org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation
[ https://issues.apache.org/jira/browse/HBASE-9808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812585#comment-13812585 ] Ted Yu commented on HBASE-9808: --- Test failures were not related to the patch. org.apache.hadoop.hbase.rest.PerformanceEvaluation is out of sync with org.apache.hadoop.hbase.PerformanceEvaluation Key: HBASE-9808 URL: https://issues.apache.org/jira/browse/HBASE-9808 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Gustavo Anatoly Attachments: HBASE-9808-v1.patch, HBASE-9808.patch Here is list of JIRAs whose fixes might have gone into rest.PerformanceEvaluation : {code} r1527817 | mbertozzi | 2013-09-30 15:57:44 -0700 (Mon, 30 Sep 2013) | 1 line HBASE-9663 PerformanceEvaluation does not properly honor specified table name parameter r1526452 | mbertozzi | 2013-09-26 04:58:50 -0700 (Thu, 26 Sep 2013) | 1 line HBASE-9662 PerformanceEvaluation input do not handle tags properties r1525269 | ramkrishna | 2013-09-21 11:01:32 -0700 (Sat, 21 Sep 2013) | 3 lines HBASE-8496 - Implement tags and the internals of how a tag should look like (Ram) r1524985 | nkeywal | 2013-09-20 06:02:54 -0700 (Fri, 20 Sep 2013) | 1 line HBASE-9558 PerformanceEvaluation is in hbase-server, and creates a dependency to MiniDFSCluster r1523782 | nkeywal | 2013-09-16 13:07:13 -0700 (Mon, 16 Sep 2013) | 1 line HBASE-9521 clean clearBufferOnFail behavior and deprecate it r1518341 | jdcryans | 2013-08-28 12:46:55 -0700 (Wed, 28 Aug 2013) | 2 lines HBASE-9330 Refactor PE to create HTable the correct way {code} Long term, we may consider consolidating the two PerformanceEvaluation classes so that such maintenance work can be reduced. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812587#comment-13812587 ] Hadoop QA commented on HBASE-9863: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12611850/9863-v4.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop 1.0 profile. {color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop 2.0 profile. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:red}-1 findbugs{color}. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:red}-1 site{color}. The patch appears to cause mvn site goal to fail. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestHRegion org.apache.hadoop.hbase.regionserver.TestHRegionBusyWait {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:488) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/7721//console This message is automatically generated. 
Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs --- Key: HBASE-9863 URL: https://issues.apache.org/jira/browse/HBASE-9863 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Attachments: 9863-v1.txt, 9863-v2.txt, 9863-v3.txt, 9863-v4.txt TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry sometimes hung. Here were two recent occurrences: https://builds.apache.org/job/PreCommit-HBASE-Build/7676/console https://builds.apache.org/job/PreCommit-HBASE-Build/7671/console There were 9 occurrences of the following in both stack traces: {code} FifoRpcScheduler.handler1-thread-5 daemon prio=10 tid=0x09df8800 nid=0xc17 waiting for monitor entry [0x6fdf8000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:250) - waiting to lock 0x7f69b5f0 (a org.apache.hadoop.hbase.master.TableNamespaceManager) at org.apache.hadoop.hbase.master.HMaster.isTableNamespaceManagerReady(HMaster.java:3146) at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3105) at
[jira] [Commented] (HBASE-9863) Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs
[ https://issues.apache.org/jira/browse/HBASE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812588#comment-13812588 ] Ted Yu commented on HBASE-9863: --- The same test failure appeared in the QA run for HBASE-9808. I don't think it was caused by my patch. Intermittently TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry hangs --- Key: HBASE-9863 URL: https://issues.apache.org/jira/browse/HBASE-9863 Project: HBase Issue Type: Bug Reporter: Ted Yu Assignee: Ted Yu Attachments: 9863-v1.txt, 9863-v2.txt, 9863-v3.txt, 9863-v4.txt TestZooKeeper#testRegionAssignmentAfterMasterRecoveryDueToZKExpiry sometimes hung. Here were two recent occurrences: https://builds.apache.org/job/PreCommit-HBASE-Build/7676/console https://builds.apache.org/job/PreCommit-HBASE-Build/7671/console There were 9 occurrences of the following in both stack traces: {code} FifoRpcScheduler.handler1-thread-5 daemon prio=10 tid=0x09df8800 nid=0xc17 waiting for monitor entry [0x6fdf8000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.master.TableNamespaceManager.isTableAvailableAndInitialized(TableNamespaceManager.java:250) - waiting to lock 0x7f69b5f0 (a org.apache.hadoop.hbase.master.TableNamespaceManager) at org.apache.hadoop.hbase.master.HMaster.isTableNamespaceManagerReady(HMaster.java:3146) at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3105) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1743) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1782) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:38221) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:1983) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:92) {code} The test hung here: {code} pool-1-thread-1 prio=10 tid=0x74f7b800 nid=0x5aa5 in Object.wait() [0x74efe000] java.lang.Thread.State: TIMED_WAITING
(on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1436) - locked 0xcc848348 (a org.apache.hadoop.hbase.ipc.RpcClient$Call) at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1654) at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1712) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.createTable(MasterProtos.java:40372) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$5.createTable(HConnectionManager.java:1931) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:598) at org.apache.hadoop.hbase.client.HBaseAdmin$2.call(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:116) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:94) - locked 0x7faa26d0 (a org.apache.hadoop.hbase.client.RpcRetryingCaller) at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3124) at org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:594) at org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:485) at org.apache.hadoop.hbase.TestZooKeeper.testRegionAssignmentAfterMasterRecoveryDueToZKExpiry(TestZooKeeper.java:486) {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
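The stack traces above show a classic monitor-convoy shape: one thread holds the TableNamespaceManager monitor while it waits on something slow, and every handler thread calling the same synchronized method parks in state BLOCKED. A minimal standalone illustration (invented names, not HBase code):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class MonitorHangDemo {
    private final CountDownLatch release = new CountDownLatch(1);

    // Stands in for isTableAvailableAndInitialized(): a synchronized
    // method doing slow work while holding the object's monitor.
    synchronized void isReady() throws InterruptedException {
        release.await(2, TimeUnit.SECONDS);
    }

    // Runs the scenario and reports the second caller's thread state,
    // which is what jstack showed for the FifoRpcScheduler handlers.
    static Thread.State observeSecondCaller() throws InterruptedException {
        MonitorHangDemo demo = new MonitorHangDemo();
        Thread holder = new Thread(() -> {
            try { demo.isReady(); } catch (InterruptedException ignored) { }
        });
        holder.start();
        Thread.sleep(100); // let the holder enter the monitor
        Thread waiter = new Thread(() -> {
            try { demo.isReady(); } catch (InterruptedException ignored) { }
        });
        waiter.start();
        Thread.sleep(100); // let the waiter reach the monitor
        Thread.State observed = waiter.getState(); // expected: BLOCKED
        demo.release.countDown(); // unblock both threads
        holder.join();
        waiter.join();
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observeSecondCaller());
    }
}
```

The sleeps make the interleaving likely rather than guaranteed, which is the same reason such hangs show up only intermittently in the test.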
[jira] [Commented] (HBASE-9681) Basic codec negotiation
[ https://issues.apache.org/jira/browse/HBASE-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812602#comment-13812602 ] Anoop Sam John commented on HBASE-9681: --- Ram, so will it be that the server will always NOT write back the cell tags? Or is that also based on some context information? Basic codec negotiation --- Key: HBASE-9681 URL: https://issues.apache.org/jira/browse/HBASE-9681 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Basic codec negotiation: There should be a default codec used for cell encoding over the RPC connection. This should be configurable in the site file. The client can optionally send a message, a manufactured call that would otherwise be invalid in some way, to the server asking for a list of supported cell codecs. An older server should simply send back an error because the request is invalid except to servers supporting this feature. A server supporting this feature should send back the requested information or an error indication if something went wrong. The client can optionally send a message, a manufactured call that would otherwise be invalid in some way, to the server asking for it to use a given codec for all further communication. Otherwise the server will continue to use the default codec. The server will send back a call response acknowledging the change or an error indication if the request cannot be honored. Server configuration should support mappings from one codec type to another. We need to handle the case where the server has a codec available that extends the requested type but overrides some behavior in the base class, and this is what should be used in lieu of the base type. It must also be possible to choose an alternate default codec which stands in for the default codec, is compatible with client expectations, but changes the server side behavior as needed in the absence of negotiation. 
-- This message was sent by Atlassian JIRA (v6.1#6144)
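The handshake described above (advertise supported codecs, request one, acknowledge or error, remap base types to preferred subtypes) can be sketched as plain state. All class and codec names below are invented for illustration; none of this is HBase API:

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class CodecNegotiationSketch {
    static final class Server {
        private final Set<String> supported;
        // Configured remappings: requested base type -> preferred subtype.
        private final Map<String, String> remappings;
        private String active;

        Server(Set<String> supported, Map<String, String> remappings, String defaultCodec) {
            this.supported = supported;
            this.remappings = remappings;
            this.active = defaultCodec; // would come from the site file
        }

        // Reply to "list supported codecs".
        Set<String> listCodecs() {
            return supported;
        }

        // Reply to "use this codec": returns the codec put into effect,
        // or empty on error, in which case the current codec stays active.
        Optional<String> useCodec(String requested) {
            String chosen = remappings.getOrDefault(requested, requested);
            if (!supported.contains(chosen)) {
                return Optional.empty();
            }
            active = chosen;
            return Optional.of(chosen);
        }

        String activeCodec() {
            return active;
        }
    }

    public static void main(String[] args) {
        Server server = new Server(
                Set.of("DefaultCodec", "TagAwareCodec"),
                Map.of(),
                "DefaultCodec");
        System.out.println(server.listCodecs());
        System.out.println(server.useCodec("TagAwareCodec"));
        System.out.println(server.useCodec("NoSuchCodec"));
        System.out.println(server.activeCodec());
    }
}
```

A failed request leaves the previously negotiated codec in place, matching the "otherwise the server will continue to use the default codec" behavior in the description.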
[jira] [Comment Edited] (HBASE-9681) Basic codec negotiation
[ https://issues.apache.org/jira/browse/HBASE-9681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812602#comment-13812602 ] Anoop Sam John edited comment on HBASE-9681 at 11/4/13 4:13 AM: Ram, so will it be that the server will always NOT write back the cell tags? Or is that also based on some context information? There is the Export tool, which uses an MR-based scan. In this case should the server serialize back the tags as well? was (Author: anoop.hbase): Ram So it will be like the server will always NOT write back the cell tags? Or that also based on some context information? Basic codec negotiation --- Key: HBASE-9681 URL: https://issues.apache.org/jira/browse/HBASE-9681 Project: HBase Issue Type: Sub-task Affects Versions: 0.98.0 Reporter: Andrew Purtell Basic codec negotiation: There should be a default codec used for cell encoding over the RPC connection. This should be configurable in the site file. The client can optionally send a message, a manufactured call that would otherwise be invalid in some way, to the server asking for a list of supported cell codecs. An older server should simply send back an error because the request is invalid except to servers supporting this feature. A server supporting this feature should send back the requested information or an error indication if something went wrong. The client can optionally send a message, a manufactured call that would otherwise be invalid in some way, to the server asking for it to use a given codec for all further communication. Otherwise the server will continue to use the default codec. The server will send back a call response acknowledging the change or an error indication if the request cannot be honored. Server configuration should support mappings from one codec type to another. 
We need to handle the case where the server has a codec available that extends the requested type but overrides some behavior in the base class, and this is what should be used in lieu of the base type. It must also be possible to choose an alternate default codec which stands in for the default codec, is compatible with client expectations, but changes the server side behavior as needed in the absence of negotiation. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9872) ModifyTable does not modify the attributes of a newly modified/changed ColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812609#comment-13812609 ] ramkrishna.s.vasudevan commented on HBASE-9872: --- Found that modifyColumn and modifyTable work fine even if the HCD is changed in modifyTable. We have some internal code in which this does not seem to work. Let me check on that and then see what to do with this. ModifyTable does not modify the attributes of a newly modified/changed ColumnDescriptor Key: HBASE-9872 URL: https://issues.apache.org/jira/browse/HBASE-9872 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0, 0.96.1, 0.94.14 This issue (if it is expected behaviour I can close this) exists in all versions. If I do modifyColumn and change an HCD's parameter I am able to get back the modified HCD with the latest data. But when I do modifyTable and in it modify an HCD parameter, say e.g. its SCOPE, then since we don't persist the HCD information as in TableModifyFamilyHandler (used for modifyColumn) {code} HTableDescriptor htd = this.masterServices.getMasterFileSystem().modifyColumn(tableName, familyDesc); {code} we are not able to get the updated HCD information on the RegionServer. So in cases of replication where I need to modify the HCD's scope we are not able to make replication happen. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812610#comment-13812610 ] Hudson commented on HBASE-8942: --- FAILURE: Integrated in HBase-0.94-security #328 (See [https://builds.apache.org/job/HBase-0.94-security/328/]) HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (Amitanand Aiyer) (larsh: rev 1538484) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java DFS errors during a read operation (get/scan), may cause write outliers --- Key: HBASE-8942 URL: https://issues.apache.org/jira/browse/HBASE-8942 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb, 0.95.2 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14 Attachments: 8942.096.txt, HBase-8942.txt This is a similar issue to the one discussed in HBASE-8228. 1) A scanner holds the Store.ReadLock() while opening the store files, encounters errors, and thus takes a long time to finish. 2) A flush completes in the meantime. It needs the write lock to commit() and update scanners, and hence ends up waiting. 3+) All Puts (and also Gets) to the CF, which need a read lock, have to wait for 1) and 2) to complete, blocking updates to the system for the duration of the DFS timeout. Fix: open store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner(), which calls this function through the StoreScanner constructor, redundantly tries to grab the readLock, causing the readLock to be held while the store files are being opened and seeked. We should get rid of the readLock() in Store.getScanner(); it is not required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx), which already has the required locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9816) Address review comments in HBASE-8496
[ https://issues.apache.org/jira/browse/HBASE-9816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812611#comment-13812611 ] ramkrishna.s.vasudevan commented on HBASE-9816: --- Thanks Stack for the reviews. I am still not able to open a new RB request. Will fix the comments. Address review comments in HBASE-8496 - Key: HBASE-9816 URL: https://issues.apache.org/jira/browse/HBASE-9816 Project: HBase Issue Type: Bug Affects Versions: 0.98.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0 Attachments: HBASE-9816.patch, HBASE-9816_1.patch, HBASE-9816_1.patch This JIRA will be used to address the review comments in HBASE-8496. Any further comments will be addressed and committed as part of this. There are already a few comments from Stack on the RB: https://reviews.apache.org/r/13311/ -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9874) Append and Increment operation drops Tags
[ https://issues.apache.org/jira/browse/HBASE-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812619#comment-13812619 ] ramkrishna.s.vasudevan commented on HBASE-9874: --- I had a similar impl in my old patches. Patch looks good. Yes, if the CP needs control over what is to be done then we need the hook alone, but we cannot always depend on the CP here. Append and Increment operation drops Tags - Key: HBASE-9874 URL: https://issues.apache.org/jira/browse/HBASE-9874 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.98.0 Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.98.0 Attachments: AccessController.postMutationBeforeWAL.txt, HBASE-9874.patch, HBASE-9874_V2.patch, HBASE-9874_V3.patch We should consider tags in the existing cells as well as tags coming in the cells within Increment/Append. -- This message was sent by Atlassian JIRA (v6.1#6144)
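The fix described in the summary amounts to carrying tags from both the existing cell and the incoming Append/Increment cell into the result cell. A toy sketch of that merge, using strings in place of HBase's real Tag/Cell types; whether duplicates should be collapsed is an assumption here, so the sketch simply takes the order-preserving union:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class TagMergeSketch {
    // When materializing the result of an Append/Increment, keep tags
    // from the existing cell AND from the incoming cell rather than
    // dropping both. Returns the union, preserving first-seen order.
    static List<String> mergeTags(List<String> existingCellTags, List<String> incomingCellTags) {
        LinkedHashSet<String> merged = new LinkedHashSet<>();
        merged.addAll(existingCellTags);
        merged.addAll(incomingCellTags);
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Existing cell carries an ACL tag; the Increment carries a TTL tag.
        System.out.println(mergeTags(List.of("acl:rw"), List.of("ttl:300", "acl:rw")));
    }
}
```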
[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers
[ https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812621#comment-13812621 ] Hudson commented on HBASE-8942: --- SUCCESS: Integrated in HBase-0.94 #1194 (See [https://builds.apache.org/job/HBase-0.94/1194/]) HBASE-8942 DFS errors during a read operation (get/scan), may cause write outliers (Amitanand Aiyer) (larsh: rev 1538484) * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java DFS errors during a read operation (get/scan), may cause write outliers --- Key: HBASE-8942 URL: https://issues.apache.org/jira/browse/HBASE-8942 Project: HBase Issue Type: Bug Affects Versions: 0.89-fb, 0.95.2 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14 Attachments: 8942.096.txt, HBase-8942.txt This is a similar issue to the one discussed in HBASE-8228. 1) A scanner holds the Store.ReadLock() while opening the store files, encounters errors, and thus takes a long time to finish. 2) A flush completes in the meantime. It needs the write lock to commit() and update scanners, and hence ends up waiting. 3+) All Puts (and also Gets) to the CF, which need a read lock, have to wait for 1) and 2) to complete, blocking updates to the system for the duration of the DFS timeout. Fix: open store files outside the read lock. getScanners() already tries to do this optimisation. However, Store.getScanner(), which calls this function through the StoreScanner constructor, redundantly tries to grab the readLock, causing the readLock to be held while the store files are being opened and seeked. We should get rid of the readLock() in Store.getScanner(); it is not required. The constructor for StoreScanner calls getScanners(xxx, xxx, xxx), which already has the required locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9865) WALEdit.heapSize() is incorrect in certain replication scenarios which may cause RegionServers to go OOM
[ https://issues.apache.org/jira/browse/HBASE-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13812626#comment-13812626 ] Lars Hofhansl commented on HBASE-9865: -- I'm trying to grok the details of the failure logic; it has gotten pretty convoluted over time. Specifically this part in ReplicationSource.run(): {code} try { if (readAllEntriesToReplicateOrNextFile(currentWALisBeingWrittenTo)) { continue; } } catch (IOException ioe) { ... if (this.replicationQueueInfo.isQueueRecovered()) { ... considerDumping = true; ... } else if (currentNbEntries != 0) { ... considerDumping = true; currentNbEntries = 0; } ... } finally { {code} So when we find a corrupt log file we won't replicate any of it ({{currentNbEntries = 0}}), unless the queue was recovered, in which case we *do* want to replicate the partial set of edits we managed to read? WALEdit.heapSize() is incorrect in certain replication scenarios which may cause RegionServers to go OOM Key: HBASE-9865 URL: https://issues.apache.org/jira/browse/HBASE-9865 Project: HBase Issue Type: Bug Affects Versions: 0.94.5, 0.95.0 Reporter: churro morales Assignee: Lars Hofhansl Attachments: 9865-0.94-v2.txt, 9865-sample-1.txt, 9865-sample.txt, 9865-trunk-v2.txt, 9865-trunk.txt WALEdit.heapSize() is incorrect in certain replication scenarios, which may cause RegionServers to go OOM. A little background on this issue: we noticed that our source replication regionservers would get into GC storms and sometimes even OOM. In one case there were around 25k WALEdits to replicate, each one with an ArrayList of KeyValues. Each ArrayList had a capacity of around 90k (using 350KB of heap memory) but only around 6 non-null entries. When ReplicationSource.readAllEntriesToReplicateOrNextFile() gets a WALEdit, it removes all KVs that are scoped other than local. But in doing so we don't account for the capacity of the ArrayList when determining heapSize for a WALEdit. 
The logic for shipping a batch is whether you have hit a size capacity or a number-of-entries capacity. Therefore if we have a WALEdit with 25k entries and suppose all are removed: the size of the ArrayList is 0 (we don't even count the collection's heap size currently) but the capacity is ignored. This will yield a heapSize() of 0 bytes while in the best case it would be at least 10 bytes (provided you pass initialCapacity and you have a 32-bit JVM). I have some ideas on how to address this problem and want to know everyone's thoughts: 1. We use a probabilistic counter such as HyperLogLog and create something like: * class CapacityEstimateArrayList extends ArrayList ** this class overrides all additive methods to update the probabilistic counts ** it includes one additional method called estimateCapacity (we would take estimateCapacity - size() and fill in sizes for all references) * Then we can do something like this in WALEdit.heapSize: {code} public long heapSize() { long ret = ClassSize.ARRAYLIST; for (KeyValue kv : kvs) { ret += kv.heapSize(); } long nullEntriesEstimate = kvs.getCapacityEstimate() - kvs.size(); ret += ClassSize.align(nullEntriesEstimate * ClassSize.REFERENCE); if (scopes != null) { ret += ClassSize.TREEMAP; ret += ClassSize.align(scopes.size() * ClassSize.MAP_ENTRY); // TODO this isn't quite right, need help here } return ret; } {code} 2. In ReplicationSource.removeNonReplicableEdits() we know the size of the array originally, and we provide some percentage threshold. When that threshold is met (say 50% of the entries have been removed) we can call kvs.trimToSize(). 3. In the heapSize() method for WALEdit we could use reflection (please don't shoot me for this) to grab the actual capacity of the list. 
Doing something like this: {code} public int getArrayListCapacity() { try { Field f = ArrayList.class.getDeclaredField("elementData"); f.setAccessible(true); return ((Object[]) f.get(kvs)).length; } catch (Exception e) { log.warn("Exception in trying to get capacity on ArrayList", e); return kvs.size(); } } {code} I am partial to (1), using HyperLogLog and creating a CapacityEstimateArrayList; this is reusable throughout the code for other classes that implement HeapSize and contain ArrayLists. The memory footprint is very small and it is very fast. The issue is that
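Option (2) above, trimming the backing array once enough entries have been removed, is straightforward to sketch. This is hypothetical code with invented names (filterAndMaybeTrim, EditTrimSketch), not the actual ReplicationSource implementation:

```java
import java.util.ArrayList;
import java.util.Iterator;

public class EditTrimSketch {
    // Removes entries failing the predicate, and trims the backing
    // array once at least half of the original entries were dropped,
    // so a size()-based heapSize() no longer hides a huge retained
    // capacity.
    static <T> void filterAndMaybeTrim(ArrayList<T> kvs, java.util.function.Predicate<T> keep) {
        int originalSize = kvs.size();
        Iterator<T> it = kvs.iterator();
        while (it.hasNext()) {
            if (!keep.test(it.next())) {
                it.remove();
            }
        }
        int removed = originalSize - kvs.size();
        if (originalSize > 0 && removed * 2 >= originalSize) {
            kvs.trimToSize(); // shrink capacity to the surviving entries
        }
    }

    public static void main(String[] args) {
        // 25k entries, as in the scenario described above; keep only 25.
        ArrayList<Integer> kvs = new ArrayList<>(25_000);
        for (int i = 0; i < 25_000; i++) kvs.add(i);
        filterAndMaybeTrim(kvs, v -> v % 1000 == 0);
        System.out.println(kvs.size()); // 25
    }
}
```

Unlike the HyperLogLog and reflection ideas, this only bounds the wasted capacity after filtering; it does not make heapSize() itself capacity-aware.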
[jira] [Resolved] (HBASE-9872) ModifyTable does not modify the attributes of a newly modified/changed ColumnDescriptor
[ https://issues.apache.org/jira/browse/HBASE-9872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ramkrishna.s.vasudevan resolved HBASE-9872. --- Resolution: Won't Fix modifyTable is also able to change the column descriptor. Not an issue; it was a problem with our internal code. ModifyTable does not modify the attributes of a newly modified/changed ColumnDescriptor Key: HBASE-9872 URL: https://issues.apache.org/jira/browse/HBASE-9872 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Fix For: 0.98.0, 0.96.1, 0.94.14 This issue (if it is expected behaviour I can close this) exists in all versions. If I do modifyColumn and change an HCD's parameter I am able to get back the modified HCD with the latest data. But when I do modifyTable and in it modify an HCD parameter, say e.g. its SCOPE, then since we don't persist the HCD information as in TableModifyFamilyHandler (used for modifyColumn) {code} HTableDescriptor htd = this.masterServices.getMasterFileSystem().modifyColumn(tableName, familyDesc); {code} we are not able to get the updated HCD information on the RegionServer. So in cases of replication where I need to modify the HCD's scope we are not able to make replication happen. -- This message was sent by Atlassian JIRA (v6.1#6144)