[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530777#comment-13530777
 ] 

Hadoop QA commented on HBASE-6466:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560308/HBASE-6466-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
105 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3522//console

This message is automatically generated.

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch


 If the KV is large or the HLog is closed under high write pressure, we found 
 the memstore is often above the high water mark and blocks the puts.
 So should we enable multi-thread for Memstore Flush?
 Some performance test data for reference:
 1. test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers 
 per client for writing
 2. test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blocks
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver
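The one-vs-two flush handler comparison above can be sketched as a configurable pool of handler threads draining a shared queue of flush tasks. This is an illustrative sketch under assumed names (FlushPool, requestFlush), not HBase's actual MemStoreFlusher:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: N flush handlers drain a shared queue of flush tasks,
// analogous to running more than one cacheFlush handler per regionserver.
class FlushPool {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final ExecutorService handlers;

    FlushPool(int handlerCount) {
        handlers = Executors.newFixedThreadPool(handlerCount);
        for (int i = 0; i < handlerCount; i++) {
            handlers.submit(() -> {
                try {
                    while (true) {
                        queue.take().run(); // flush one memstore snapshot
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // pool is shutting down
                }
            });
        }
    }

    void requestFlush(Runnable flushTask) {
        queue.add(flushTask);
    }

    void shutdown() {
        handlers.shutdownNow();
    }
}
```

With more handlers, flushes of independent regions proceed in parallel, which is the effect the tps and flush-throughput numbers above measure.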

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-4676) Prefix Compression - Trie data block encoding

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530796#comment-13530796
 ] 

Hadoop QA commented on HBASE-4676:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560147/HBASE-4676-prefix-tree-trunk-v8.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 129 
new or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
116 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 62 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3525//console


 Prefix Compression - Trie data block encoding
 -

 Key: HBASE-4676
 URL: https://issues.apache.org/jira/browse/HBASE-4676
 Project: HBase
  Issue Type: New Feature
  Components: io, Performance, regionserver
Affects Versions: 0.96.0
Reporter: Matt Corgan
Assignee: Matt Corgan
 Attachments: HBASE-4676-0.94-v1.patch, 
 HBASE-4676-common-and-server-v8.patch, HBASE-4676-prefix-tree-trunk-v1.patch, 
 HBASE-4676-prefix-tree-trunk-v2.patch, HBASE-4676-prefix-tree-trunk-v3.patch, 
 HBASE-4676-prefix-tree-trunk-v4.patch, HBASE-4676-prefix-tree-trunk-v5.patch, 
 HBASE-4676-prefix-tree-trunk-v6.patch, HBASE-4676-prefix-tree-trunk-v7.patch, 
 HBASE-4676-prefix-tree-trunk-v8.patch, hbase-prefix-trie-0.1.jar, 
 PrefixTrie_Format_v1.pdf, PrefixTrie_Performance_v1.pdf, SeeksPerSec by 
 blockSize.png


 The HBase data block format has room for 2 significant improvements for 
 applications that have high block cache hit ratios.  
 First, there is no prefix compression, and the current KeyValue format is 
 somewhat metadata heavy, so there can be tremendous memory bloat for many 
 common data layouts, specifically those with long keys and short values.
 Second, there is no random access to KeyValues inside data blocks.  This 
 means that every time you double the datablock size, average seek time (or 
 average cpu consumption) goes up by a factor of 2.  The standard 64KB block 
 size is ~10x slower for random seeks than a 4KB block size, but block sizes 
 as small as 4KB cause problems elsewhere.  Using block sizes of 256KB or 1MB 
 or more may be more efficient from a disk access and block-cache perspective 
 in many big-data applications, but doing so is infeasible from a random seek 
 perspective.
 The PrefixTrie block encoding format attempts to solve both of these 
 problems.  Some features:
 * trie format for row key encoding completely eliminates duplicate row keys 
 and encodes similar row keys into a standard trie structure which also saves 
 a lot of space
 * the column family is currently stored once at the beginning of each block.  
 This could easily be modified to allow multiple family names per block
 * all 
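The space saving from eliminating duplicate row-key prefixes can be illustrated with a minimal shared-prefix encoder. This is only a sketch of the general prefix-compression idea, not the PrefixTrie block format itself; all names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not the HBASE-4676 format): encode a sorted list of
// row keys as (sharedPrefixLen, suffix) pairs, the basic idea behind
// prefix compression of row keys.
class PrefixEncoder {
    static List<String[]> encode(List<String> sortedKeys) {
        List<String[]> out = new ArrayList<>();
        String prev = "";
        for (String key : sortedKeys) {
            // Count how many leading characters are shared with the previous key.
            int shared = 0;
            int max = Math.min(prev.length(), key.length());
            while (shared < max && prev.charAt(shared) == key.charAt(shared)) shared++;
            out.add(new String[] { String.valueOf(shared), key.substring(shared) });
            prev = key;
        }
        return out;
    }
}
```

For long, similar row keys with short values, most of each key collapses into the shared-prefix count, which is where the memory savings described above come from.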

[jira] [Commented] (HBASE-1212) merge tool expects regions all have different sequence ids

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530802#comment-13530802
 ] 

Hadoop QA commented on HBASE-1212:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12556386/HBASE-1212-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
104 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3526//console


 merge tool expects regions all have different sequence ids
 --

 Key: HBASE-1212
 URL: https://issues.apache.org/jira/browse/HBASE-1212
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: Jean-Marc Spaggiari
 Attachments: failure.log, HBASE-1212.patch, HBASE-1212-v2.patch


 Currently, when merging two regions, the merge tool compares their sequence 
 ids.  If they are the same, it decrements one.  It needs to do this because on 
 region open, files are keyed by their sequence id; if two are the same, one 
 will erase the other.
 Well, with the move to the aggregating hfile format, the sequence id is 
 written when the file is created, and it's no longer written into an aside file 
 but as metadata at the end of the file.  Changing the sequence id is no 
 longer an option.
 This issue is about figuring out a solution for the rare case where two store 
 files have the same sequence id AND we want to merge the two regions.
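The collision described above can be made concrete: if store files are keyed only by sequence id, inserting a second file with the same id silently displaces the first. A minimal sketch with hypothetical names:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the collision described above: when store files are
// keyed only by sequence id, two files with the same id cannot coexist in the
// map, so one silently replaces ("erases") the other on region open.
class StoreFileIndex {
    private final Map<Long, String> bySeqId = new TreeMap<>();

    // Returns the path that was displaced, or null if none.
    String add(long sequenceId, String path) {
        return bySeqId.put(sequenceId, path);
    }

    int size() {
        return bySeqId.size();
    }
}
```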



[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer

2012-12-13 Thread Adrian Muraru (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530904#comment-13530904
 ] 

Adrian Muraru commented on HBASE-7205:
--

I'm 100% sure it works - but I wouldn't ever swear by a piece of software :D

 Coprocessor classloader is replicated for all regions in the HRegionServer
 --

 Key: HBASE-7205
 URL: https://issues.apache.org/jira/browse/HBASE-7205
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7205-0.94.txt, 7205-v10.txt, 7205-v1.txt, 7205-v3.txt, 
 7205-v4.txt, 7205-v5.txt, 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, 7205-v9.txt, 
 HBASE-7205_v2.patch


 HBASE-6308 introduced a new custom CoprocessorClassLoader to load the 
 coprocessor classes, and a new instance of this CL is created for each single 
 HRegion opened. This leads to OOME-PermGen when the number of regions goes 
 above hundreds per region server. 
 Having the table coprocessor jailed in a separate classloader is good; however, 
 we should create only one for all regions of a table in each HRS.
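The proposed fix amounts to caching one classloader per coprocessor jar (or per table) rather than creating one per region. A minimal sketch with hypothetical names (LoaderCache, forJar), not the actual HBase patch:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: cache one class loader per coprocessor jar instead of
// creating a new one for every opened region, so repeated region opens share
// a single loader and PermGen stops growing with the region count.
class LoaderCache {
    private final Map<String, ClassLoader> cache = new ConcurrentHashMap<>();

    ClassLoader forJar(String jarPath) {
        // computeIfAbsent creates the loader at most once per key, even under
        // concurrent region opens.
        return cache.computeIfAbsent(jarPath,
            p -> new ClassLoader(LoaderCache.class.getClassLoader()) {});
    }
}
```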

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7243) Test for creating a large number of regions

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530933#comment-13530933
 ] 

Hudson commented on HBASE-7243:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #294 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/294/])
HBASE-7243 Test for creating a large number of regions (Revision 1421039)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-it/src/test/java/org/apache/hadoop/hbase/IntegrationTestManyRegions.java


 Test for creating a large number of regions
 ---

 Key: HBASE-7243
 URL: https://issues.apache.org/jira/browse/HBASE-7243
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment, regionserver, test
Reporter: Enis Soztutar
Assignee: Nick Dimiduk
  Labels: noob
 Fix For: 0.96.0

 Attachments: 7243-integration-test-many-splits.diff, 
 7243-integration-test-many-splits.diff, 7243-integration-test-many-splits.diff


 After HBASE-7220, I think it will be good to write a unit test/IT to create a 
 large number of regions. We can put a reasonable timeout on the test. 



[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.

2012-12-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531017#comment-13531017
 ] 

nkeywal commented on HBASE-7211:


Committed. Thanks, Jeffrey!
I also removed a reference to localTests in zookeeper.xml.

Our last issue (SUREFIRE-800) is now tagged as fixed in the next (2.13) 
surefire release, so I'm going to be optimistic and wait for it before updating 
the remaining part...

 Improve hbase ref guide for the testing part.
 -

 Key: HBASE-7211
 URL: https://issues.apache.org/jira/browse/HBASE-7211
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: hbase-7211-partial.patch


 Here is some stuff I saw. I will propose a fix in a week or so; please add 
 the comments or issues you have in mind.
 ??15.6.1. Apache HBase Modules??
 => We should be able to use categories in all modules. The default should be 
 small, but any test manipulating the time needs to be in a specific jvm 
 (hence medium), so it's not always related to minicluster.
 ??15.6.3.6. hbasetests.sh??
 => We can remove this chapter, and the script.
  The script is not totally useless, but I think nobody actually uses it.
 => Add a chapter on flakiness.
 Some tests are, unfortunately, flaky. While their number decreases, we still 
 have some. Rules are:
 - don't write flaky tests! :-)
 - small tests cannot be flaky, as that blocks other test execution. Corollary: 
 if you have an issue with a small test, it's either your environment or a 
 severe issue.
 - rerun the test a few times to validate; check the ports and file descriptors 
 used.
 ??mvn test -P localTests -Dtest=MyTest??
 => We could actually activate the localTests profile whenever -Dtest is used. 
 If we do that, we can remove the reference to localTests in the doc.
 ??mvn test -P runSmallTests?? ??mvn test -P runMediumTests??
 => I'm not sure these are actually used. We could remove them from the pom.xml 
 (and the doc).
 ??The HBase build uses a patched version of the maven surefire plugin?? 
 => Hopefully, we will be able to remove this soon :-)
  ??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION??
  => Should be documented



[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531019#comment-13531019
 ] 

Hadoop QA commented on HBASE-6651:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560716/HBASE-6651-V10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 19 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
114 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 20 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3527//console


 Improve thread safety of HTablePool
 ---

 Key: HBASE-6651
 URL: https://issues.apache.org/jira/browse/HBASE-6651
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.1
Reporter: Hiroshi Ikeda
Assignee: Hiroshi Ikeda
 Fix For: 0.96.0

 Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, 
 HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, 
 HBASE-6651-V5.patch, HBASE-6651-V6.patch, HBASE-6651-V7.patch, 
 HBASE-6651-V8.patch, HBASE-6651-V9.patch, sample.zip, sample.zip, 
 sharedmap_for_hbaseclient.zip


 There are some operations in HTablePool that access PoolMap in multiple places 
 without any explicit synchronization. 
 For example, HTablePool.closeTablePool() calls PoolMap.values(), and then calls 
 PoolMap.remove(). If other threads add new instances to the pool in the 
 middle of the calls, the newly added instances might be dropped. 
 (HTablePool.closeTablePool() also has another problem: calling it from 
 multiple threads causes HTable to be accessed by multiple threads.)
 Moreover, PoolMap is not thread safe for the same reason.
 For example, PoolMap.put() calls ConcurrentMap.get() and then calls 
 ConcurrentMap.put(). If other threads add a new instance to the concurrent map 
 in the middle of the calls, the new instance might be dropped.
 And implementations of Pool have the same problems.
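The get()-then-put() window described above can be closed with an atomic putIfAbsent(). An illustrative sketch with hypothetical names, not HTablePool's actual code:

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch of the race described above: a plain get() followed by
// put() is not atomic, so a queue created by another thread between the two
// calls could be replaced and its contents lost. putIfAbsent() closes that
// check-then-act window.
class SafePool<K, V> {
    private final ConcurrentMap<K, Queue<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        Queue<V> q = map.get(key);
        if (q == null) {
            Queue<V> fresh = new ArrayDeque<>();
            // Atomic: returns the queue some other thread installed first, if any.
            Queue<V> raced = map.putIfAbsent(key, fresh);
            q = (raced != null) ? raced : fresh;
        }
        synchronized (q) { // ArrayDeque itself is not thread safe
            q.add(value);
        }
    }

    int count(K key) {
        Queue<V> q = map.get(key);
        if (q == null) return 0;
        synchronized (q) {
            return q.size();
        }
    }
}
```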



[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531051#comment-13531051
 ] 

Ted Yu commented on HBASE-6466:
---

+1 on patch v4.

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch


 If the KV is large or the HLog is closed under high write pressure, we found 
 the memstore is often above the high water mark and blocks the puts.
 So should we enable multi-thread for Memstore Flush?
 Some performance test data for reference:
 1. test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread handlers 
 per client for writing
 2. test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, with many aboveGlobalMemstoreLimit blocks
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver



[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531052#comment-13531052
 ] 

Ted Yu commented on HBASE-7340:
---

For HMaster.balance(), we first unassign the regions. By the time of the 
postBalance() call, the regions may not have had time to reach their 
destination. I think in this case postBalance() just provides a hint as to the 
final destination.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu

 Sometimes a user performs a compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See the discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+
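One shape such a hook could take is a listener fired after the balancer emits a move plan. This is a hypothetical sketch, not the eventual HBase API; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the requested hook: user-registered listeners are
// invoked after the balancer moves a region, e.g. to trigger a compaction on
// the destination server.
interface RegionMoveListener {
    void postMove(String region, String source, String destination);
}

class BalancerNotifier {
    private final List<RegionMoveListener> listeners = new ArrayList<>();

    void register(RegionMoveListener l) {
        listeners.add(l);
    }

    void fireMoved(String region, String src, String dst) {
        // As noted above, the region may still be in transit at this point;
        // the destination is a hint about where it will end up.
        for (RegionMoveListener l : listeners) {
            l.postMove(region, src, dst);
        }
    }
}
```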



[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531053#comment-13531053
 ] 

Ted Yu commented on HBASE-6651:
---

bq. HTablePool is annotated as stable.
+1 on keeping HTablePool in 0.96

 Improve thread safety of HTablePool
 ---

 Key: HBASE-6651
 URL: https://issues.apache.org/jira/browse/HBASE-6651
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.1
Reporter: Hiroshi Ikeda
Assignee: Hiroshi Ikeda
 Fix For: 0.96.0

 Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, 
 HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, 
 HBASE-6651-V5.patch, HBASE-6651-V6.patch, HBASE-6651-V7.patch, 
 HBASE-6651-V8.patch, HBASE-6651-V9.patch, sample.zip, sample.zip, 
 sharedmap_for_hbaseclient.zip


 There are some operations in HTablePool that access PoolMap in multiple places 
 without any explicit synchronization. 
 For example, HTablePool.closeTablePool() calls PoolMap.values(), and then calls 
 PoolMap.remove(). If other threads add new instances to the pool in the 
 middle of the calls, the newly added instances might be dropped. 
 (HTablePool.closeTablePool() also has another problem: calling it from 
 multiple threads causes HTable to be accessed by multiple threads.)
 Moreover, PoolMap is not thread safe for the same reason.
 For example, PoolMap.put() calls ConcurrentMap.get() and then calls 
 ConcurrentMap.put(). If other threads add a new instance to the concurrent map 
 in the middle of the calls, the new instance might be dropped.
 And implementations of Pool have the same problems.



[jira] [Created] (HBASE-7344) subprocedure initialization fails with invalid znode data.

2012-12-13 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-7344:
-

 Summary: subprocedure initialization fails with invalid znode data.
 Key: HBASE-7344
 URL: https://issues.apache.org/jira/browse/HBASE-7344
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh


Sometimes snapshot subprocedures fail to start on a RS because the data read from ZK 
is bad.  

{code}
2012-12-13 07:22:55,238 ERROR 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Illegal argument 
exception
java.lang.IllegalArgumentException: Could not read snapshot information from 
request.
at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
at 
org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
2012-12-13 07:22:55,239 ERROR 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Failed due to null 
subprocedure
Local ForeignThreadException from null
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
Caused by: java.lang.IllegalArgumentException: Could not read snapshot 
information from request.
at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
at 
org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
... 6 more
2012-12-13 07:22:55,239 ERROR org.apache.zookeeper.ClientCnxn: Error while 
calling watcher 
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAborted(ZKProcedureMemberRpcs.java:266)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
at 
org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
{code}
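The root cause above is znode data that cannot be parsed into a snapshot request; the subprocedure builder then yields no subprocedure and the abort path trips a NullPointerException. A minimal sketch, with hypothetical names, of validating the znode bytes up front rather than failing later:

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch (names are hypothetical, not the actual
// ZKProcedureMemberRpcs code): validate the znode bytes when building the
// subprocedure, so a bad read fails fast with a clear error instead of
// producing a null subprocedure that NPEs downstream.
class SubprocedureFactory {
    static String buildSubprocedure(byte[] znodeData) {
        if (znodeData == null || znodeData.length == 0) {
            throw new IllegalArgumentException(
                "Could not read snapshot information from request.");
        }
        // Stand-in for deserializing the real snapshot description.
        return new String(znodeData, StandardCharsets.UTF_8);
    }
}
```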



[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531107#comment-13531107
 ] 

Hudson commented on HBASE-7211:
---

Integrated in HBase-TRUNK #3617 (See 
[https://builds.apache.org/job/HBase-TRUNK/3617/])
HBASE-7211 Improve hbase ref guide for the testing part. - 1st part 
(Jeffrey Zhong) (Revision 1421295)

 Result = FAILURE
nkeywal : 
Files : 
* /hbase/trunk/pom.xml
* /hbase/trunk/src/docbkx/developer.xml
* /hbase/trunk/src/docbkx/zookeeper.xml


 Improve hbase ref guide for the testing part.
 -

 Key: HBASE-7211
 URL: https://issues.apache.org/jira/browse/HBASE-7211
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: hbase-7211-partial.patch


 Here is some stuff I saw. I will propose a fix in a week or so; please add 
 the comments or issues you have in mind.
 ??15.6.1. Apache HBase Modules??
 = We should be able to use categories in all modules. The default should be 
 small; but any test manipulating the time needs to be in a specific jvm 
 (hence medium), so it's not always related to minicluster.
 ??15.6.3.6. hbasetests.sh??
 = We can remove this chapter, and the script
  The script is not totally useless, but I think nobody actually uses it.
 = Add a chapter on flakiness.
 Some tests are, unfortunately, flaky. While their number decreases, we still 
 have some. Rules are:
 - don't write flaky tests! :-)
 - small tests cannot be flaky, as that blocks other test execution. Corollary: 
 if you have an issue with a small test, it's either your environment or a 
 severe issue.
 - rerun the test a few times to validate; check the ports and file descriptors 
 used.
 ??mvn test -P localTests -Dtest=MyTest??
 = We could actually activate the localTests profile whenever -Dtest is used. 
 If we do that, we can remove the reference from localTests in the doc.
 ??mvn test -P runSmallTests?? ??mvn test -P runMediumTests??
 = I'm not sure it's actually used. We could remove them from the pom.xml 
 (and the doc).
 ??The HBase build uses a patched version of the maven surefire plugin?? 
 = Hopefully, we will be able to remove this soon :-)
  ??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION??
  = Should be documented

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7344) subprocedure initialization fails with invalid znode data.

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531109#comment-13531109
 ] 

Jonathan Hsieh commented on HBASE-7344:
---

This makes taking a snapshot impossible until the timeout occurs.

 subprocedure initialization fails with invalid znode data.
 --

 Key: HBASE-7344
 URL: https://issues.apache.org/jira/browse/HBASE-7344
 Project: HBase
  Issue Type: Sub-task
Reporter: Jonathan Hsieh

 Sometimes snapshots subprocedures fail to start on RS because data read from 
 ZK is bad.  
 {code}
 2012-12-13 07:22:55,238 ERROR 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Illegal argument 
 exception
 java.lang.IllegalArgumentException: Could not read snapshot information from 
 request.
 at 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
 at 
 org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 2012-12-13 07:22:55,239 ERROR 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs: Failed due to null 
 subprocedure
 Local ForeignThreadException from null
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 Caused by: java.lang.IllegalArgumentException: Could not read snapshot 
 information from request.
 at 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedureBuilder.buildSubprocedure(RegionServerSnapshotManager.java:284)
 at 
 org.apache.hadoop.hbase.procedure.ProcedureMember.createSubprocedure(ProcedureMember.java:98)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:199)
 ... 6 more
 2012-12-13 07:22:55,239 ERROR org.apache.zookeeper.ClientCnxn: Error while 
 calling watcher 
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.sendMemberAborted(ZKProcedureMemberRpcs.java:266)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.startNewSubprocedure(ZKProcedureMemberRpcs.java:203)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.waitForNewProcedures(ZKProcedureMemberRpcs.java:167)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs.access$1(ZKProcedureMemberRpcs.java:150)
 at 
 org.apache.hadoop.hbase.procedure.ZKProcedureMemberRpcs$1.nodeChildrenChanged(ZKProcedureMemberRpcs.java:106)
 at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:303)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
 at 
 org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7345) subprocedure zk info should be dumpable from the shell

2012-12-13 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-7345:
-

 Summary: subprocedure zk info should be dumpable from the shell
 Key: HBASE-7345
 URL: https://issues.apache.org/jira/browse/HBASE-7345
 Project: HBase
  Issue Type: Sub-task
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh


For debugging by admins, we should include the ability to dump subprocedure 
information either as part of the hbase shell's zk_dump or via some new 
command.  It should include all the status of the different procedure portions 
and include timestamp information.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7290) Online snapshots

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531133#comment-13531133
 ] 

Jonathan Hsieh commented on HBASE-7290:
---

When I say 'restore' in step 4, I am specifically referring to a 'snapshot 
restore' operation.  It doesn't necessarily have to happen after a crash.  
Suppose a user creates data, snapshots it, then puts bad data in.  The user 
wants the old data back, so he restores the snapshot (there is no crash 
anywhere here). 

While what you suggest is true (they could restore again), this is unexpected, 
even in our weaker consistency model.  If we have a table, have a crash, and 
come back, we expect the table to be essentially the same as before 
(without potentially tons of new data appearing).



 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh

 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7290) Online snapshots

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531135#comment-13531135
 ] 

Jonathan Hsieh commented on HBASE-7290:
---

Let's call this problem the 'snapshot replay problem', and move discussion to a 
new issue.

 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh

 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (HBASE-7339) Splitting a hfilelink causes region servers to go down.

2012-12-13 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-7339 started by Jonathan Hsieh.

 Splitting a hfilelink causes region servers to go down.
 ---

 Key: HBASE-7339
 URL: https://issues.apache.org/jira/browse/HBASE-7339
 Project: HBase
  Issue Type: Sub-task
  Components: snapshots
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Assignee: Jonathan Hsieh
Priority: Blocker
 Fix For: hbase-6055


 Steps:
 - Have a single region table t with 15 hfiles in it.
 - Snapshot it. (was done using online snapshot from HBASE-7321)
 - Clone a snapshot to table t'. 
 - t' has its region do a post-open task that attempts to compact the region; 
 the policy does not compact all files (the default seems to be 10)
 - after compaction we have hfile links and real hfiles mixed in the region
 - t' starts splitting
 - creating split references; opening daughters fails 
 - hfile links are split, creating hfile link daughter refs.  
 {{hfile\-region\-table.parentregion}}
 - these split hfile links are interpreted as hfile links with table 
 {{table.parentregion}} - 
 {{hfile\-region\-table.parentregion}}  (groupings interpreted 
 incorrectly)
 - Since this is after the splitting PONR, this aborts the server.  It then 
 spreads to the next server.
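
The mis-grouping in the last steps can be sketched with a toy parser. The 
pattern and names below are illustrative only; HBase's actual hfile-link 
naming and regex differ:

```python
import re

# Toy model of an hfile link name: <hfile>-<region>-<table>.
# This is NOT HBase's actual pattern; it only illustrates the grouping bug.
LINK_RE = re.compile(r"^(?P<hfile>[0-9a-f]+)-(?P<region>[0-9a-f]+)-(?P<table>.+)$")

def parse_link(name):
    m = LINK_RE.match(name)
    return (m.group("hfile"), m.group("region"), m.group("table")) if m else None

link = "abc123-deadbeef-mytable"        # an hfile link
split_ref = link + ".parentregion"      # a split reference appends a suffix

# Re-parsing the split reference silently folds the suffix into the table
# component, yielding a link for the nonexistent table "mytable.parentregion":
print(parse_link(split_ref))
```

Because the daughter reference still matches the link pattern, it is treated as 
a valid link to the wrong table, which is roughly the incorrect grouping the 
report describes.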

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7346) Restored snapshot replay problem

2012-12-13 Thread Jonathan Hsieh (JIRA)
Jonathan Hsieh created HBASE-7346:
-

 Summary: Restored snapshot replay problem
 Key: HBASE-7346
 URL: https://issues.apache.org/jira/browse/HBASE-7346
 Project: HBase
  Issue Type: Sub-task
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Priority: Critical
 Fix For: hbase-6055


The situation is a coarse-grained problem. The key problem is that writes that 
shouldn't be replayed (since they don't belong to the restored image) would 
not normally get replayed, but would potentially get replayed if recovery was 
triggered.
Previously, without restore, we could depend on the timestamps – if something 
was replayed but there was newer data, the newer data would win. In a restore 
situation, the newer data has the old timestamps from before recovery, 
and new data that shouldn't get replayed could be.
ex: 
1) write 100 rows
2) ss1 (with logs)
3) write 50 rows
4) restore ss1
5) crash
6) writes from 1 and 3 both get replayed in log splitting recovery. Oops.
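
The scenario can be modeled with a minimal timestamp-wins cell (illustrative 
Python, not HBase code); it shows why replay is normally harmless but 
resurrects post-snapshot writes after a restore:

```python
# Minimal model of a cell where the newest timestamp wins, matching the
# pre-restore reasoning above. Purely illustrative, not HBase code.
class Cell:
    def __init__(self):
        self.ts, self.value = -1, None

    def put(self, ts, value):
        if ts > self.ts:          # newer data wins
            self.ts, self.value = ts, value

cell = Cell()
cell.put(100, "old")              # step 1: write before the snapshot
cell.put(200, "new")              # step 3: write after the snapshot

# Without a restore, replaying the old ts=100 edit is harmless:
cell.put(100, "old")
assert cell.value == "new"

# After restoring the snapshot, the cell is back at its ts=100 state...
restored = Cell()
restored.put(100, "old")
# ...so replaying the ts=200 edit from the old logs resurrects a write
# that does not belong to the restored image:
restored.put(200, "new")
assert restored.value == "new"    # the post-snapshot write came back
```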



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7290) Online snapshots

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531137#comment-13531137
 ] 

Jonathan Hsieh commented on HBASE-7290:
---

Restored snapshot replay problem: HBASE-7346.  I've filed it as part of offline 
snapshots since it affects that case as well.

 Online snapshots 
 -

 Key: HBASE-7290
 URL: https://issues.apache.org/jira/browse/HBASE-7290
 Project: HBase
  Issue Type: Bug
Reporter: Jonathan Hsieh

 HBASE-6055 will be closed when the offline snapshots pieces get merged with 
 trunk.  This umbrella issue has all the online snapshot specific patches.  
 This will get merged once one of the implementations makes it into trunk.  
 Other flavors of online snapshots can then be done as normal patches instead 
 of on a development branch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7346) Restored snapshot replay problem

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531139#comment-13531139
 ] 

Jonathan Hsieh commented on HBASE-7346:
---

It was suggested by [~mbertozzi] that we can probably add a restore 
timestamp somewhere to a restored snapshot and have log splitting not replay 
data from before that timestamp in that case.  Since we require a disable 
before restore, this may avoid ragged restore edges due to having many RSs 
with different timestamps.
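
A hypothetical sketch of that idea: record the restore time and have log 
splitting drop edits written before it. The function and field names are 
invented for illustration and are not HBase APIs:

```python
# Hypothetical filter applied during log splitting after a restore:
# drop any edit whose write time predates the recorded restore timestamp.
def filter_replayable(edits, restore_time):
    return [e for e in edits if e["write_time"] >= restore_time]

edits = [
    {"row": "r1", "write_time": 100},   # written before the restore at t=200
    {"row": "r2", "write_time": 300},   # written after the restore
]
print(filter_replayable(edits, restore_time=200))  # only r2 survives
```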

 Restored snapshot replay problem
 

 Key: HBASE-7346
 URL: https://issues.apache.org/jira/browse/HBASE-7346
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Priority: Critical
 Fix For: hbase-6055


 The situation is a coarse-grained problem. The key problem is that writes 
 that shouldn't be replayed (since they don't belong to the restored image) 
 would not normally get replayed, but would potentially get replayed if 
 recovery was triggered.
 Previously, without restore, we could depend on the timestamps – if something 
 was replayed but there was newer data, the newer data would win. In a restore 
 situation, the newer data has the old timestamps from before recovery, 
 and new data that shouldn't get replayed could be.
 ex: 
 1) write 100 rows
 2) ss1 (with logs)
 3) write 50 rows
 4) restore ss1
 5) crash
 6) writes from 1 and 3 both get replayed in log splitting recovery. Oops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531162#comment-13531162
 ] 

ramkrishna.s.vasudevan commented on HBASE-7342:
---

It tries to find the mid key from the block key indices, right? As the offset 
starts from 0, this should be fine, right?


 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch


 I took a deeper look into the issues I was having with region splitting when 
 specifying a region (but not a key to split on).
 The midkey calculation is off by one: when there are 2 rows, it will pick the 
 0th one. This causes the firstkey to be the same as the midkey, and the split 
 will fail. Removing the -1 makes it work correctly, as per the test I've added.
 Looking into the code here is what goes on:
 1. Split takes the largest storefile
 2. It puts all the keys into a 2-dimensional array called blockKeys[][]. Key 
 i resides as blockKeys[i]
 3. Getting the middle root-level index should yield the key in the middle of 
 the storefile
 4. In step 3, we see that there is a possible erroneous (-1) to adjust for 
 the 0-offset indexing.
 5. In a case where there are only 2 blockKeys, this yields the 0th 
 block key. 
 6. Unfortunately, this is the same block key that 'firstKey' will be.
 7. This yields the result in HStore.java:1873 (cannot split because midkey 
 is the same as first or last row)
 8. Removing the -1 solves the problem (in this case). 
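
The off-by-one can be sketched numerically (illustrative Python, not the 
actual HFile index code): with 2 block keys, subtracting 1 from the midpoint 
lands on index 0, the same key as the first key.

```python
# Sketch of the midpoint computation described in steps 3-5; the real
# root-level index code differs, this only models the arithmetic.
def mid_index_buggy(n_keys):
    return (n_keys - 1) // 2     # the suspect -1 "0-offset" adjustment

def mid_index_fixed(n_keys):
    return n_keys // 2

assert mid_index_buggy(2) == 0   # same as the first key, so the split fails
assert mid_index_fixed(2) == 1   # a genuine middle key
```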

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7119) org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7119:
-

Status: Patch Available  (was: In Progress)

submit patch

 org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test
 -

 Key: HBASE-7119
 URL: https://issues.apache.org/jira/browse/HBASE-7119
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.0, 0.92.0, 0.90.5, 0.90.4
 Environment: RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7119-0.94.patch, HBASE-7119-0.94-v1.patch, 
 HBASE-7119-trunk.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 Running org.apache.hadoop.hbase.io.TestHeapSize
       Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.068 
 sec  FAILURE!
  testNativeSizes(org.apache.hadoop.hbase.io.TestHeapSize)  Time elapsed: 0.01 
 sec   FAILURE!
  junit.framework.AssertionFailedError: expected:64 but was:56
      at junit.framework.Assert.fail(Assert.java:47)
      at junit.framework.Assert.failNotEquals(Assert.java:283)
      at junit.framework.Assert.assertEquals(Assert.java:64)
      at junit.framework.Assert.assertEquals(Assert.java:130)
      at junit.framework.Assert.assertEquals(Assert.java:136)
      at 
 org.apache.hadoop.hbase.io.TestHeapSize.testNativeSizes(TestHeapSize.java:75)
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7116:
-

Attachment: HBASE-7116-0.94.patch

This patch HBASE-7116-0.94.patch is for 0.94 branch.

 org.apache.hadoop.hbase.TestZooKeeper occasionally failed with 
 testClientSessionExpired unit test
 -

 Key: HBASE-7116
 URL: https://issues.apache.org/jira/browse/HBASE-7116
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
 Environment: Linux RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the 
 testClientSessionExpired method as follows:
 -
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0
 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper)
 java.lang.AssertionError:
 at 
 org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114)
 impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate 
 configuration: tried hadoop-metrics2-datanode.properties, 
 hadoop-metrics2.properties
 2012-01-22 01:44:45,764 WARN  [main] util.MBeans(59): 
 Hadoop:service=DataNode,name=MetricsSystem,sub=Control
 javax.management.InstanceAlreadyExistsException: MXBean already registered 
 with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control
 2012-01-22 01:46:07,153 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(336): 
 org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected 
 from ZooKeeper, ignoring
 2012-01-22 01:46:07,157 WARN  
 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] 
 datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, 
 storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, 
 ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException
 Shutting down DataNode 0
 2012-01-22 01:46:08,160 WARN  [main] util.MBeans(73): 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067
 javax.management.InstanceNotFoundException: 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7116:
-

Attachment: HBASE-7116-0.90.patch

The HBASE-7116-0.90.patch is for 0.90 branch.

 org.apache.hadoop.hbase.TestZooKeeper occasionally failed with 
 testClientSessionExpired unit test
 -

 Key: HBASE-7116
 URL: https://issues.apache.org/jira/browse/HBASE-7116
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
 Environment: Linux RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the 
 testClientSessionExpired method as follows:
 -
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0
 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper)
 java.lang.AssertionError:
 at 
 org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114)
 impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate 
 configuration: tried hadoop-metrics2-datanode.properties, 
 hadoop-metrics2.properties
 2012-01-22 01:44:45,764 WARN  [main] util.MBeans(59): 
 Hadoop:service=DataNode,name=MetricsSystem,sub=Control
 javax.management.InstanceAlreadyExistsException: MXBean already registered 
 with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control
 2012-01-22 01:46:07,153 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(336): 
 org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected 
 from ZooKeeper, ignoring
 2012-01-22 01:46:07,157 WARN  
 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] 
 datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, 
 storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, 
 ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException
 Shutting down DataNode 0
 2012-01-22 01:46:08,160 WARN  [main] util.MBeans(73): 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067
 javax.management.InstanceNotFoundException: 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7116:
-

Status: Patch Available  (was: Open)

 org.apache.hadoop.hbase.TestZooKeeper occasionally failed with 
 testClientSessionExpired unit test
 -

 Key: HBASE-7116
 URL: https://issues.apache.org/jira/browse/HBASE-7116
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.90.5
 Environment: Linux RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the 
 testClientSessionExpired method as follows:
 -
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0
 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper)
 java.lang.AssertionError:
 at 
 org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114)
 impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate 
 configuration: tried hadoop-metrics2-datanode.properties, 
 hadoop-metrics2.properties
 2012-01-22 01:44:45,764 WARN  [main] util.MBeans(59): 
 Hadoop:service=DataNode,name=MetricsSystem,sub=Control
 javax.management.InstanceAlreadyExistsException: MXBean already registered 
 with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control
 2012-01-22 01:46:07,153 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(336): 
 org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected 
 from ZooKeeper, ignoring
 2012-01-22 01:46:07,157 WARN  
 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] 
 datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, 
 storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, 
 ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException
 Shutting down DataNode 0
 2012-01-22 01:46:08,160 WARN  [main] util.MBeans(73): 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067
 javax.management.InstanceNotFoundException: 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7119) org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531183#comment-13531183
 ] 

Hadoop QA commented on HBASE-7119:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12555307/HBASE-7119-0.94-v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3528//console

This message is automatically generated.

 org.apache.hadoop.hbase.io.TestHeapSize failed with testNativeSizes unit test
 -

 Key: HBASE-7119
 URL: https://issues.apache.org/jira/browse/HBASE-7119
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.90.4, 0.90.5, 0.92.0, 0.94.0
 Environment: RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7119-0.94.patch, HBASE-7119-0.94-v1.patch, 
 HBASE-7119-trunk.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 Running org.apache.hadoop.hbase.io.TestHeapSize
       Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.068 
 sec  FAILURE!
  testNativeSizes(org.apache.hadoop.hbase.io.TestHeapSize)  Time elapsed: 0.01 
 sec   FAILURE!
  junit.framework.AssertionFailedError: expected:64 but was:56
      at junit.framework.Assert.fail(Assert.java:47)
      at junit.framework.Assert.failNotEquals(Assert.java:283)
      at junit.framework.Assert.assertEquals(Assert.java:64)
      at junit.framework.Assert.assertEquals(Assert.java:130)
      at junit.framework.Assert.assertEquals(Assert.java:136)
      at 
 org.apache.hadoop.hbase.io.TestHeapSize.testNativeSizes(TestHeapSize.java:75)
  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7118:
-

Attachment: HBASE-7118-0.94.0.patch

Thanks Ramkrishna.s. I attached HBASE-7118-0.94.0.patch specifically for 
release 0.94.0.

 org.apache.hadoop.hbase.replication.TestReplicationPeer failed with 
 testResetZooKeeperSession unit test
 ---

 Key: HBASE-7118
 URL: https://issues.apache.org/jira/browse/HBASE-7118
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0, 0.94.0
 Environment: RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7118-0.94.0.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 org.apache.hadoop.hbase.replication.TestReplicationPeer
 Running org.apache.hadoop.hbase.replication.TestReplicationPeer
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 25.89 sec  
 FAILURE!
   --- stable failures, new for hbase 0.92.0, need to be fixed first.
   
  
 target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationPeer.txt
  output:
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 28.245 sec 
  FAILURE!
 testResetZooKeeperSession(org.apache.hadoop.hbase.replication.TestReplicationPeer)
   Time elapsed: 25.247 sec   FAILURE!
 junit.framework.AssertionFailedError: ReplicationPeer ZooKeeper session was 
 not properly expired.
 at junit.framework.Assert.fail(Assert.java:50)
 at 
 org.apache.hadoop.hbase.replication.TestReplicationPeer.testResetZooKeeperSession(TestReplicationPeer.java:73)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at 
 org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
 at 
 org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
 at 
 org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
 at 
 org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
 at 
 org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
 
 target/surefire-reports/org.apache.hadoop.hbase.replication.TestReplicationPeer-output.txt
  content:

 2012-03-25 20:52:42,979 INFO  [main] zookeeper.MiniZooKeeperCluster(174): 
 Started MiniZK Cluster and connect 1 ZK server on client port: 21818
 2012-03-25 20:52:43,023 DEBUG [main] zookeeper.ZKUtil(96): connection to 
 cluster: clusterId opening connection to ZooKeeper with ensemble 
 (localhost:21818)
 2012-03-25 20:52:43,082 INFO  [main] zookeeper.RecoverableZooKeeper(89): The 
 identifier of this process is 4...@svltest116.svl.ibm.com
 2012-03-25 20:52:43,166 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(257): connection to cluster: clusterId Received 
 ZooKeeper Event, type=None, state=SyncConnected, path=null
 2012-03-25 20:52:43,175 INFO  [Thread-9] replication.TestReplicationPeer(53): 
 Expiring ReplicationPeer ZooKeeper session.
 2012-03-25 20:52:43,196 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(334): connection to cluster: 
 clusterId-0x1364d226a3d connected
 2012-03-25 20:52:43,308 INFO  [Thread-9] hbase.HBaseTestingUtility(1234): ZK 
 Closed Session 0x1364d226a3d; sleeping=25000
 2012-03-25 20:53:08,323 INFO  [Thread-9] replication.TestReplicationPeer(57): 
 Attempting to use expired ReplicationPeer ZooKeeper session.

--


[jira] [Updated] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test

2012-12-13 Thread Li Ping Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Ping Zhang updated HBASE-7118:
-

Status: Patch Available  (was: Open)


--


[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

Attachment: 7340-v1.txt

Patch v1.
TestMasterObserver passes.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs a compaction after a region is moved (by the 
 balancer). We should provide a 'hook' that lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on user mailing list under the thread 'How to know it's time 
 for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
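
 A minimal sketch of the kind of hook being proposed, assuming a 
 listener-style callback fired by the master after a move. The interface and 
 class names below are hypothetical illustrations, not the API added by the 
 attached patch:

{code}
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch only: names are illustrative, not the actual
// coprocessor API introduced by the patch on this issue.
interface RegionMoveListener {
    void postMove(String regionName, String srcServer, String destServer);
}

class RegionMoveNotifier {
    private final List<RegionMoveListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(RegionMoveListener listener) {
        listeners.add(listener);
    }

    // The master would call this after a region has been moved,
    // e.g. by the balancer.
    void fireRegionMoved(String regionName, String srcServer, String destServer) {
        for (RegionMoveListener listener : listeners) {
            listener.postMove(regionName, srcServer, destServer);
        }
    }
}

class RegionMoveHookDemo {
    public static void main(String[] args) {
        RegionMoveNotifier notifier = new RegionMoveNotifier();
        // A user-supplied follow-on action: request a compaction of the
        // moved region.
        notifier.addListener((region, src, dest) ->
                System.out.println("compaction requested for " + region));
        notifier.fireRegionMoved("demo-region", "serverA", "serverB");
    }
}
{code}

 In HBase itself this would more likely surface as a master coprocessor hook; 
 the sketch only illustrates the callback shape.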

--


[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

  Assignee: Ted Yu
Issue Type: Task  (was: Bug)

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt



--


[jira] [Updated] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

Status: Patch Available  (was: Open)


--


[jira] [Commented] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531196#comment-13531196
 ] 

Hadoop QA commented on HBASE-7118:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12560807/HBASE-7118-0.94.0.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3529//console

This message is automatically generated.


--


[jira] [Commented] (HBASE-7116) org.apache.hadoop.hbase.TestZooKeeper occasionally failed with testClientSessionExpired unit test

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531197#comment-13531197
 ] 

Hadoop QA commented on HBASE-7116:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560805/HBASE-7116-0.90.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3530//console

This message is automatically generated.

 org.apache.hadoop.hbase.TestZooKeeper occasionally failed with 
 testClientSessionExpired unit test
 -

 Key: HBASE-7116
 URL: https://issues.apache.org/jira/browse/HBASE-7116
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5, 0.92.0
 Environment: Linux RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7116-0.90.patch, HBASE-7116-0.94.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 org.apache.hadoop.hbase.TestZooKeeper unit tests failed in the 
 testClientSessionExpired method as follows:
 -
 Tests run: 5, Failures: 1, Errors: 0, Skipped: 0
 testClientSessionExpired(org.apache.hadoop.hbase.TestZooKeeper)
 java.lang.AssertionError:
 at 
 org.apache.hadoop.hbase.TestZooKeeper.testClientSessionExpired(TestZooKeeper.java:114)
 impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate 
 configuration: tried hadoop-metrics2-datanode.properties, 
 hadoop-metrics2.properties
 2012-01-22 01:44:45,764 WARN  [main] util.MBeans(59): 
 Hadoop:service=DataNode,name=MetricsSystem,sub=Control
 javax.management.InstanceAlreadyExistsException: MXBean already registered 
 with name Hadoop:service=NameNode,name=MetricsSystem,sub=Control
 2012-01-22 01:46:07,153 DEBUG [main-EventThread] 
 zookeeper.ZooKeeperWatcher(336): 
 org.apache.hadoop.hbase.TestZooKeeper-0x135015e9f9e000d Received Disconnected 
 from ZooKeeper, ignoring
 2012-01-22 01:46:07,157 WARN  
 [org.apache.hadoop.hdfs.server.datanode.DataXceiverServer@55565556] 
 datanode.DataXceiverServer(138): DatanodeRegistration(127.0.0.1:41145, 
 storageID=DS-1858213200-9.123.196.159-41145-1327167899969, infoPort=36485, 
 ipcPort=55138):DataXceiveServer:java.nio.channels.AsynchronousCloseException
 Shutting down DataNode 0
 2012-01-22 01:46:08,160 WARN  [main] util.MBeans(73): 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067
 javax.management.InstanceNotFoundException: 
 Hadoop:service=DataNode,name=FSDatasetState-UndefinedStorageId-1051654067

--


[jira] [Commented] (HBASE-7120) hbase-daemon.sh (start) missing necessary check when writing pid and log files

2012-12-13 Thread Li Ping Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531212#comment-13531212
 ] 

Li Ping Zhang commented on HBASE-7120:
--

Stack, agreed. I will modify it and regenerate the patches per your 
suggestion.

 hbase-daemon.sh (start) missing necessary check when writing pid and log files
 --

 Key: HBASE-7120
 URL: https://issues.apache.org/jira/browse/HBASE-7120
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
 Environment: RHEL 5.3, open JDK 1.6
Reporter: Li Ping Zhang
Assignee: Li Ping Zhang
  Labels: patch
 Attachments: HBASE-7120-0.94.patch, HBASE-7120-trunk.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 $HBASE_HOME/bin/hbase-daemon.sh exits with code zero even when 
 'hbase-daemon.sh start' fails, because it does not check the exit code of 
 the command it runs. It should perform the necessary checks when writing 
 the pid and log files. 

--


[jira] [Commented] (HBASE-7211) Improve hbase ref guide for the testing part.

2012-12-13 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531228#comment-13531228
 ] 

Jeffrey Zhong commented on HBASE-7211:
--

Thanks Nicolas!






 Improve hbase ref guide for the testing part.
 -

 Key: HBASE-7211
 URL: https://issues.apache.org/jira/browse/HBASE-7211
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: hbase-7211-partial.patch


 Here is some stuff I saw. I will propose a fix in a week or so, please add 
 the comment or issues you have in mind.
 ??15.6.1. Apache HBase Modules??
 = We should be able to use categories in all modules. The default should be 
 small; but any test manipulating time needs to be in a specific JVM 
 (hence medium), so it's not always related to the minicluster.
 ??15.6.3.6. hbasetests.sh??
 = We can remove this chapter, and the script.
  The script is not totally useless, but I think nobody actually uses it.
 = Add a chapter on flakiness.
 Some tests are, unfortunately, flaky. While their number decreases, we still 
 have some. Rules are:
 - don't write flaky tests! :-)
 - small tests cannot be flaky, as flakiness blocks other test execution. 
 Corollary: if you have an issue with a small test, it's either your 
 environment or a severe issue.
 - rerun the test a few times to validate, and check the ports and file 
 descriptors used.
 ??mvn test -P localTests -Dtest=MyTest??
 = We could actually activate the localTests profile whenever -Dtest is used. 
 If we do that, we can remove the reference to localTests from the doc.
 ??mvn test -P runSmallTests?? ??mvn test -P runMediumTests??
 = I'm not sure they're actually used. We could remove them from the pom.xml 
 (and the doc).
 ??The HBase build uses a patched version of the maven surefire plugin?? 
 = Hopefully, we will be able to remove this soon :-)
  ??Integration tests are described TODO: POINTER_TO_INTEGRATION_TEST_SECTION??
  = Should be documented

--


[jira] [Updated] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange

2012-12-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7338:
---

Assignee: Himanshu Vashishtha

 Fix flaky condition for 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
 -

 Key: HBASE-7338
 URL: https://issues.apache.org/jira/browse/HBASE-7338
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3, 0.96.0
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Attachments: HBASE-7338.patch


 The balancer doesn't run while a region is in transition. The check that 
 confirms whether all regions are assigned looks for a region count > 22, 
 while the total number of regions is 27. This may result in a failure:
 {code}
 java.lang.AssertionError: After 5 attempts, region assignments were not 
 balanced.
   at org.junit.Assert.fail(Assert.java:93)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123)
 .
 2012-12-11 13:47:02,231 INFO  [pool-1-thread-1] 
 hbase.TestRegionRebalancing(120): Added fourth 
 server=p0118.mtv.cloudera.com,44414,1355262422083
 2012-12-11 13:47:02,231 INFO  
 [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] 
 regionserver.HRegionServer(3769): Registered RegionServer MXBean
 2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d.
  state=OPENING, ts=1355262421037, 
 server=p0118.mtv.cloudera.com,54281,1355262419765}
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. Load 
 Average: 13.0 low border: 9, up border: 16; attempt: 0
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 
 Avg: 13.0 actual: 11
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 
 Avg: 13.0 actual: 15
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 
 Avg: 13.0 actual: 0
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 
 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2
 2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {code}
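
 The flakiness above comes from asserting balance while a region is still in 
 transition. A common remedy is to poll for full assignment with a bounded 
 timeout before asserting; the helper below is a sketch of that idea, not the 
 attached patch, and its names are illustrative:

{code}
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Sketch: poll a condition (e.g. "all 27 regions are assigned") until it
// holds or a timeout expires, instead of asserting immediately.
class AssignmentWait {
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (condition.getAsBoolean()) {
                return true;      // condition met, safe to assert balance
            }
            TimeUnit.MILLISECONDS.sleep(pollMs);
        }
        return false;             // timed out; caller should fail the test
    }

    public static void main(String[] args) throws InterruptedException {
        final int expectedRegions = 27;
        final int[] assigned = {22};
        // Simulate assignment progressing a little each time we poll.
        BooleanSupplier allAssigned = () -> ++assigned[0] >= expectedRegions;
        System.out.println(waitFor(allAssigned, 5_000, 10));
    }
}
{code}

 A timeout on the test itself, as suggested in the comments, guards the case 
 where the region never leaves the transition state.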

--


[jira] [Commented] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange

2012-12-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531241#comment-13531241
 ] 

Jimmy Xiang commented on HBASE-7338:


+1. Looks good. On commit, we need to add a timeout for the test in case a 
region gets stuck in transition.


--


[jira] [Commented] (HBASE-6651) Improve thread safety of HTablePool

2012-12-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531242#comment-13531242
 ] 

Lars Hofhansl commented on HBASE-6651:
--

-1 on keeping it. It seems this needs an official vote.

 Improve thread safety of HTablePool
 ---

 Key: HBASE-6651
 URL: https://issues.apache.org/jira/browse/HBASE-6651
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.94.1
Reporter: Hiroshi Ikeda
Assignee: Hiroshi Ikeda
 Fix For: 0.96.0

 Attachments: HBASE-6651.patch, HBASE-6651-V10.patch, 
 HBASE-6651-V2.patch, HBASE-6651-V3.patch, HBASE-6651-V4.patch, 
 HBASE-6651-V5.patch, HBASE-6651-V6.patch, HBASE-6651-V7.patch, 
 HBASE-6651-V8.patch, HBASE-6651-V9.patch, sample.zip, sample.zip, 
 sharedmap_for_hbaseclient.zip


 There are some operations in HTablePool accessing PoolMap in multiple places 
 without any explicit synchronization. 
 For example, HTablePool.closeTablePool() calls PoolMap.values(), and then 
 calls PoolMap.remove(). If other threads add new instances to the pool in 
 the middle of the calls, the newly added instances might be dropped. 
 (HTablePool.closeTablePool() also has another problem: calling it from 
 multiple threads causes HTable to be accessed by multiple threads.)
 Moreover, PoolMap is not thread safe for the same reason.
 For example, PoolMap.put() calls ConcurrentMap.get() and then calls 
 ConcurrentMap.put(). If other threads add a new instance to the concurrent 
 map in the middle of the calls, the new instance might be dropped.
 The implementations of Pool have the same problems.
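
 The get-then-put race described above can be sketched on a generic pool map. 
 This is an illustration, not HTablePool's actual code; also note that on the 
 Java 6 runtimes of that era the atomic alternative would be 
 ConcurrentMap.putIfAbsent rather than the computeIfAbsent shown here:

{code}
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch of the check-then-act race on a generic pool map.
class PoolMapSketch {
    private final ConcurrentMap<String, Queue<String>> pools =
            new ConcurrentHashMap<>();

    // Racy: between get() and put(), another thread may insert a queue for
    // the same key; put() then overwrites it, dropping any pooled instances
    // the other thread's queue already held.
    Queue<String> getPoolRacy(String key) {
        Queue<String> pool = pools.get(key);
        if (pool == null) {
            pool = new ConcurrentLinkedQueue<>();
            pools.put(key, pool);
        }
        return pool;
    }

    // Safe: computeIfAbsent performs the check and the insert atomically, so
    // every thread asking for the same key observes the same queue.
    Queue<String> getPoolSafe(String key) {
        return pools.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>());
    }
}
{code}

 Each individual ConcurrentMap call is thread safe; it is the unsynchronized 
 composition of two calls that loses updates.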

--


[jira] [Commented] (HBASE-7118) org.apache.hadoop.hbase.replication.TestReplicationPeer failed with testResetZooKeeperSession unit test

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531244#comment-13531244
 ] 

Ted Yu commented on HBASE-7118:
---

@Liping:
Can you tell us which JVM you were using?
Looking at recent Jenkins builds for 0.94, I don't see this test failing.

Thanks


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531247#comment-13531247
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

Hi Ramkrishna,

The logic for the change is as follows:

With the existing implementation (using -1), when there are two items in the 
array, it returns the 0th item ( (2 - 1) / 2 = 0 ), which is equal to the 
index of the firstKey. This is a problem during splits because a split is 
invalid if the midkey equals the firstKey. What we really want here is index 
1: the lastKey is going to be the first key of the next block, so there won't 
be a collision with it, and the midkey will truly represent the midpoint of 
first and last.
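The arithmetic above can be sketched directly. This is an illustration only, with made-up method names, not the actual HBase code:

```java
// Illustration of the midkey index arithmetic discussed above (method names
// are made up; this is not the actual HBase code). With two block keys,
// (n - 1) / 2 selects index 0, which collides with the firstKey, while
// n / 2 selects index 1.
public class MidKeyIndexDemo {
    // Index chosen by the existing implementation (with the -1).
    static int midIndexWithMinusOne(int numBlockKeys) {
        return (numBlockKeys - 1) / 2;
    }

    // Index chosen once the -1 is removed.
    static int midIndexWithoutMinusOne(int numBlockKeys) {
        return numBlockKeys / 2;
    }

    public static void main(String[] args) {
        System.out.println(midIndexWithMinusOne(2));    // 0 == index of firstKey
        System.out.println(midIndexWithoutMinusOne(2)); // 1
    }
}
```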

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch


 I took a deeper look into issues I was having with region splitting when 
 specifying a region (but not a key for splitting).
 The midkey calculation is off by one: when there are 2 rows, it will pick 
 the 0th one. This causes the firstKey to be the same as the midkey, and the 
 split will fail. Removing the -1 causes it to work correctly, as per the 
 test I've added.
 Looking into the code, here is what goes on:
 1. Split takes the largest storefile.
 2. It puts all the keys into a 2-dimensional array called blockKeys[][]; key 
 i resides as blockKeys[i].
 3. Getting the middle root-level index should yield the key in the middle of 
 the storefile.
 4. In step 3, we see a possibly erroneous (-1) to adjust for the 0-offset 
 indexing.
 5. In a case where there are only 2 blockKeys, this yields the 0th block 
 key.
 6. Unfortunately, this is the same block key that 'firstKey' will be.
 7. This yields the result in HStore.java:1873 (cannot split because midkey 
 is the same as first or last row).
 8. Removing the -1 solves the problem (in this case).



[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer

2012-12-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531246#comment-13531246
 ] 

Jimmy Xiang commented on HBASE-7343:


You need to rebase your branch and post a new patch so that hadoopqa can take 
it.  Can you make sure all tests in this class have proper timeouts?

+1. Looks good to me. Will commit if hadoopqa looks good.

 Fix flaky condition for TestDrainingServer
 --

 Key: HBASE-7343
 URL: https://issues.apache.org/jira/browse/HBASE-7343
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Attachments: HBASE-7343.patch


 The assert statement in setUpBeforeClass() may fail when the region 
 distribution is not even (a particular region server has 0 regions).
 {code}
 junit.framework.AssertionFailedError
   at junit.framework.Assert.fail(Assert.java:48)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertFalse(Assert.java:34)
   at junit.framework.Assert.assertFalse(Assert.java:41)
   at 
 org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83)
 {code}
 This is already fixed in trunk with HBASE-5992, but as that's a bigger 
 change and depends on HBASE-5877, this jira fixes the issue directly 
 instead of backporting HBASE-5992.
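A one-shot assert on region distribution is inherently racy; the usual cure is to poll until the balancer has given every server a region. A minimal sketch follows, using a stand-in `RegionCounts` interface for real cluster state (this is not the attached patch):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: instead of asserting immediately that every region
// server holds at least one region, poll until the distribution settles or
// a timeout elapses. RegionCounts is a stand-in for real cluster state;
// this is not the attached patch.
public class EvenDistributionWait {
    interface RegionCounts {
        int[] regionsPerServer();
    }

    static boolean waitForNonEmptyServers(RegionCounts cluster, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            boolean anyEmpty = false;
            for (int count : cluster.regionsPerServer()) {
                if (count == 0) {
                    anyEmpty = true;
                    break;
                }
            }
            if (!anyEmpty) {
                return true; // every region server has at least one region
            }
            TimeUnit.MILLISECONDS.sleep(100); // give the balancer time to act
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        RegionCounts settled = () -> new int[] {2, 3, 1};
        System.out.println(waitForNonEmptyServers(settled, 1000)); // true
    }
}
```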



[jira] [Created] (HBASE-7347) Allow multiple readers per storefile

2012-12-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-7347:


 Summary: Allow multiple readers per storefile
 Key: HBASE-7347
 URL: https://issues.apache.org/jira/browse/HBASE-7347
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


Currently each store file is read through a single reader, regardless of how 
many concurrent read requests access that file.

This issue is to explore alternate designs.



[jira] [Updated] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads

2012-12-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7336:
-

Issue Type: Sub-task  (was: Bug)
Parent: HBASE-7347

 HFileBlock.readAtOffset does not work well with multiple threads
 

 Key: HBASE-7336
 URL: https://issues.apache.org/jira/browse/HBASE-7336
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7336-0.94.txt, 7336-0.96.txt


 HBase grinds to a halt when many threads scan along the same set of blocks 
 and neither read short-circuit nor block caching is enabled for the dfs 
 client ... disabling the block cache makes sense on very large scans.
 It turns out that synchronizing on istream in HFileBlock.readAtOffset is 
 the culprit.
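The contention pattern described above can be shown in miniature. Names here are illustrative only, not the HBase code: all readers funnel through one synchronized stream and serialize on its lock, while a per-caller stream needs no shared lock at all:

```java
import java.io.ByteArrayInputStream;

// Miniature of the contention pattern described above (illustrative names,
// not the HBase code): all readers funnel through one synchronized stream,
// so concurrent scans serialize on its lock; a per-caller stream does not.
public class SharedStreamDemo {
    private final byte[] file; // stand-in for the on-disk HFile bytes
    private final ByteArrayInputStream sharedStream;

    SharedStreamDemo(byte[] file) {
        this.file = file;
        this.sharedStream = new ByteArrayInputStream(file);
    }

    // Today's pattern: every concurrent reader waits on this one lock.
    byte readAtOffsetShared(int offset) {
        synchronized (sharedStream) {
            return file[offset];
        }
    }

    // Alternative: each caller positions its own stream; no shared lock.
    byte readAtOffsetPerCaller(int offset) {
        ByteArrayInputStream in = new ByteArrayInputStream(file);
        in.skip(offset);
        return (byte) in.read();
    }
}
```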



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531254#comment-13531254
 ] 

Ted Yu commented on HBASE-7342:
---

Since the fix is for array size being 2, maybe add a check for this case and 
don't subtract 1. Otherwise keep the current logic.
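That suggestion could look roughly like this. It is a sketch only, with a made-up method name, not the committed fix:

```java
// Sketch of the suggestion above (made-up method name, not the committed
// fix): special-case an index with exactly two block keys so the midkey
// lands on index 1, and keep the existing (n - 1) / 2 logic otherwise.
public class MidKeySpecialCase {
    static int midIndex(int numBlockKeys) {
        if (numBlockKeys == 2) {
            return 1; // avoid returning the firstKey's index
        }
        return (numBlockKeys - 1) / 2; // current logic, unchanged
    }
}
```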

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch




[jira] [Updated] (HBASE-7343) Fix flaky condition for TestDrainingServer

2012-12-13 Thread Himanshu Vashishtha (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Himanshu Vashishtha updated HBASE-7343:
---

Fix Version/s: 0.94.4

 Fix flaky condition for TestDrainingServer
 --

 Key: HBASE-7343
 URL: https://issues.apache.org/jira/browse/HBASE-7343
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7343.patch




[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer

2012-12-13 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531264#comment-13531264
 ] 

Himanshu Vashishtha commented on HBASE-7343:


This is only for 0.94.4, as trunk is already taken care of by HBASE-5992.

 Fix flaky condition for TestDrainingServer
 --

 Key: HBASE-7343
 URL: https://issues.apache.org/jira/browse/HBASE-7343
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7343.patch




[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531265#comment-13531265
 ] 

stack commented on HBASE-7336:
--

+1 on committing this for now.

On Reader per long-running scanner, it will complicate the swapping in of new 
files on compaction, but it's probably worth figuring out.  Let's file issues 
for further improvement.

Good stuff Lars.

 HFileBlock.readAtOffset does not work well with multiple threads
 

 Key: HBASE-7336
 URL: https://issues.apache.org/jira/browse/HBASE-7336
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7336-0.94.txt, 7336-0.96.txt




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531272#comment-13531272
 ] 

Ted Yu commented on HBASE-7342:
---

I extracted testBasicSplit from patch v2 and it passed.
{code}
Running org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
2012-12-13 10:17:06.866 java[69043:1903] Unable to load realm mapping info from 
SCDynamicStore
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.711 sec
{code}
@Aleksandr:
Can you refine your test case to show us the problem?

Thanks

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch




[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531275#comment-13531275
 ] 

stack commented on HBASE-7342:
--

+1 on fix.  I'd not add the test.  It's overkill running a cluster to test an 
array math problem.

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch




[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531282#comment-13531282
 ] 

Hadoop QA commented on HBASE-7340:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560808/7340-v1.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
104 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestMultiParallel

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3531//console

This message is automatically generated.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See the discussion on the user mailing list under the thread 'How to know 
 it's time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
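One shape such a hook could take is sketched below. The names are purely illustrative, not a proposed API: users register actions, and the server fires them once a region has been opened on its new host, e.g. to request a compaction:

```java
import java.util.ArrayList;
import java.util.List;

// Purely illustrative sketch of the 'hook' idea (names are made up, not a
// proposed API): users register actions, and the server fires them once a
// region has been opened on its new host.
public class RegionMoveHooks {
    interface PostMoveAction {
        void onRegionMoved(String region, String fromServer, String toServer);
    }

    private final List<PostMoveAction> actions = new ArrayList<>();

    void register(PostMoveAction action) {
        actions.add(action);
    }

    // Called by the balancer/master after the move completes.
    void fireMoved(String region, String from, String to) {
        for (PostMoveAction a : actions) {
            a.onRegionMoved(region, from, to);
        }
    }
}
```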



[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531284#comment-13531284
 ] 

stack commented on HBASE-7342:
--

[~ted_yu] The problem is plain.  Look at it.  Imagine an array w/ two elements 
in it only.

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch




[jira] [Commented] (HBASE-7317) server-side request problems are hard to debug

2012-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531286#comment-13531286
 ] 

Sergey Shelukhin commented on HBASE-7317:
-

How do you define progress on tracing? Improvements to the library, or 
adoption within the HBase codebase?
Also, w.r.t. Apache, as far as I can see, going into the commons sandbox 
doesn't require much of a separate project.

 server-side request problems are hard to debug
 --

 Key: HBASE-7317
 URL: https://issues.apache.org/jira/browse/HBASE-7317
 Project: HBase
  Issue Type: Brainstorming
  Components: IPC/RPC, regionserver
Reporter: Sergey Shelukhin
Priority: Minor

 I've seen cases during integration tests where a write or read request took 
 an unexpectedly long time (after the client went to a region server that is 
 reported alive and well, which I know from temporary debug logging :)), and 
 it's impossible to understand what is going on on the server side, short of 
 catching the moment with jstack.
 Some solutions (off by default) could be:
 - a facility for tests (especially integration tests) that would trace 
 Server/Master calls into some log or file (it won't help with internals, 
 but at least one could see what was actually received);
 - logging the progress of requests between components inside master/server 
 (e.g. request id=N received, request id=N is being processed in MyClass, 
 with N drawn on the client from a local sequence - no guarantees of 
 uniqueness are necessary).
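The second suggestion above needs only a client-local counter; a minimal sketch, with illustrative names:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the second suggestion above: tag each request with an
// id drawn from a client-local sequence so its progress can be correlated
// in server-side logs. No cross-client uniqueness is needed, matching the
// text. Names are illustrative only.
public class RequestIdDemo {
    private static final AtomicLong SEQ = new AtomicLong();

    static long nextRequestId() {
        return SEQ.incrementAndGet(); // per-client local sequence
    }

    public static void main(String[] args) {
        long id = nextRequestId();
        // A server component would log the id at each processing stage:
        System.out.println("request id=" + id + " received");
        System.out.println("request id=" + id + " is being processed in MyClass");
    }
}
```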



[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6466:


Component/s: regionserver

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch


 If the KVs are large, or the HLog is closed under high put pressure, we 
 found the memstore is often above the high water mark and blocks the puts.
 So should we enable multiple threads for memstore flush?
 Some performance test data for reference:
 1. Test environment: random writing; upper memstore limit 5.6GB; lower 
 memstore limit 4.8GB; 400 regions per regionserver; row len=50 bytes, value 
 len=1024 bytes; 5 regionservers, 300 ipc handlers per regionserver; 5 
 clients, 50 thread handlers per client for writing.
 2. Test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver; many aboveGlobalMemstoreLimit blocking events appear
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client with two cacheFlush handlers: tps 16.1k/s 
 per regionserver, flush 18.6MB/s per regionserver
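The idea being benchmarked above amounts to replacing a single flush handler with a small pool so several memstore flushes can run concurrently. A hypothetical sketch, not the attached patch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the idea being benchmarked above: a small pool of
// flush handlers instead of a single one, so several memstore flushes can
// run concurrently. This is an illustration, not the attached patch.
public class FlushPoolDemo {
    static final AtomicInteger flushed = new AtomicInteger();

    static void flushRegion(int regionId) {
        // Real code would write the region's memstore out as an HFile.
        flushed.incrementAndGet();
    }

    static int flushAll(int numRegions, int handlers) throws InterruptedException {
        ExecutorService flushPool = Executors.newFixedThreadPool(handlers);
        for (int region = 0; region < numRegions; region++) {
            final int id = region;
            flushPool.submit(() -> flushRegion(id));
        }
        flushPool.shutdown();
        flushPool.awaitTermination(10, TimeUnit.SECONDS);
        return flushed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // "two cacheFlush handlers" from the test data above
        System.out.println(flushAll(8, 2)); // 8
    }
}
```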



[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6466:


Affects Version/s: 0.96.0

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch




[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6466:


Fix Version/s: 0.96.0

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch




[jira] [Commented] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531292#comment-13531292
 ] 

stack commented on HBASE-6466:
--

Let's get some more review on this patch before it goes in.  It's no fun 
debugging intermittent hung flushing or closing on a distributed cluster.

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch


 If the KVs are large, or the HLog is closed under high write pressure, we 
 found the memstore is often above the high water mark, which blocks the puts.
 So should we enable multiple threads for memstore flush?
 Some performance test data for reference:
 1. Test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread 
 handlers per client for writing
 2. Test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, many aboveGlobalMemstoreLimit blocking events appear
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver

--


[jira] [Updated] (HBASE-6466) Enable multi-thread for memstore flush

2012-12-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6466:
-

Priority: Critical  (was: Major)

Marking critical so it does not slip through the cracks.  It is patch available 
and just in need of additional review.

 Enable multi-thread for memstore flush
 --

 Key: HBASE-6466
 URL: https://issues.apache.org/jira/browse/HBASE-6466
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.96.0
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-6466.patch, HBASE-6466v2.patch, 
 HBASE-6466v3.1.patch, HBASE-6466v3.patch, HBASE-6466-v4.patch


 If the KVs are large, or the HLog is closed under high write pressure, we 
 found the memstore is often above the high water mark, which blocks the puts.
 So should we enable multiple threads for memstore flush?
 Some performance test data for reference:
 1. Test environment: 
 random writing; upper memstore limit 5.6GB; lower memstore limit 4.8GB; 400 
 regions per regionserver; row len=50 bytes, value len=1024 bytes; 5 
 regionservers, 300 ipc handlers per regionserver; 5 clients, 50 thread 
 handlers per client for writing
 2. Test results:
 one cacheFlush handler: tps 7.8k/s per regionserver, flush 10.1MB/s per 
 regionserver, many aboveGlobalMemstoreLimit blocking events appear
 two cacheFlush handlers: tps 10.7k/s per regionserver, flush 12.46MB/s per 
 regionserver
 200 thread handlers per client and two cacheFlush handlers: tps 16.1k/s per 
 regionserver, flush 18.6MB/s per regionserver

--


[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531296#comment-13531296
 ] 

Nick Dimiduk commented on HBASE-7340:
-

Is this a case for shipping some pre-made coprocessors, such as an 
AutoCompactionObserver?

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

--


[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531298#comment-13531298
 ] 

Ted Yu commented on HBASE-7340:
---

Test failure in TestMultiParallel is unrelated to the patch.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

--


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread Aleksandr Shulman (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531300#comment-13531300
 ] 

Aleksandr Shulman commented on HBASE-7342:
--

I'll go back and refine the test case. We can add it later. The hard part is 
getting blockKeys to be a size-2 array. Any suggestions?

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch


 I took a deeper look into issues I was having using region splitting when 
 specifying a region (but not a key for splitting).
 The midkey calculation is off by one: when there are 2 rows, it will pick 
 the 0th one. This causes the firstkey to be the same as the midkey, and the 
 split will fail. Removing the -1 causes it to work correctly, as per the 
 test I've added.
 Looking into the code, here is what goes on:
 1. Split takes the largest storefile.
 2. It puts all the keys into a 2-dimensional array called blockKeys[][]; 
 key i resides as blockKeys[i].
 3. Getting the middle root-level index should yield the key in the middle 
 of the storefile.
 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for 
 the 0-offset indexing.
 5. In a case where there are only 2 blockKeys, this yields the 0th block 
 key. 
 6. Unfortunately, this is the same block key that 'firstKey' will be.
 7. This yields the result in HStore.java:1873 (cannot split because the 
 midkey is the same as the first or last row).
 8. Removing the -1 solves the problem (in this case). 

--


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2012-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531302#comment-13531302
 ] 

Sergey Shelukhin commented on HBASE-7268:
-

bq. How would you use it Sergey? Would it be ok if on cluster restart, the 
sequence restarted at zero?
The single-source, increasing timer would be useful for any coordination 
tasks... e.g. you'd always know which events happened earlier, across master 
restarts/etc. It should only reset when there's a singularity e.g. if you wipe 
the cluster.
It's overkill to do it just for this issue though...
I think I saw it discussed somewhere, maybe in a JIRA related to snapshots.

bq. Yes. But we can't have client register to get callbacks when regions moves. 
What you thinking?
First sleep then get location? :)

 correct local region location cache information can be overwritten w/stale 
 information from an old server
 -

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 The test is writing from multiple threads.
 Server A has region R; the client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by the master to server C.
 ~15 seconds later, a client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never 
 happened (neither in nor before the sequence described above). Not quite 
 sure how the client learned of the transition to C; I assume it's from meta 
 via some other thread...
 Then, the put fails (it may fail due to accumulated errors that are not 
 logged, which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally 
 for a yet-unknown reason.

--


[jira] [Commented] (HBASE-7317) server-side request problems are hard to debug

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531305#comment-13531305
 ] 

stack commented on HBASE-7317:
--

bq. How do you define progress on tracing?

I'd define it as us making use of the lib.  Examples:

+ Flip a switch in shell and it 'enables tracing'.  Make a request and when the 
request completes, the trace info is dumped out in shell
+ Flip a switch on a regionserver and it will start dumping trace into the .log 
file or into the UI

 server-side request problems are hard to debug
 --

 Key: HBASE-7317
 URL: https://issues.apache.org/jira/browse/HBASE-7317
 Project: HBase
  Issue Type: Brainstorming
  Components: IPC/RPC, regionserver
Reporter: Sergey Shelukhin
Priority: Minor

 I've seen cases during integration tests where the write or read request took 
 an unexpectedly large amount of time (that, after the client went to the 
 region server that is reported alive and well, which I know from temporary 
 debug logging :)), and it's impossible to understand what is going on on the 
 server side, short of catching the moment with jstack.
 Some solutions (off by default) could be 
 - a facility for tests (especially integration tests) that would trace 
 Server/Master calls into some log or file (won't help with internals but at 
 least one could see what was actually received);
 - logging the progress of requests between components inside master/server 
 (e.g. request id=N received, request id=N is being processed in MyClass, 
 N being drawn on client from local sequence - no guarantees of uniqueness are 
 necessary).

--


[jira] [Commented] (HBASE-7342) Split operation without split key incorrectly finds the middle key in off-by-one error

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531309#comment-13531309
 ] 

stack commented on HBASE-7342:
--

bq. I'll go back and refine the test case. We can add it later. The hard part 
is getting blockKeys to be a size-2 array. Any suggestions?

You could just make a test apart from splits.  Conjure an array of 0, 1, 2, and 
10 elements.  Verify that when you divide by two you get a 'midpoint' that 
makes sense for hbase. 

But such a test verges on the silly, I would argue.  Let's just commit your 
fix unless someone comes up w/ a reason for why the -1 was there in the first place.
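
The off-by-one under discussion is just midpoint arithmetic, so it can be 
checked apart from splits, as stack suggests. A sketch (illustrative names, 
not the actual HFile index code):

{code}
public class MidkeySketch {
    // Middle root-level index as described in the report: the extra -1
    // (meant to adjust for 0-based indexing) shifts the midpoint down.
    static int midIndexWithMinusOne(int numBlockKeys) {
        return (numBlockKeys - 1) / 2;
    }

    static int midIndex(int numBlockKeys) {
        return numBlockKeys / 2;
    }

    public static void main(String[] args) {
        // With 2 block keys, the buggy version picks index 0, i.e. the
        // same key as firstKey, so the split is rejected.
        System.out.println(midIndexWithMinusOne(2)); // 0
        System.out.println(midIndex(2));             // 1
        // With larger arrays, both land near the middle.
        System.out.println(midIndexWithMinusOne(10)); // 4
        System.out.println(midIndex(10));             // 5
    }
}
{code}

The size-2 case is where the two formulas diverge in a way that matters: one 
returns the first key, the other a key strictly after it.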

 Split operation without split key incorrectly finds the middle key in 
 off-by-one error
 --

 Key: HBASE-7342
 URL: https://issues.apache.org/jira/browse/HBASE-7342
 Project: HBase
  Issue Type: Bug
  Components: HFile, io
Affects Versions: 0.94.1, 0.94.2, 0.94.3, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.0, 0.94.4

 Attachments: HBASE-7342-v1.patch, HBASE-7342-v2.patch


 I took a deeper look into issues I was having using region splitting when 
 specifying a region (but not a key for splitting).
 The midkey calculation is off by one: when there are 2 rows, it will pick 
 the 0th one. This causes the firstkey to be the same as the midkey, and the 
 split will fail. Removing the -1 causes it to work correctly, as per the 
 test I've added.
 Looking into the code, here is what goes on:
 1. Split takes the largest storefile.
 2. It puts all the keys into a 2-dimensional array called blockKeys[][]; 
 key i resides as blockKeys[i].
 3. Getting the middle root-level index should yield the key in the middle 
 of the storefile.
 4. In step 3, we see that there is a possibly erroneous (-1) to adjust for 
 the 0-offset indexing.
 5. In a case where there are only 2 blockKeys, this yields the 0th block 
 key. 
 6. Unfortunately, this is the same block key that 'firstKey' will be.
 7. This yields the result in HStore.java:1873 (cannot split because the 
 midkey is the same as the first or last row).
 8. Removing the -1 solves the problem (in this case). 

--


[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531317#comment-13531317
 ] 

stack commented on HBASE-7340:
--

I read the thread.  How does it justify this feature?  Nor does this feature 
strike me as a 'major' priority.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

--


[jira] [Commented] (HBASE-7340) Allow user-specified actions following region movement

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531318#comment-13531318
 ] 

Ted Yu commented on HBASE-7340:
---

@Nick:
I plan to put AutoCompactionObserver in another issue.
This issue exposes region movement to custom coprocessor logic. However, the 
notification is from the master side, and the decision to compact should be 
made on the region servers. There is more work needed before we can come up 
with a good design for auto compaction.

If my patch passes review, I plan to modify the subject of this JIRA and open 
a new one to continue with auto compaction.

 Allow user-specified actions following region movement
 --

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

--


[jira] [Commented] (HBASE-7268) correct local region location cache information can be overwritten w/stale information from an old server

2012-12-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531323#comment-13531323
 ] 

nkeywal commented on HBASE-7268:


Something that could be done here is to give back to the client the time when 
the move started: then the client could have another heuristic, like: the 
region move started less than 10 seconds ago, so let's wait a little before 
asking for the location.
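
The heuristic above amounts to a small piece of backoff arithmetic on the 
client. A sketch under the assumption that the server reports the move's 
start time (the class and constant names are made up for illustration):

{code}
public class MoveBackoffSketch {
    // Grace period during which a just-moved region is assumed to still
    // be in flight (the "10 seconds" from the comment above).
    static final long RECENT_MOVE_MS = 10_000;

    // If the server tells us when the move started, wait out the
    // remainder of the grace period before re-asking for the location.
    static long backoffMs(long moveStartMs, long nowMs) {
        long elapsed = nowMs - moveStartMs;
        return elapsed < RECENT_MOVE_MS ? RECENT_MOVE_MS - elapsed : 0;
    }

    public static void main(String[] args) {
        System.out.println(backoffMs(0, 4_000));  // move began 4s ago -> 6000
        System.out.println(backoffMs(0, 15_000)); // move began 15s ago -> 0
    }
}
{code}

The client would sleep for backoffMs(...) milliseconds before re-fetching the 
region location from meta, instead of retrying immediately.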

 correct local region location cache information can be overwritten w/stale 
 information from an old server
 -

 Key: HBASE-7268
 URL: https://issues.apache.org/jira/browse/HBASE-7268
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7268-v0.patch, HBASE-7268-v0.patch, 
 HBASE-7268-v1.patch, HBASE-7268-v2.patch


 Discovered via HBASE-7250; related to HBASE-5877.
 The test is writing from multiple threads.
 Server A has region R; the client knows that.
 R gets moved from A to server B.
 B gets killed.
 R gets moved by the master to server C.
 ~15 seconds later, a client tries to write to it (on A?).
 Multiple client threads report, from the RegionMoved exception processing 
 logic, that R moved from C to B, even though such a transition never 
 happened (neither in nor before the sequence described above). Not quite 
 sure how the client learned of the transition to C; I assume it's from meta 
 via some other thread...
 Then, the put fails (it may fail due to accumulated errors that are not 
 logged, which I am investigating... but the bogus cache update is there 
 notwithstanding).
 I have a patch but am not sure if it works; the test still fails locally 
 for a yet-unknown reason.

--


[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)

2012-12-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4791:
-

Attachment: HBASE-4791-v4.patch

Reapplying Matteo's patch so hadoopqa finds the trunk version

 Allow Secure Zookeeper JAAS configuration to be programmatically set (rather 
 than only by reading JAAS configuration file)
 --

 Key: HBASE-4791
 URL: https://issues.apache.org/jira/browse/HBASE-4791
 Project: HBase
  Issue Type: Improvement
  Components: security, Zookeeper
Reporter: Eugene Koontz
Assignee: Matteo Bertozzi
  Labels: security, zookeeper
 Attachments: DemoConfig.java, HBASE-4791-v1.patch, 
 HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, 
 HBASE-4791-v4.patch, HBASE-4791-v4.patch


 In the currently proposed fix for HBASE-2418, there must be a JAAS file 
 specified in System.setProperty("java.security.auth.login.config"). 
 However, it might be preferable to construct a JAAS configuration 
 programmatically, as is done with secure Hadoop (see 
 https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175).
 This would have the benefit of avoiding the use of a system property 
 setting, and instead allowing an HBase-local configuration setting. 

--


[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)

2012-12-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4791:
-

Status: Open  (was: Patch Available)

 Allow Secure Zookeeper JAAS configuration to be programmatically set (rather 
 than only by reading JAAS configuration file)
 --

 Key: HBASE-4791
 URL: https://issues.apache.org/jira/browse/HBASE-4791
 Project: HBase
  Issue Type: Improvement
  Components: security, Zookeeper
Reporter: Eugene Koontz
Assignee: Matteo Bertozzi
  Labels: security, zookeeper
 Attachments: DemoConfig.java, HBASE-4791-v1.patch, 
 HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, 
 HBASE-4791-v4.patch, HBASE-4791-v4.patch


 In the currently proposed fix for HBASE-2418, there must be a JAAS file 
 specified in System.setProperty("java.security.auth.login.config"). 
 However, it might be preferable to construct a JAAS configuration 
 programmatically, as is done with secure Hadoop (see 
 https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175).
 This would have the benefit of avoiding the use of a system property 
 setting, and instead allowing an HBase-local configuration setting. 

--


[jira] [Updated] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)

2012-12-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4791:
-

Status: Patch Available  (was: Open)

 Allow Secure Zookeeper JAAS configuration to be programmatically set (rather 
 than only by reading JAAS configuration file)
 --

 Key: HBASE-4791
 URL: https://issues.apache.org/jira/browse/HBASE-4791
 Project: HBase
  Issue Type: Improvement
  Components: security, Zookeeper
Reporter: Eugene Koontz
Assignee: Matteo Bertozzi
  Labels: security, zookeeper
 Attachments: DemoConfig.java, HBASE-4791-v1.patch, 
 HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, 
 HBASE-4791-v4.patch, HBASE-4791-v4.patch


 In the currently proposed fix for HBASE-2418, there must be a JAAS file 
 specified in System.setProperty("java.security.auth.login.config"). 
 However, it might be preferable to construct a JAAS configuration 
 programmatically, as is done with secure Hadoop (see 
 https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175).
 This would have the benefit of avoiding the use of a system property 
 setting, and instead allowing an HBase-local configuration setting. 

--


[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

Description: 
I found this issue while reading the user discussion quoted below.

In HMaster.moveRegion(), we have:
{code}
  this.assignmentManager.balance(rp);
  if (this.cpHost != null) {
this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
  }
{code}
Meaning, a user can register a master coprocessor which would receive region 
movement notifications.

The assignmentManager.balance(plan) call in HMaster.balance() doesn't send out 
such a notification.
I think we should enhance the following hook (at line 1335) with the list of 
regions moved so that notification from the master is consistent:
{code}
this.cpHost.postBalance();
{code}

Here is an excerpt from the user discussion:

Sometimes a user performs compaction after a region is moved (by the 
balancer). We should provide a 'hook' which lets the user specify what 
follow-on actions to take after region movement.

See discussion on the user mailing list under the thread 'How to know it's 
time for a major compaction?' for background information: 
http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

  was:
Sometimes a user performs compaction after a region is moved (by the 
balancer). We should provide a 'hook' which lets the user specify what 
follow-on actions to take after region movement.

See discussion on the user mailing list under the thread 'How to know it's 
time for a major compaction?' for background information: 
http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

 Issue Type: Bug  (was: Task)
Summary: Master coprocessor notification for 
assignmentManager.balance() is inconsistent  (was: Allow user-specified actions 
following region movement)

@Stack:
As mentioned above, I have modified the subject of this JIRA to reflect the 
latest progress.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 I found this issue while reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
 this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor which would receive region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notification from the master is consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes a user performs compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 See discussion on the user mailing list under the thread 'How to know it's 
 time for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92&subj=How+to+know+it+s+time+for+a+major+compaction+

--


[jira] [Commented] (HBASE-7346) Restored snapshot replay problem

2012-12-13 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531349#comment-13531349
 ] 

Jesse Yates commented on HBASE-7346:


When we restore a table, it needs to be disabled first (no in-flight 
restores!), which flushes everything to disk and rolls the WAL. This seems as 
simple as just removing the existing recovered.edits directory under a table 
before we restore the table.

As an aside, offline snapshots don't actually reference any of the WALs and 
just copy the recovered.edits directory over, just in case something is in 
there. This should only apply to the online snapshot cases.
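
The "remove the existing recovered.edits directory before restoring" step can 
be sketched as a recursive delete. This sketch uses local java.nio paths as a 
stand-in for the HDFS layout; the real code would go through Hadoop's 
FileSystem API, and the class name is made up:

{code}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ClearRecoveredEdits {
    // Delete <regionDir>/recovered.edits and everything under it, if
    // present, so stale edits can't be replayed after the restore.
    static void clear(Path regionDir) throws IOException {
        Path edits = regionDir.resolve("recovered.edits");
        if (!Files.exists(edits)) {
            return;
        }
        try (Stream<Path> tree = Files.walk(edits)) {
            // Reverse order deletes children before their parent dirs.
            tree.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        Path region = Files.createTempDirectory("region");
        Path edits = Files.createDirectory(region.resolve("recovered.edits"));
        Files.writeString(edits.resolve("0000001"), "stale edit");
        clear(region);
        System.out.println(Files.exists(edits)); // false
    }
}
{code}

Since the table is disabled (and flushed) before the restore, nothing in that 
directory belongs to the restored image, which is why deleting it is safe.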

 Restored snapshot replay problem
 

 Key: HBASE-7346
 URL: https://issues.apache.org/jira/browse/HBASE-7346
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Priority: Critical
 Fix For: hbase-6055


 The situation is a coarse-grained problem. The key problem is that writes 
 that shouldn't be replayed (since they don't belong to the restored image) 
 would not normally get replayed, but would potentially get replayed if 
 recovery was triggered.
 Previously, without restore, we could depend on the timestamps: if something 
 was replayed but there was newer data, the newer data would win. In a 
 restore situation, the newer data has the old timestamps from before the 
 restore, and new data that shouldn't get replayed could be.
 Example: 
 1) write 100 rows
 2) ss1 (with logs)
 3) write 50 rows
 4) restore ss1
 5) crash
 6) writes from 1 and 3 both get replayed in log-splitting recovery. Oops.

--


[jira] [Commented] (HBASE-7347) Allow multiple readers per storefile

2012-12-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531364#comment-13531364
 ] 

Lars Hofhansl commented on HBASE-7347:
--

Options might include:
* Ad hoc fix in HBASE-7336 is good enough (using seek + read when possible 
without contention, false back to p-read when needed)
* A separate reader for compactions (Stack suggested that)
* A reader for each scanner that has cacheBlocks set to false (presumably 
because it is expected to be large scan, that would also cover compactions)
* multiple readers for store files, round robin between them
** based on size (i.e. a reader per each n GB)
** based on access pattern (i.e. create readers, cache them for a bit, expire 
them)

What probably does not work:
* Use p-read everywhere (does not perform well)
* A reader per scanner (probably not efficient, open is a name node operation)
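The round-robin option can be sketched as a small pool that hands out one of n independent readers per store file (illustrative only; ReaderPool and the reader factory are invented names, not HBase classes):

```python
import itertools
import threading

class ReaderPool:
    """Round-robin over n independent readers for one store file (sketch)."""

    def __init__(self, open_reader, n):
        # open_reader is a factory, e.g. one open HDFS stream per call;
        # opening is the expensive name-node operation, so n stays small.
        self.readers = [open_reader() for _ in range(n)]
        self._cycle = itertools.cycle(self.readers)
        self._lock = threading.Lock()  # only guards the round-robin pointer

    def next_reader(self):
        # Scanners contend only on this tiny critical section, not on the
        # stream position of a single shared reader.
        with self._lock:
            return next(self._cycle)

# Usage with a trivial stand-in factory: each "reader" is a unique object.
pool = ReaderPool(open_reader=object, n=3)
picked = [pool.next_reader() for _ in range(6)]
assert picked[0] is picked[3] and picked[1] is picked[4]  # repeats in order
assert len(set(map(id, picked))) == 3                     # spread over all 3
```

The size-based and access-pattern-based variants would only change how n is chosen and when readers are created or expired.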



 Allow multiple readers per storefile
 

 Key: HBASE-7347
 URL: https://issues.apache.org/jira/browse/HBASE-7347
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl

 Currently each store file is read only through the single reader regardless 
 of how many concurrent read requests access that file.
 This issue is to explore alternate designs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7205) Coprocessor classloader is replicated for all regions in the HRegionServer

2012-12-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531366#comment-13531366
 ] 

Andrew Purtell commented on HBASE-7205:
---

Thanks [~amuraru] for pitching in on this. Thanks Ted for the reviews.

 Coprocessor classloader is replicated for all regions in the HRegionServer
 --

 Key: HBASE-7205
 URL: https://issues.apache.org/jira/browse/HBASE-7205
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Affects Versions: 0.92.2, 0.94.2
Reporter: Adrian Muraru
Assignee: Ted Yu
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7205-0.94.txt, 7205-v10.txt, 7205-v1.txt, 7205-v3.txt, 
 7205-v4.txt, 7205-v5.txt, 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, 7205-v9.txt, 
 HBASE-7205_v2.patch


 HBASE-6308 introduced a new custom CoprocessorClassLoader to load the 
 coprocessor classes, and a new instance of this CL is created for each single 
 HRegion opened. This leads to OOME-PermGen when the number of regions goes 
 above hundreds per region server. 
 Having the table coprocessor jailed in a separate classloader is good; however, 
 we should create only one for all regions of a table in each HRS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-7348) [89-fb] Add some statistics from DFSClient to RegionServerMetrics

2012-12-13 Thread Liyin Tang (JIRA)
Liyin Tang created HBASE-7348:
-

 Summary: [89-fb] Add some statistics from DFSClient to 
RegionServerMetrics
 Key: HBASE-7348
 URL: https://issues.apache.org/jira/browse/HBASE-7348
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang


DFSClient actually collects a number of useful statistics, such as 
bytesLocalRead, bytesLocalRackRead, and so on. This diff merges 
these metrics into RegionServerMetrics.
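The merge itself is straightforward: periodically fold the client-side counters into the region server's metrics registry under a distinguishing prefix. A sketch with illustrative names (the real registries differ):

```python
# Fold DFSClient read counters into a region-server metrics registry.
# The "dfs." prefix and the dict-based registry are assumptions for the sketch.
def merge_dfs_stats(rs_metrics, dfs_stats):
    for name, value in dfs_stats.items():
        rs_metrics["dfs." + name] = value
    return rs_metrics

metrics = {"requests": 1200}
merged = merge_dfs_stats(metrics, {"bytesLocalRead": 4096,
                                   "bytesLocalRackRead": 1024})
assert merged["dfs.bytesLocalRead"] == 4096
assert merged["requests"] == 1200  # existing metrics are untouched
```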

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531367#comment-13531367
 ] 

Andrew Purtell commented on HBASE-7340:
---

[~saint@gmail.com] I think this is some minor missing coverage in the CP 
API, minor because there isn't an existing user (yet) for it, so it's 
reasonable on those grounds. Agree the priority should be minor if not trivial.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 7340-v1.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor that receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notifications from the master are consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes user performs compaction after a region is moved (by balancer). We 
 should provide 'hook' which lets user specify what follow-on actions to take 
 after region movement.
 See discussion on user mailing list under the thread 'How to know it's time 
 for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
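The proposed enhancement can be sketched as follows (illustrative Python, not the real coprocessor API; CpHost, the method names, and the plan tuples are stand-ins for the Java hooks described above):

```python
# Sketch: pass the executed region plans to the post-balance hook so that
# balance() notifies observers the same way moveRegion()'s postMove does.
notifications = []

class CpHost:
    def post_move(self, region, src, dst):
        # Fired by moveRegion() today.
        notifications.append(("move", region, src, dst))

    def post_balance(self, plans):
        # Enhanced hook: observers see every region the balancer moved.
        for region, src, dst in plans:
            notifications.append(("balance-move", region, src, dst))

cp_host = CpHost()
plans = [("r1", "rs1", "rs2"), ("r2", "rs3", "rs1")]
cp_host.post_balance(plans)
assert len(notifications) == 2
assert notifications[0] == ("balance-move", "r1", "rs1", "rs2")
```

With the plan list available, a coprocessor can trigger follow-on actions (such as the compaction-after-move the user discussion asks for) regardless of whether a region moved via moveRegion() or via the balancer.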

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-7340:
--

  Priority: Trivial  (was: Major)
Issue Type: Improvement  (was: Bug)

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 7340-v1.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor that receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notifications from the master are consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes user performs compaction after a region is moved (by balancer). We 
 should provide 'hook' which lets user specify what follow-on actions to take 
 after region movement.
 See discussion on user mailing list under the thread 'How to know it's time 
 for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads

2012-12-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7336:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the reviews.

 HFileBlock.readAtOffset does not work well with multiple threads
 

 Key: HBASE-7336
 URL: https://issues.apache.org/jira/browse/HBASE-7336
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7336-0.94.txt, 7336-0.96.txt


 HBase grinds to a halt when many threads scan along the same set of blocks 
 and neither short-circuit reads nor block caching are enabled for the DFS 
 client ... disabling the block cache makes sense on very large scans.
 It turns out that synchronizing on istream in HFileBlock.readAtOffset is the 
 culprit.
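The underlying contrast is between a shared stream, where seek()+read() must be serialized because all readers share one file position, and positional reads, which carry their own offset. A sketch using POSIX pread via Python's os module (an analogy to HDFS positional reads, not HBase code):

```python
import os
import tempfile

# Write a file of 10 "blocks" of 10 bytes each, where block i is filled
# with the byte value i.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"".join(bytes([i]) * 10 for i in range(10)))
    path = f.name

fd = os.open(path, os.O_RDONLY)
try:
    # Two concurrent readers can fetch different blocks with no shared
    # position and hence no lock: pread takes the offset as an argument.
    block_3 = os.pread(fd, 10, 30)
    block_7 = os.pread(fd, 10, 70)
    assert block_3 == bytes([3]) * 10
    assert block_7 == bytes([7]) * 10
finally:
    os.close(fd)
    os.unlink(path)
```

The trade-off noted in HBASE-7347 is that p-read everywhere performs worse for sequential access, hence the hybrid: seek+read when uncontended, positional read when contended.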

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange

2012-12-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7338:
---

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

 Fix flaky condition for 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
 -

 Key: HBASE-7338
 URL: https://issues.apache.org/jira/browse/HBASE-7338
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3, 0.96.0
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7338.patch


 The balancer doesn't run in case a region is in transition. The check to 
 confirm whether all regions are assigned looks for a region count > 22, 
 while the total region count is 27. This may result in a failure:
 {code}
 java.lang.AssertionError: After 5 attempts, region assignments were not 
 balanced.
   at org.junit.Assert.fail(Assert.java:93)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123)
 .
 2012-12-11 13:47:02,231 INFO  [pool-1-thread-1] 
 hbase.TestRegionRebalancing(120): Added fourth 
 server=p0118.mtv.cloudera.com,44414,1355262422083
 2012-12-11 13:47:02,231 INFO  
 [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] 
 regionserver.HRegionServer(3769): Registered RegionServer MXBean
 2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d.
  state=OPENING, ts=1355262421037, 
 server=p0118.mtv.cloudera.com,54281,1355262419765}
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. Load 
 Average: 13.0 low border: 9, up border: 16; attempt: 0
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 
 Avg: 13.0 actual: 11
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 
 Avg: 13.0 actual: 15
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 
 Avg: 13.0 actual: 0
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 
 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2
 2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {code}
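The slop-based balance check visible in the log ("Avg: 13.0 low border: 9, up border: 16 ... slop: 0.2") can be sketched as follows; the exact border arithmetic HBase uses is an assumption here, but the point is that a server still holding 0 regions falls outside the band:

```python
import math

def is_balanced(region_counts, slop=0.2):
    """Every server's load must sit within avg +/- slop (sketch)."""
    avg = sum(region_counts) / len(region_counts)
    low = math.floor(avg * (1 - slop))
    up = math.ceil(avg * (1 + slop))
    return all(low <= c <= up for c in region_counts)

# 4 servers, 52 regions: one server still at 0 regions => not balanced yet,
# matching the "Isn't balanced!!! Avg: 13.0 actual: 0" line above.
assert not is_balanced([11, 15, 26, 0])
# After the in-transition region opens and the balancer actually runs:
assert is_balanced([13, 13, 13, 13])
```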

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

Attachment: 7340-v2.txt

Patch v2 addresses Andy's comments.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 7340-v1.txt, 7340-v2.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor that receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notifications from the master are consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes user performs compaction after a region is moved (by balancer). We 
 should provide 'hook' which lets user specify what follow-on actions to take 
 after region movement.
 See discussion on user mailing list under the thread 'How to know it's time 
 for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7341) Deprecate RowLocks in 0.94

2012-12-13 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan updated HBASE-7341:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the review, Stack.

Committed to 0.94.

 Deprecate RowLocks in 0.94
 --

 Key: HBASE-7341
 URL: https://issues.apache.org/jira/browse/HBASE-7341
 Project: HBase
  Issue Type: Task
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7341.patch


 Since we are removing support in 0.96 (see HBASE-7315), we should deprecate 
 in 0.94.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531394#comment-13531394
 ] 

Andrew Purtell commented on HBASE-7340:
---

+1 patch v2

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 7340-v1.txt, 7340-v2.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor that receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notifications from the master are consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes user performs compaction after a region is moved (by balancer). We 
 should provide 'hook' which lets user specify what follow-on actions to take 
 after region movement.
 See discussion on user mailing list under the thread 'How to know it's time 
 for a major compaction?' for background information: 
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata

2012-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531396#comment-13531396
 ] 

Sergey Shelukhin commented on HBASE-7236:
-

bq. Attributes as in TTL, MIN_VERSIONS, which are part of the hbase data model
Actually, these are not part of the data model; this is config.
For many existing metadata settings, it makes perfect sense to have a 
cluster-wide value (e.g. TTL, versions, compression parameters, etc.). 
IMHO metadata is stuff like IS_ROOT, IS_READONLY, etc.
So we have 3 types of values, excluding schema:
1) Metadata specific to column/table, e.g. IS_ROOT.
2) Config:
  a) Some of config is xml-file only and has names like hbase.foo.bar.baz.
  b) Some config is descriptor-only and has names like MIN_VERSIONS.
  c) Some config is both xml and column-specific, and has names like 
hbase.region.max.filesize (or something) *and* MAX_FILESIZE.
3) User-specific parameters, set via CONFIG in shell, and used for mysterious 
user-specific purposes (right now, CF user parameters can also override xml 
config implicitly, which I am not removing with this patch to keep backward 
compat).

I don't see a reason not to separate all of this. 
This patch cleans up 2c and changes it into a general mechanism for doing such 
things, instead of a case-by-case basis which, as shown above, leads to more 
code and potential bugs.

Ideally, I'd say we should:
* remove 2b, in the sense that all config values should be specifiable in 
hbase.foo.bar.baz form, both in the XML file (e.g. hbase.cf.min.versions) and 
on the table/cf (where it makes sense; e.g. we can whitelist overridable keys).
* Separate (1) from (3) to prevent potential conflicts if we add new 
reserved keywords that a user has already set on their descriptors for their 
own purposes, but that's a separate issue.
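The layered lookup the comment argues for can be sketched as a precedence chain: a column-family setting overrides a table setting, which overrides the cluster-wide XML value (key names below are illustrative):

```python
# Sketch of per-table/per-CF configuration overriding cluster-wide config.
def get_conf(key, cluster, table=None, cf=None):
    # Most specific layer wins: CF, then table, then cluster XML.
    for layer in (cf or {}, table or {}, cluster):
        if key in layer:
            return layer[key]
    return None

cluster = {"hbase.cf.min.versions": 1, "hbase.region.max.filesize": 10 << 30}
table   = {"hbase.region.max.filesize": 20 << 30}
cf      = {"hbase.cf.min.versions": 3}

assert get_conf("hbase.cf.min.versions", cluster, table, cf) == 3
assert get_conf("hbase.region.max.filesize", cluster, table, cf) == 20 << 30
assert get_conf("hbase.cf.min.versions", cluster, table) == 1  # cluster default
```

A single mechanism like this replaces the per-setting, case-by-case override code the comment objects to, and a whitelist of overridable keys slots naturally into the lookup.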

bq. Is putting arbitrary byte[]'s into the table/column descriptors a valid use 
case for the clients. 
Yes, there were JIRAs about improving it so I assume it's used :)

 add per-table/per-cf configuration via metadata
 ---

 Key: HBASE-7236
 URL: https://issues.apache.org/jira/browse/HBASE-7236
 Project: HBase
  Issue Type: New Feature
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
 HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
 HBASE-7236-v2.patch, HBASE-7236-v3.patch


 Regardless of the compaction policy, it makes sense to have separate 
 configuration for compactions for different tables and column families, as 
 their access patterns and workloads can be different. In particular, it is 
 necessary in order to use the tiered compactions being ported from the 
 0.89-fb branch properly.
 We might want to add support for compaction configuration via metadata on 
 table/cf.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-4791) Allow Secure Zookeeper JAAS configuration to be programmatically set (rather than only by reading JAAS configuration file)

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531402#comment-13531402
 ] 

Hadoop QA commented on HBASE-4791:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560824/HBASE-4791-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
104 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide
  
org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3532//console

This message is automatically generated.

 Allow Secure Zookeeper JAAS configuration to be programmatically set (rather 
 than only by reading JAAS configuration file)
 --

 Key: HBASE-4791
 URL: https://issues.apache.org/jira/browse/HBASE-4791
 Project: HBase
  Issue Type: Improvement
  Components: security, Zookeeper
Reporter: Eugene Koontz
Assignee: Matteo Bertozzi
  Labels: security, zookeeper
 Attachments: DemoConfig.java, HBASE-4791-v1.patch, 
 HBASE-4791-v2.patch, HBASE-4791-v3.patch, HBASE-4791-v4-0.94.patch, 
 HBASE-4791-v4.patch, HBASE-4791-v4.patch


 In the currently proposed fix for HBASE-2418, there must be a JAAS file 
 specified in System.setProperty(java.security.auth.login.config). 
 However, it might be preferable to construct a JAAS configuration 
 programmatically, as is done with secure Hadoop (see 
 https://github.com/apache/hadoop-common/blob/a48eceb62c9b5c1a5d71ee2945d9eea2ed62527b/src/java/org/apache/hadoop/security/UserGroupInformation.java#L175).
 This would have the benefit of avoiding the use of a system property setting, 
 allowing an HBase-local configuration setting instead. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7343) Fix flaky condition for TestDrainingServer

2012-12-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7343:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

 Fix flaky condition for TestDrainingServer
 --

 Key: HBASE-7343
 URL: https://issues.apache.org/jira/browse/HBASE-7343
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7343.patch


 The assert statement in setUpBeforeClass() may fail in case the region 
 distribution is not even (a particular rs has 0 regions).
 {code}
 junit.framework.AssertionFailedError
   at junit.framework.Assert.fail(Assert.java:48)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertFalse(Assert.java:34)
   at junit.framework.Assert.assertFalse(Assert.java:41)
   at 
 org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83)
 {code}
 This is already fixed in trunk with HBASE-5992, but as that's a bigger change 
 and uses 5877, this jira fixes that issue instead of backporting 5992.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7346) Restored snapshot replay problem

2012-12-13 Thread Jonathan Hsieh (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531404#comment-13531404
 ] 

Jonathan Hsieh commented on HBASE-7346:
---

This crash I'm talking about happens after the restore has successfully 
completed (after step 4).

Does the disable-table flushing + log rolling guarantee that logs are moved out 
of the way so that they won't be replayed?

Let's double check and add a test case for this. That is the easiest way to 
convince ourselves that this is or isn't a problem in the offline or online 
cases.




 Restored snapshot replay problem
 

 Key: HBASE-7346
 URL: https://issues.apache.org/jira/browse/HBASE-7346
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master, regionserver, snapshots, Zookeeper
Affects Versions: hbase-6055
Reporter: Jonathan Hsieh
Priority: Critical
 Fix For: hbase-6055


 The situation is a coarse-grained problem. The key problem is that writes 
 that shouldn't be replayed (since they don't belong to the restored image) 
 would not normally get replayed, but could get replayed if recovery was 
 triggered.
 Previously, without restore, we could depend on the timestamps: if something 
 was replayed but there was newer data, the newer data would win. In a restore 
 situation, the newer data has the old timestamps from before the restore, 
 and new data that shouldn't get replayed could be.
 ex: 
 1) write 100 rows
 2) ss1 (with logs)
 3) write 50 rows
 4) restore ss1
 5) crash
 6) writes from 1 and 3 both get replayed in log splitting recovery. Oops.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice (not configurable by cf or dynamically)

2012-12-13 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7055:


Attachment: HBASE-7055-v4.patch

Addressed Ted's feedback; that includes moving some code into base class

 port HBASE-6371 tier-based compaction from 0.89-fb to trunk - first slice 
 (not configurable by cf or dynamically)
 -

 Key: HBASE-7055
 URL: https://issues.apache.org/jira/browse/HBASE-7055
 Project: HBase
  Issue Type: Task
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Fix For: 0.96.0

 Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
 HBASE-6371-v3-refactor-only-squashed.patch, 
 HBASE-6371-v4-refactor-only-squashed.patch, 
 HBASE-6371-v5-refactor-only-squashed.patch, HBASE-7055-v0.patch, 
 HBASE-7055-v1.patch, HBASE-7055-v2.patch, HBASE-7055-v3.patch, 
 HBASE-7055-v4.patch


 There's divergence in the code :(
 See HBASE-6371 for details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7233) Serializing KeyValues

2012-12-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531437#comment-13531437
 ] 

Andrew Purtell commented on HBASE-7233:
---

{quote}
bq. I would really prefer not to double the number of KV types just to say foo 
with tags. And then double again for foo with tags and bar.
That would be ugly, but at the same time it's difficult and maybe wasteful to 
future-proof it from every angle. Tags are already sort of a flexible 
future-proofing mechanism. Maybe tags can be added in a backwards compatible 
way to the existing encoders. I'd have to think about it for PrefixTree, 
probably punting them to a PREFIX_TREE2 encoder with some other 
additions/improvements.
{quote}

The use case I'm looking at is adding security policy information to KVs 
(HBASE-6222), could be either ACLs or visibility labels, both can be handled 
the same way. There's a 1:1 mapping, so it makes sense to store the policy 
information in the KV. This also has the nice property of reading in the ACL 
for free in the same op that reads in the KV. I'm not specifically asking for 
more than tagging KVs with this metadata, but given that tags could easily be 
made generic enough to support a number of other cases, I think it makes sense 
to do that. Then security is just one user of something more generally useful, 
and we haven't done something fixed for security's sake only.

Adding tag support to the encoders might be the right answer. Would we still 
have the trouble of teaching KeyValue about where in the bytebuffers coming out 
of the encoder the tag data resides? Any thoughts on how we might distinguish a 
KV with tags from one without? Maybe we don't, we just have the encoder add the 
discovered tag data to the KV by way of an API that adds out of band metadata 
to the KV's in memory representation? And likewise add tags to the blocks 
beyond the KV itself if they are present?
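One way to frame the "how do we distinguish a KV with tags" question is a length-prefixed tags section after the value, where a zero (or absent) tags length means a plain KV. The sketch below illustrates that design question only; it is not the serialization format HBase uses, and the field widths are arbitrary:

```python
import struct

# Layout (illustrative): key-len (u16) | value-len (u32) | tags-len (u16)
# followed by key, value, tags bytes. tags-len == 0 means a plain KV.
_HEADER = ">HIH"

def encode_kv(key, value, tags=b""):
    return struct.pack(_HEADER, len(key), len(value), len(tags)) + key + value + tags

def decode_kv(buf):
    klen, vlen, tlen = struct.unpack_from(_HEADER, buf)
    off = struct.calcsize(_HEADER)
    key = buf[off:off + klen]
    value = buf[off + klen:off + klen + vlen]
    tags = buf[off + klen + vlen:off + klen + vlen + tlen]
    return key, value, tags

buf = encode_kv(b"row1/cf:q", b"v", tags=b"\x01ACL:alice")
k, v, t = decode_kv(buf)
assert (k, v, t) == (b"row1/cf:q", b"v", b"\x01ACL:alice")
# A KV without tags round-trips with an empty tags section.
assert decode_kv(encode_kv(b"k", b"v"))[2] == b""
```

An in-band length field like this sidesteps the "out of band metadata API" question, at the cost of the encoders having to know where the tags section sits.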

 Serializing KeyValues
 -

 Key: HBASE-7233
 URL: https://issues.apache.org/jira/browse/HBASE-7233
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 7233sketch.txt, 7233.txt, 7233-v2.txt, 
 7233v3_encoders.txt, 7233v4_encoders.txt, 7233v5_encoders.txt, 
 7233v6_encoder.txt


 Undo KeyValue being a Writable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531438#comment-13531438
 ] 

stack commented on HBASE-7340:
--

As the issue is cast now, it's fine w/ appropriate priority.  The original 
justification was way wonky, citing a mail exchange in which there was no 
call for such a 'task/feature' (and then the 'task/feature' became a 'bug' on 
questioning).  All this churn and triviality distracts.  There are bigger fish 
to fry.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 7340-v1.txt, 7340-v2.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor which receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notification from the master is consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes a user performs a compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+
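The proposed fix can be modeled with a toy observer, where balance() passes the executed region plans to the hook so observers hear about every movement, just as moveRegion() already notifies via postMove(). All class and method names below are illustrative simplifications, not HBase's actual coprocessor API:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposal: the master hands the executed region plans to
// postBalance(), giving observers per-region notification for balancer moves.
public class BalanceNotificationSketch {
    static class RegionPlan {
        final String region, source, destination;
        RegionPlan(String r, String s, String d) { region = r; source = s; destination = d; }
    }

    interface MasterObserver {
        void postBalance(List<RegionPlan> plansExecuted);
    }

    static class Master {
        final List<MasterObserver> observers = new ArrayList<>();

        // Execute the plans, then notify registered coprocessors with the list.
        void balance(List<RegionPlan> plans) {
            // ... this.assignmentManager.balance(plan) for each plan ...
            for (MasterObserver o : observers) {
                o.postBalance(plans);
            }
        }
    }

    // Count how many region movements an observer hears about.
    static int notifiedCount() {
        Master m = new Master();
        List<String> seen = new ArrayList<>();
        m.observers.add(plans -> plans.forEach(p -> seen.add(p.region)));
        m.balance(List.of(new RegionPlan("r1", "rs1", "rs2"),
                          new RegionPlan("r2", "rs2", "rs3")));
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(notifiedCount());
    }
}
```

With the region plans in hand, an observer could e.g. trigger a compaction on each moved region, which is exactly the use case from the quoted thread.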

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5416) Improve performance of scans with some kind of filters.

2012-12-13 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531492#comment-13531492
 ] 

Sergey Shelukhin commented on HBASE-5416:
-

Hmm, no... let me try with it increased. Should this setting be set in the 
patch too? This test is going to run with the other tests every time, I assume.

 Improve performance of scans with some kind of filters.
 ---

 Key: HBASE-5416
 URL: https://issues.apache.org/jira/browse/HBASE-5416
 Project: HBase
  Issue Type: Improvement
  Components: Filters, Performance, regionserver
Affects Versions: 0.90.4
Reporter: Max Lapan
Assignee: Max Lapan
 Fix For: 0.96.0

 Attachments: 5416-Filtered_scans_v6.patch, 5416-v5.txt, 5416-v6.txt, 
 Filtered_scans.patch, Filtered_scans_v2.patch, Filtered_scans_v3.patch, 
 Filtered_scans_v4.patch, Filtered_scans_v5.1.patch, Filtered_scans_v5.patch, 
 Filtered_scans_v7.patch


 When a scan is performed, the whole row is loaded into the result list, and 
 after that the filter (if one exists) is applied to decide whether the row 
 is needed.
 But when a scan is performed on several CFs and the filter checks only data 
 from a subset of those CFs, data from the CFs not checked by the filter is 
 not needed at the filter stage, only once we have decided to include the 
 current row. In such a case we can significantly reduce the amount of IO 
 performed by a scan by loading only the values actually checked by the 
 filter.
 For example, we have two CFs: flags and snap. Flags is quite small (a bunch 
 of megabytes) and is used to filter large entries from snap. Snap is very 
 large (tens of GB) and quite costly to scan. If we need only rows with some 
 flag set, we use SingleColumnValueFilter to limit the result to a small 
 subset of the region. But the current implementation loads both CFs to 
 perform the scan when only the small subset is needed.
 The attached patch adds one routine to the Filter interface that allows a 
 filter to specify which CFs it needs for its operation. In HRegion, we 
 separate all scanners into two groups: those needed for the filter and the 
 rest (joined). When a new row is considered, only the needed data is loaded 
 and the filter applied; only if the filter accepts the row is the rest of 
 the data loaded. On our data, this speeds up such scans 30-50 times. It also 
 gives us a way to better normalize the data into separate columns by 
 optimizing the scans performed.
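The two-phase scan can be sketched with plain maps standing in for the two column families. This is a toy simulation of the idea (the names and data are invented, not HBase code): filter on the small CF first, and fetch the expensive CF only for accepted rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of the "essential vs joined" scan: the filter only needs "flags",
// so the large "snap" CF is loaded solely for rows the filter accepts.
public class JoinedScanSketch {
    static int snapLoads = 0; // counts reads of the large "snap" CF

    static String loadSnap(String row, Map<String, String> snap) {
        snapLoads++; // stands in for the expensive IO
        return snap.get(row);
    }

    // Phase 1: read "flags" (essential to the filter); phase 2: read "snap"
    // only when the flag matches.
    static List<String> scan(Map<String, String> flags, Map<String, String> snap) {
        List<String> results = new ArrayList<>();
        for (String row : flags.keySet()) {
            if ("1".equals(flags.get(row))) {      // filter on the small CF
                results.add(loadSnap(row, snap));  // joined load, accepted rows only
            }
        }
        return results;
    }

    static int demo() {
        Map<String, String> flags = Map.of("a", "1", "b", "0", "c", "0", "d", "1");
        Map<String, String> snap = Map.of("a", "big-a", "b", "big-b",
                                          "c", "big-c", "d", "big-d");
        snapLoads = 0;
        scan(flags, snap);
        return snapLoads; // 2 loads instead of 4
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Here only 2 of 4 rows pay for the expensive load; with the real data's selectivity that gap is what produces the reported 30-50x speedup.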

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7236) add per-table/per-cf configuration via metadata

2012-12-13 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531494#comment-13531494
 ] 

Enis Soztutar commented on HBASE-7236:
--

bq. Actually, these are not part of data model, this is config.
Well, TTL and MIN_VERSIONS are actually part of the data model in that, 
together with the timestamp in the KV, they define the semantics of data 
lifetime. Compactions / compression / data block encoding are pure 
configuration that doesn't affect the semantics, only performance and data 
size. 

bq. I don't see a reason not to separate all of this. 
Separating makes sense. 

 add per-table/per-cf configuration via metadata
 ---

 Key: HBASE-7236
 URL: https://issues.apache.org/jira/browse/HBASE-7236
 Project: HBase
  Issue Type: New Feature
  Components: Compaction
Affects Versions: 0.96.0
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, 
 HBASE-7236-PROTOTYPE-v1.patch, HBASE-7236-v0.patch, HBASE-7236-v1.patch, 
 HBASE-7236-v2.patch, HBASE-7236-v3.patch


 Regardless of the compaction policy, it makes sense to have separate 
 compaction configuration for different tables and column families, as their 
 access patterns and workloads can differ. In particular, the tiered 
 compactions being ported from the 0.89-fb branch require this to be used 
 properly.
 We might want to add support for compaction configuration via metadata on 
 the table/CF.
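The lookup chain such a feature implies can be sketched in a few lines: resolve a setting from column-family metadata first, then table metadata, then the cluster-wide configuration. The class and key names below are hypothetical, not the patch's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of scoped configuration: CF metadata overrides table
// metadata, which overrides the global config.
public class ScopedConfSketch {
    final Map<String, String> global = new HashMap<>();
    final Map<String, Map<String, String>> perTable = new HashMap<>();
    final Map<String, Map<String, String>> perCf = new HashMap<>();

    String get(String table, String cf, String key) {
        Map<String, String> cfConf = perCf.get(table + "/" + cf);
        if (cfConf != null && cfConf.containsKey(key)) return cfConf.get(key);
        Map<String, String> tableConf = perTable.get(table);
        if (tableConf != null && tableConf.containsKey(key)) return tableConf.get(key);
        return global.get(key); // cluster-wide default
    }

    static String demo() {
        ScopedConfSketch c = new ScopedConfSketch();
        c.global.put("compaction.ratio", "1.2");
        c.perTable.put("t1", Map.of("compaction.ratio", "1.4"));
        // t1/cf1 inherits the table override; t2 falls back to the global value.
        return c.get("t1", "cf1", "compaction.ratio") + ","
             + c.get("t2", "cf1", "compaction.ratio");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A tiered compaction policy would then read all its knobs through such a resolver instead of the global Configuration directly.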

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531495#comment-13531495
 ] 

Hadoop QA commented on HBASE-7340:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560837/7340-v2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
104 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 23 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3533//console

This message is automatically generated.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Attachments: 7340-v1.txt, 7340-v2.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor which receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notification from the master is consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes a user performs a compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531500#comment-13531500
 ] 

Hudson commented on HBASE-7336:
---

Integrated in HBase-0.94 #625 (See 
[https://builds.apache.org/job/HBase-0.94/625/])
HBASE-7336 HFileBlock.readAtOffset does not work well with multiple threads 
(Revision 1421439)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java


 HFileBlock.readAtOffset does not work well with multiple threads
 

 Key: HBASE-7336
 URL: https://issues.apache.org/jira/browse/HBASE-7336
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7336-0.94.txt, 7336-0.96.txt


 HBase grinds to a halt when many threads scan the same set of blocks and 
 neither read short-circuit nor block caching is enabled for the DFS client; 
 disabling the block cache makes sense on very large scans.
 It turns out that synchronizing on istream in HFileBlock.readAtOffset is the 
 culprit.
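Why a lock is needed at all, and how positional reads avoid it, can be shown with plain JDK file IO (this is an illustration of the principle, not HBase's code): seek()-then-read() mutates a shared stream position and so must be serialized, whereas a positional read carries its own offset and is safe to issue from many threads concurrently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Positional read demo: FileChannel.read(buf, offset) never touches the
// channel's shared position, so concurrent block readers need no lock.
public class PositionalReadSketch {
    static String readAt(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        ch.read(buf, offset); // pread-style: offset is an argument, not state
        return new String(buf.array(), 0, buf.position());
    }

    static String demo() {
        try {
            Path p = Files.createTempFile("blocks", ".dat");
            Files.write(p, "block0block1block2".getBytes());
            try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
                // These two reads could run from different threads, unlocked.
                return readAt(ch, 6, 6) + "," + readAt(ch, 0, 6);
            } finally {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // block1,block0
    }
}
```

HDFS exposes the same idea through its positional-read API, which is the direction the fix takes instead of synchronizing every reader on one input stream.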

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7341) Deprecate RowLocks in 0.94

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531501#comment-13531501
 ] 

Hudson commented on HBASE-7341:
---

Integrated in HBase-0.94 #625 (See 
[https://builds.apache.org/job/HBase-0.94/625/])
HBASE-7341 Deprecate RowLocks in 0.94 (Revision 1421447)

 Result = SUCCESS
gchanan : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/UnknownRowLockException.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Delete.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Get.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Increment.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Put.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/RowLock.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java


 Deprecate RowLocks in 0.94
 --

 Key: HBASE-7341
 URL: https://issues.apache.org/jira/browse/HBASE-7341
 Project: HBase
  Issue Type: Task
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7341.patch


 Since we are removing support in 0.96 (see HBASE-7315), we should deprecate 
 in 0.94.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7343) Fix flaky condition for TestDrainingServer

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531502#comment-13531502
 ] 

Hudson commented on HBASE-7343:
---

Integrated in HBase-0.94 #625 (See 
[https://builds.apache.org/job/HBase-0.94/625/])
HBASE-7343 Fix flaky condition for TestDrainingServer (Himanshu) (Revision 
1421455)

 Result = SUCCESS
jxiang : 
Files : 
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestDrainingServer.java


 Fix flaky condition for TestDrainingServer
 --

 Key: HBASE-7343
 URL: https://issues.apache.org/jira/browse/HBASE-7343
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.94.4

 Attachments: HBASE-7343.patch


 The assert statement in setUpBeforeClass() may fail in case the region 
 distribution is not even (a particular rs has 0 regions).
 {code}
 junit.framework.AssertionFailedError
   at junit.framework.Assert.fail(Assert.java:48)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at junit.framework.Assert.assertFalse(Assert.java:34)
   at junit.framework.Assert.assertFalse(Assert.java:41)
   at 
 org.apache.hadoop.hbase.TestDrainingServer.setUpBeforeClass(TestDrainingServer.java:83)
 {code}
 This is already fixed in trunk by HBASE-5992, but as that is a bigger change 
 and uses HBASE-5877, this jira fixes the issue directly instead of 
 backporting 5992.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7336) HFileBlock.readAtOffset does not work well with multiple threads

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531506#comment-13531506
 ] 

Hudson commented on HBASE-7336:
---

Integrated in HBase-TRUNK #3618 (See 
[https://builds.apache.org/job/HBase-TRUNK/3618/])
HBASE-7336 HFileBlock.readAtOffset does not work well with multiple threads 
(Revision 1421440)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java


 HFileBlock.readAtOffset does not work well with multiple threads
 

 Key: HBASE-7336
 URL: https://issues.apache.org/jira/browse/HBASE-7336
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Critical
 Fix For: 0.96.0, 0.94.4

 Attachments: 7336-0.94.txt, 7336-0.96.txt


 HBase grinds to a halt when many threads scan the same set of blocks and 
 neither read short-circuit nor block caching is enabled for the DFS client; 
 disabling the block cache makes sense on very large scans.
 It turns out that synchronizing on istream in HFileBlock.readAtOffset is the 
 culprit.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7338) Fix flaky condition for org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange

2012-12-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531507#comment-13531507
 ] 

Hudson commented on HBASE-7338:
---

Integrated in HBase-TRUNK #3618 (See 
[https://builds.apache.org/job/HBase-TRUNK/3618/])
HBASE-7338 Fix flaky condition for 
org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
 (Himanshu) (Revision 1421444)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java


 Fix flaky condition for 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange
 -

 Key: HBASE-7338
 URL: https://issues.apache.org/jira/browse/HBASE-7338
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.94.3, 0.96.0
Reporter: Himanshu Vashishtha
Assignee: Himanshu Vashishtha
Priority: Minor
 Fix For: 0.96.0

 Attachments: HBASE-7338.patch


 The balancer doesn't run when a region is in transition. The check that 
 confirms whether all regions are assigned looks for a region count > 22, 
 while the total number of regions is 27. This may result in a failure:
 {code}
 java.lang.AssertionError: After 5 attempts, region assignments were not 
 balanced.
   at org.junit.Assert.fail(Assert.java:93)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.assertRegionsAreBalanced(TestRegionRebalancing.java:203)
   at 
 org.apache.hadoop.hbase.TestRegionRebalancing.testRebalanceOnRegionServerNumberChange(TestRegionRebalancing.java:123)
 .
 2012-12-11 13:47:02,231 INFO  [pool-1-thread-1] 
 hbase.TestRegionRebalancing(120): Added fourth 
 server=p0118.mtv.cloudera.com,44414,1355262422083
 2012-12-11 13:47:02,231 INFO  
 [RegionServer:3;p0118.mtv.cloudera.com,44414,1355262422083] 
 regionserver.HRegionServer(3769): Registered RegionServer MXBean
 2012-12-11 13:47:02,231 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {c786446fb2542f190e937057cdc79d9d=test,kkk,1355262401365.c786446fb2542f190e937057cdc79d9d.
  state=OPENING, ts=1355262421037, 
 server=p0118.mtv.cloudera.com,54281,1355262419765}
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(165): There are 4 servers and 26 regions. Load 
 Average: 13.0 low border: 9, up border: 16; attempt: 0
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,51590,1355262395329 
 Avg: 13.0 actual: 11
 2012-12-11 13:47:02,232 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,52987,1355262407916 
 Avg: 13.0 actual: 15
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(171): p0118.mtv.cloudera.com,48044,1355262421787 
 Avg: 13.0 actual: 0
 2012-12-11 13:47:02,233 DEBUG [pool-1-thread-1] 
 hbase.TestRegionRebalancing(179): p0118.mtv.cloudera.com,48044,1355262421787 
 Isn't balanced!!! Avg: 13.0 actual: 0 slop: 0.2
 2012-12-11 13:47:12,233 DEBUG [pool-1-thread-1] master.HMaster(987): Not 
 running balancer because 1 region(s) in transition: 
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7340) Master coprocessor notification for assignmentManager.balance() is inconsistent

2012-12-13 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-7340:
--

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Integrated to trunk.

Thanks for the review, Andy.

 Master coprocessor notification for assignmentManager.balance() is 
 inconsistent
 ---

 Key: HBASE-7340
 URL: https://issues.apache.org/jira/browse/HBASE-7340
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu
Priority: Trivial
 Fix For: 0.96.0

 Attachments: 7340-v1.txt, 7340-v2.txt


 I found this issue when reading the user discussion quoted below.
 In HMaster.moveRegion(), we have:
 {code}
   this.assignmentManager.balance(rp);
   if (this.cpHost != null) {
     this.cpHost.postMove(hri, rp.getSource(), rp.getDestination());
   }
 {code}
 Meaning, a user can register a master coprocessor which receives region 
 movement notifications.
 The assignmentManager.balance(plan) call in HMaster.balance() doesn't send 
 out such a notification.
 I think we should enhance the following hook (at line 1335) with the list of 
 regions moved so that notification from the master is consistent:
 {code}
 this.cpHost.postBalance();
 {code}
 Here is an excerpt from the user discussion:
 Sometimes a user performs a compaction after a region is moved (by the 
 balancer). We should provide a 'hook' which lets the user specify what 
 follow-on actions to take after region movement.
 http://search-hadoop.com/m/BDx4S1jMjF92subj=How+to+know+it+s+time+for+a+major+compaction+

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock

2012-12-13 Thread Shrijeet Paliwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531533#comment-13531533
 ] 

Shrijeet Paliwal commented on HBASE-5898:
-

After putting UseMembar in the GC opts we have not seen the previously 
reported deadlock. Just wanted to give an update. 

 Consider double-checked locking for block cache lock
 

 Key: HBASE-5898
 URL: https://issues.apache.org/jira/browse/HBASE-5898
 Project: HBase
  Issue Type: Improvement
  Components: Performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: 5898-0.94.txt, 5898-TestBlocksRead.txt, 5898-v2.txt, 
 5898-v3.txt, 5898-v4.txt, 5898-v4.txt, HBASE-5898-0.patch, 
 HBASE-5898-1.patch, HBASE-5898-1.patch, hbase-5898.txt


 Running a workload with a high query rate against a dataset that fits in 
 cache, I saw a lot of CPU being used in IdLock.getLockEntry, called by 
 HFileReaderV2.readBlock. Even though it was all cache hits, a lot of CPU was 
 wasted on lock management here. I wrote a quick patch to switch to 
 double-checked locking and it improved throughput substantially for this 
 workload.
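The optimization can be shown with a minimal standalone sketch (illustrative names, not HBase's actual classes, which use a per-key IdLock rather than one monitor): check the cache before taking the lock, so the hot cache-hit path does no lock management at all.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Double-checked locking for a block cache: lock-free first lookup, and a
// second lookup under the lock so the expensive load happens exactly once.
public class DoubleCheckedCacheSketch {
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    static final AtomicInteger loads = new AtomicInteger();

    static String readBlock(String key) {
        String block = cache.get(key);       // first check: no lock on a hit
        if (block != null) return block;
        synchronized (cache) {               // HBase uses a per-key lock here
            block = cache.get(key);          // second check under the lock
            if (block == null) {
                loads.incrementAndGet();     // stands in for the disk read
                block = "data-" + key;
                cache.put(key, block);
            }
            return block;
        }
    }

    static int demo() {
        for (int i = 0; i < 1000; i++) readBlock("blk1"); // 999 lock-free hits
        return loads.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

ConcurrentHashMap makes the unlocked first read safe; the second check under the lock is what prevents two racing threads from both loading the block.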

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7299) TestMultiParallel fails intermittently in trunk builds

2012-12-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531543#comment-13531543
 ] 

Ted Yu commented on HBASE-7299:
---

I looped TestMultiParallel 9 times in the 0.94 branch and didn't see a test failure.

Looks like the failure is unique to trunk.

 TestMultiParallel fails intermittently in trunk builds
 --

 Key: HBASE-7299
 URL: https://issues.apache.org/jira/browse/HBASE-7299
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Priority: Critical
 Fix For: 0.96.0


 From trunk build #3598:
 {code}
  testFlushCommitsNoAbort(org.apache.hadoop.hbase.client.TestMultiParallel): 
 Count of regions=8
 {code}
 It failed in 3595 as well:
 {code}
 java.lang.AssertionError: Server count=2, abort=true expected:<1> but was:<2>
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at 
 org.apache.hadoop.hbase.client.TestMultiParallel.doTestFlushCommits(TestMultiParallel.java:267)
   at 
 org.apache.hadoop.hbase.client.TestMultiParallel.testFlushCommitsWithAbort(TestMultiParallel.java:226)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5898) Consider double-checked locking for block cache lock

2012-12-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531555#comment-13531555
 ] 

Lars Hofhansl commented on HBASE-5898:
--

Thanks Shrijeet. How reliably have you seen this before (once a day, once a 
month, etc.)?

If this really causes issues we should:
# ship with -XX:+UseMembar by default in hbase-env.sh
# document that this must be set

Are we confident enough in this to do that?


 Consider double-checked locking for block cache lock
 

 Key: HBASE-5898
 URL: https://issues.apache.org/jira/browse/HBASE-5898
 Project: HBase
  Issue Type: Improvement
  Components: Performance
Affects Versions: 0.94.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: 5898-0.94.txt, 5898-TestBlocksRead.txt, 5898-v2.txt, 
 5898-v3.txt, 5898-v4.txt, 5898-v4.txt, HBASE-5898-0.patch, 
 HBASE-5898-1.patch, HBASE-5898-1.patch, hbase-5898.txt


 Running a workload with a high query rate against a dataset that fits in 
 cache, I saw a lot of CPU being used in IdLock.getLockEntry, called by 
 HFileReaderV2.readBlock. Even though it was all cache hits, a lot of CPU was 
 wasted on lock management here. I wrote a quick patch to switch to 
 double-checked locking and it improved throughput substantially for this 
 workload.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

