[jira] [Commented] (HBASE-6592) [shell] Add means of custom formatting output by column
[ https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451774#comment-13451774 ]

Hadoop QA commented on HBASE-6592:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12544431/hbase-6592-v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The patch appears to cause the mvn compile goal to fail.
    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
      org.apache.hadoop.hbase.client.TestShell

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2835//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2835//console

This message is automatically generated.

[shell] Add means of custom formatting output by column
-------------------------------------------------------

         Key: HBASE-6592
         URL: https://issues.apache.org/jira/browse/HBASE-6592
     Project: HBase
  Issue Type: New Feature
  Components: shell
    Reporter: stack
    Priority: Minor
      Labels: noob
 Attachments: hbase-6592.patch, hbase-6592-v2.patch, hbase-6952-v1.patch

See Jacques' suggestion toward the end of this thread for how we should allow adding a custom formatter per column, to be used when outputting column content in the shell: http://search-hadoop.com/m/2WxUB1fuxL11/Printing+integers+in+the+Hbase+shell&subj=Printing+integers+in+the+Hbase+shell

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
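The per-column formatter idea from the linked thread could be sketched as a registry that maps a formatter name to a function rendering a cell's raw bytes. This is a hypothetical illustration in plain Java, not the shell's JRuby code or any HBase API; the `Formatter` interface and the names `toString`/`toLong` are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ColumnFormatters {
    // Hypothetical per-column formatter: raw cell bytes in, display string out.
    interface Formatter {
        String format(byte[] value);
    }

    private static final Map<String, Formatter> FORMATTERS = new HashMap<>();
    static {
        // Default: render the bytes as a UTF-8 string.
        FORMATTERS.put("toString", v -> new String(v, StandardCharsets.UTF_8));
        // Render an 8-byte big-endian value as a long (what Bytes.toLong does).
        FORMATTERS.put("toLong", v -> Long.toString(ByteBuffer.wrap(v).getLong()));
    }

    // Look up the named formatter, falling back to the string renderer.
    public static String format(String formatterName, byte[] value) {
        return FORMATTERS.getOrDefault(formatterName, FORMATTERS.get("toString")).format(value);
    }
}
```

A shell integration would then only need to carry a formatter name per column spec and call into such a registry when printing each cell.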
[jira] [Updated] (HBASE-6592) [shell] Add means of custom formatting output by column
[ https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Huang updated HBASE-6592:
-----------------------------
    Attachment:     (was: hbase-6592-v2.patch)
[jira] [Updated] (HBASE-6592) [shell] Add means of custom formatting output by column
[ https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Huang updated HBASE-6592:
-----------------------------
    Attachment: hbase-6592-v2.patch
[jira] [Commented] (HBASE-6592) [shell] Add means of custom formatting output by column
[ https://issues.apache.org/jira/browse/HBASE-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451794#comment-13451794 ]

Hadoop QA commented on HBASE-6592:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12544435/hbase-6592-v2.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The patch appears to cause the mvn compile goal to fail.
    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2836//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2836//console

This message is automatically generated.
[jira] [Commented] (HBASE-6746) Impacts of HBASE-6435 vs. HDFS 2.0 trunk
[ https://issues.apache.org/jira/browse/HBASE-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451856#comment-13451856 ]

nkeywal commented on HBASE-6746:
--------------------------------

Committed revision 1382723.

Impacts of HBASE-6435 vs. HDFS 2.0 trunk
----------------------------------------

             Key: HBASE-6746
             URL: https://issues.apache.org/jira/browse/HBASE-6746
         Project: HBase
      Issue Type: Bug
      Components: master, regionserver, test
Affects Versions: 0.96.0
        Reporter: nkeywal
        Assignee: nkeywal
         Fix For: 0.96.0
     Attachments: 6746.v1.patch

When using the trunk of HDFS branch 2, I had two errors linked to HBASE-6435:
- a missing test for null
- a removed method.

This fixes it:
- add the test
- make the test case less dependent on HDFS internals.
[jira] [Resolved] (HBASE-6742) Change default test parallelisation level to 5
[ https://issues.apache.org/jira/browse/HBASE-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nkeywal resolved HBASE-6742.
----------------------------
    Resolution: Fixed

Change default test parallelisation level to 5
----------------------------------------------

             Key: HBASE-6742
             URL: https://issues.apache.org/jira/browse/HBASE-6742
         Project: HBase
      Issue Type: Improvement
      Components: test
Affects Versions: 0.96.0
        Reporter: nkeywal
        Assignee: nkeywal
        Priority: Minor
         Fix For: 0.96.0
     Attachments: hbase-6742.v1.patch

Tests will be faster. The gain is not visible if a test hangs for 15 minutes, but tests should not hang.
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Priyadarshini updated HBASE-6698:
---------------------------------
    Attachment: HBASE-6698_6.patch

Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
------------------------------------------------------------------

         Key: HBASE-6698
         URL: https://issues.apache.org/jira/browse/HBASE-6698
     Project: HBase
  Issue Type: Improvement
    Reporter: ramkrishna.s.vasudevan
     Fix For: 0.96.0
 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698.patch

Currently the checkAndPut and checkAndDelete APIs internally call internalPut and internalDelete. Maybe we can just call doMiniBatchMutation instead. This will help in the future: if we have hooks, and a coprocessor handles certain cases in doMiniBatchMutation, the same handling applies when doing a put through checkAndPut or a delete through checkAndDelete.
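The refactor described above funnels both check-and-put and check-and-delete through a single batch-mutation path, so any hook placed there fires for both operations. A toy sketch of that shape (not HBase internals; a string map stands in for the region, and all names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class MiniBatchSketch {
    private final Map<String, String> store = new HashMap<>();

    // The single mutation path: a coprocessor-style hook placed here
    // would cover puts and deletes alike, which is the point of the refactor.
    private void doMiniBatchMutation(String row, String value) {
        if (value == null) {
            store.remove(row);     // acts as a delete
        } else {
            store.put(row, value); // acts as a put
        }
    }

    // One check-and-mutate entry point: newValue == null models checkAndDelete,
    // non-null models checkAndPut. Returns whether the mutation was applied.
    public boolean checkAndMutate(String row, String expected, String newValue) {
        if (!Objects.equals(store.get(row), expected)) {
            return false; // check failed; no mutation happens
        }
        doMiniBatchMutation(row, newValue);
        return true;
    }
}
```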
[jira] [Commented] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451876#comment-13451876 ]

rajeshbabu commented on HBASE-6438:
-----------------------------------

@Ted
When I ran the test suite locally, the test cases below always fail (with or without this patch) because of environment problems. I ran the failed tests individually on our Jenkins multiple times; they always pass.
{code}
Failed tests:
  testPermMask(org.apache.hadoop.hbase.util.TestFSUtils): expected:rwx-- but was:rwxrwxrwx

Tests in error:
  testCacheOnWriteInSchema[1](org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/8a5bb561-edfc-4fab-9358-7ab726cb44fc/TestCacheOnWriteInSchema/logs
  testCacheOnWriteInSchema[2](org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/8a5bb561-edfc-4fab-9358-7ab726cb44fc/TestCacheOnWriteInSchema/logs
  testWholesomeSplit(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): Failed delete of /mnt/F/hbase94Com/target/test-data/9d7234b4-1f6a-42a7-bbb1-641eb464b7e6/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/table/4bbe087ebab2243b8b9633bb3d870f4c
  testRollback(org.apache.hadoop.hbase.regionserver.TestSplitTransaction): Failed delete of /mnt/F/hbase94Com/target/test-data/4afca7c8-ee29-47fb-b660-f2ee661bced7/org.apache.hadoop.hbase.regionserver.TestSplitTransaction/table/ad08ee3070175df954844582816d5927
  testOffPeakCompactionRatio(org.apache.hadoop.hbase.regionserver.TestCompactSelection): Target HLog directory already exists: /mnt/F/hbase94Com/target/test-data/dd6ca8f4-4321-42d8-825b-fc6a42ab84c0/TestCompactSelection/logs

Tests run: 1590, Failures: 1, Errors: 5, Skipped: 12

Running org.apache.hadoop.hbase.regionserver.TestSplitTransaction
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.383 sec
Results: Tests run: 7, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.regionserver.TestCompactSelection
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.264 sec
Results: Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.regionserver.TestCacheOnWriteInSchema
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.564 sec
Results: Tests run: 3, Failures: 0, Errors: 0, Skipped: 0

Running org.apache.hadoop.hbase.util.TestFSUtils
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.43 sec
Results: Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{code}

RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
----------------------------------------------------------------------------------------------

         Key: HBASE-6438
         URL: https://issues.apache.org/jira/browse/HBASE-6438
     Project: HBase
  Issue Type: Bug
    Reporter: ramkrishna.s.vasudevan
    Assignee: rajeshbabu
     Fix For: 0.96.0, 0.92.3, 0.94.2
 Attachments: HBASE-6438_2.patch, HBASE-6438_94.patch, HBASE-6438_trunk.patch

Looking at some of the recent region assignment issues, RegionAlreadyInTransitionException (RAITE) is one error after which the region assignment may or may not happen (in the sense that we need to wait for the TM to assign). In HBASE-6317 we hit one such problem due to RegionAlreadyInTransitionException on master restart.

Consider the following case: for some reason, such as a master restart or an external assign call, we try to assign a region that is already being opened on a RS. The next call to assign has already changed the state of the znode, so the open currently in progress on the RS is affected and fails. The second assignment also fails, getting a RAITE. In the end, neither assignment carries on.

The idea is to find out whether a given RAITE can be retried or not. We have the following cases:
- The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
- The RS may be in the openRegion step.
- The RS may be trying to transition OPENING to OPENED.
- The RS is yet to add the region to its online regions.

In openRegion() and updateMeta(), any failure moves the znode to FAILED_OPEN, so in those cases retrying after a RAITE should be OK. In the other cases the assignment is stopped. The idea is to add the current state of the region assignment to the RIT map on the RS side, and use that info to determine whether the assignment can be retried on getting a RAITE. Considering the current work going on in the AM, please do share whether this is needed at least in the 0.92/0.94 versions.
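The retry decision sketched in the description could be modeled as a lookup keyed on the phase the RS-side open had reached when the RAITE was raised. The phase names below are illustrative, not actual HBase states; this is a sketch of the idea, not the patch.

```java
public class RaiteRetryPolicy {
    // Hypothetical phases of a region open on the RS side, mirroring the
    // four cases listed in the issue description.
    enum OpenPhase {
        OFFLINE_TO_OPENING,  // znode not yet moved from OFFLINE to OPENING
        OPENING_REGION,      // RS inside openRegion()
        UPDATING_META,       // RS inside updateMeta()
        OPENING_TO_OPENED,   // znode transitioning OPENING -> OPENED
        ADDING_ONLINE        // region not yet in the RS's online set
    }

    // Per the description: openRegion()/updateMeta() failures move the znode
    // to FAILED_OPEN, so a retry after a RAITE in those phases is safe;
    // in the other phases the assignment should not be retried blindly.
    public static boolean canRetry(OpenPhase phase) {
        switch (phase) {
            case OPENING_REGION:
            case UPDATING_META:
                return true;
            default:
                return false;
        }
    }
}
```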
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451881#comment-13451881 ]

Priyadarshini commented on HBASE-6698:
--------------------------------------

The latest patch renames putsAndLocks to mutateWithLocks.
{code}
Delete delete = new Delete(new byte[0]);
{code}
Now the delete tries to obtain the row lock through mutation.getRow(). This change is required for test-case purposes, to get the row lock; this Delete API is used only in test cases.
[jira] [Commented] (HBASE-6746) Impacts of HBASE-6435 vs. HDFS 2.0 trunk
[ https://issues.apache.org/jira/browse/HBASE-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451886#comment-13451886 ]

Hudson commented on HBASE-6746:
-------------------------------

Integrated in HBase-TRUNK #3320 (See [https://builds.apache.org/job/HBase-TRUNK/3320/])
HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

Result = FAILURE
nkeywal :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451887#comment-13451887 ]

Hudson commented on HBASE-6435:
-------------------------------

Integrated in HBase-TRUNK #3320 (See [https://builds.apache.org/job/HBase-TRUNK/3320/])
HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

Result = FAILURE
nkeywal :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java

Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
------------------------------------------------------------------------------------------------

             Key: HBASE-6435
             URL: https://issues.apache.org/jira/browse/HBASE-6435
         Project: HBase
      Issue Type: Improvement
      Components: master, regionserver
Affects Versions: 0.96.0
        Reporter: nkeywal
        Assignee: nkeywal
         Fix For: 0.96.0
     Attachments: 6435.unfinished.patch, 6435.v10.patch, 6435.v10.patch, 6435.v12.patch, 6435.v12.patch, 6435.v12.patch, 6435-v12.txt, 6435.v13.patch, 6435.v14.patch, 6435.v2.patch, 6435.v7.patch, 6435.v8.patch, 6435.v9.patch, 6435.v9.patch, 6535.v11.patch

HBase writes a Write-Ahead-Log to recover from hardware failure. This log is written on HDFS. Through ZooKeeper, HBase usually gets informed within 30s that it should start the recovery process, which means reading the Write-Ahead-Log to replay the edits on the other servers.

In standard deployments, HBase processes (regionservers) are deployed on the same boxes as the datanodes. That means that when a box stops, we have actually lost both the regionserver and the datanode, and with it one copy of the edits. As HDFS only marks a node as dead after ~10 minutes, the dead node still appears available when we try to read the blocks to recover. As such, we delay the recovery process by 60 seconds, as the read will usually fail with a socket timeout. If the file is still open for writing, it adds an extra 20s, plus a risk of losing edits if we connect over IPC to the dead DN.

Possible solutions are:
- Shorter dead-datanode detection by the NN. Requires an NN code change.
- Better dead-datanode management in the DFSClient. Requires a DFS code change.
- NN customisation to write the WAL files to another DN instead of the local one.
- Reordering the blocks returned by the NN on the client side, putting the blocks on the same DN as the dead RS at the end of the priority queue. Requires a DFS code change or some kind of workaround.

The solution retained is the last one. Compared to what was discussed on the mailing list, the proposed patch does not modify HDFS source code but adds a proxy. This is for two reasons:
- Some HDFS functions managing block order are static (MD5MD5CRC32FileChecksum). Implementing the hook in the DFSClient would require implementing the fix partially, changing the DFS interface to make this function non-static, or making the hook static. None of these solutions is very clean.
- Adding a proxy allows putting all the code in HBase, simplifying dependency management.

Nevertheless, it would be better to have this in HDFS. But that solution would only target the latest version, and it could allow minimal interface changes such as non-static methods. Moreover, writing the blocks to a non-local DN would be an even better long-term solution.
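The retained solution (client-side reordering of the block locations returned by the NN) can be sketched, under assumptions, as a stable sort that pushes replicas hosted on the dead regionserver's box to the end of the candidate list. The real patch does this inside a proxy around the HDFS client (see HFileSystem and TestBlockReorder above); the list-of-hostnames representation here is a simplification.

```java
import java.util.Comparator;
import java.util.List;

public class BlockReorderSketch {
    // Move every replica located on the dead RS's host to the end of the
    // location list, so the DFS client tries healthy datanodes first.
    // List.sort is stable, so the relative order of healthy replicas
    // (the NN's original preference) is preserved.
    public static void deprioritizeDeadHost(List<String> locations, String deadHost) {
        locations.sort(Comparator.comparingInt(
                (String loc) -> loc.equals(deadHost) ? 1 : 0));
    }
}
```

With this ordering, the socket-timeout penalty of contacting the dead colocated datanode is only paid if every other replica fails first.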
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451909#comment-13451909 ]

Hadoop QA commented on HBASE-6698:
----------------------------------

-1 overall. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/1258/HBASE-6698_6.patch
  against trunk revision .

    +1 @author. The patch does not contain any @author tags.
    +1 tests included. The patch appears to include 3 new or modified tests.
    +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
    +1 javadoc. The javadoc tool did not generate any warning messages.
    -1 javac. The patch appears to cause the mvn compile goal to fail.
    -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
    +1 release audit. The applied patch does not increase the total number of release audit warnings.
    -1 core tests. The patch failed these unit tests:
      org.apache.hadoop.hbase.client.TestFromClientSide
      org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
      org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort
      org.apache.hadoop.hbase.master.TestAssignmentManager
      org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2837//testReport/
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2837//console

This message is automatically generated.
[jira] [Commented] (HBASE-6746) Impacts of HBASE-6435 vs. HDFS 2.0 trunk
[ https://issues.apache.org/jira/browse/HBASE-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451912#comment-13451912 ]

Hudson commented on HBASE-6746:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #168 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/168/])
HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

Result = FAILURE
nkeywal :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java
[jira] [Commented] (HBASE-5631) hbck should handle case where .tableinfo file is missing.
[ https://issues.apache.org/jira/browse/HBASE-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451913#comment-13451913 ]

Hudson commented on HBASE-5631:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #168 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/168/])
HBASE-5631 ADDENDUM (extra comments) (Revision 1382627)

Result = FAILURE
jmhsieh :
Files :
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java

hbck should handle case where .tableinfo file is missing.
---------------------------------------------------------

             Key: HBASE-5631
             URL: https://issues.apache.org/jira/browse/HBASE-5631
         Project: HBase
      Issue Type: Improvement
      Components: hbck
Affects Versions: 0.92.2, 0.94.0, 0.96.0
        Reporter: Jonathan Hsieh
        Assignee: Jie Huang
         Fix For: 0.96.0, 0.92.3, 0.94.2
     Attachments: hbase-5631-addendum.patch, hbase-5631.patch, hbase-5631-v1.patch, hbase-5631-v2.patch

0.92+ branches have a .tableinfo file which could be missing from HDFS. hbck should be able to detect and repair this properly.
[jira] [Commented] (HBASE-6435) Reading WAL files after a recovery leads to time lost in HDFS timeouts when using dead datanodes
[ https://issues.apache.org/jira/browse/HBASE-6435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451914#comment-13451914 ]

Hudson commented on HBASE-6435:
-------------------------------

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #168 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/168/])
HBASE-6746 Impacts of HBASE-6435 vs. HDFS 2.0 trunk (Revision 1382723)

Result = FAILURE
nkeywal :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/fs/TestBlockReorder.java
[jira] [Created] (HBASE-6748) Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback
Jieshan Bean created HBASE-6748: --- Summary: Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback Key: HBASE-6748 URL: https://issues.apache.org/jira/browse/HBASE-6748 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.1, 0.96.0 Reporter: Jieshan Bean You can ealily understand the problem from the below logs: [2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=3 [2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=2 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=1 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=0 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 393] failed to create task 
node/hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 353] Error splitting /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775807 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775806 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775805 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775804 [2012-09-01 11:41:02,065] [WARN ] 
[MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775803 ... [2012-09-01 11:41:03,307] [ERROR] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.zookeeper.ClientCnxn 623] Caught unexpected throwable java.lang.StackOverflowError -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
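The runaway counter in the logs above (retries counting down from 9223372036854775807, i.e. Long.MAX_VALUE, until a StackOverflowError) comes from the delete callback re-invoking the delete on the same stack frame while the ZooKeeper session is already expired. A minimal self-contained sketch of that shape (names and structure are illustrative stand-ins, not HBase's actual SplitLogManager code):

```java
import java.util.concurrent.atomic.AtomicLong;

public class RetryRecursionSketch {
    static final int SESSIONEXPIRED = -112; // ZooKeeper's error code for an expired session

    final AtomicLong attempts = new AtomicLong();

    /**
     * Mimics deleteNode(path, retries): the "async" callback fires with
     * SESSIONEXPIRED immediately, i.e. on the SAME stack frame. With
     * retries == Long.MAX_VALUE the recursion is effectively unbounded
     * and ends in StackOverflowError, as in the HBASE-6748 logs.
     */
    void deleteNode(String path, long retries) {
        attempts.incrementAndGet();
        // Simulated ZK client: session already expired, so the callback
        // is invoked synchronously with an error code.
        onDeleteResult(SESSIONEXPIRED, path, retries);
    }

    void onDeleteResult(int rc, String path, long retries) {
        if (rc == SESSIONEXPIRED && retries > 0) {
            // Recursive retry with no backoff and no stack unwinding:
            deleteNode(path, retries - 1); // <-- the endless recursion
        }
    }

    public static void main(String[] args) {
        RetryRecursionSketch s = new RetryRecursionSketch();
        try {
            s.deleteNode("/hbase/splitlog/task", Long.MAX_VALUE);
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError after " + s.attempts.get() + " attempts");
        }
    }
}
```

Because the simulated client fails synchronously, each retry adds stack frames instead of rescheduling, so even a huge retry budget ends in a stack overflow rather than a bounded retry loop; retrying from an executor or a plain loop would avoid the recursion.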
[jira] [Commented] (HBASE-6408) Naming and documenting of the hadoop-metrics2.properties file
[ https://issues.apache.org/jira/browse/HBASE-6408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451941#comment-13451941 ] Michael Drzal commented on HBASE-6408: -- +1 Naming and documenting of the hadoop-metrics2.properties file - Key: HBASE-6408 URL: https://issues.apache.org/jira/browse/HBASE-6408 Project: HBase Issue Type: Sub-task Affects Versions: 0.96.0 Reporter: Elliott Clark Assignee: Elliott Clark Priority: Blocker Attachments: HBASE-6408-0.patch hadoop-metrics2.properties is currently where metrics2 loads its sinks. This file could be better named: hadoop-hbase-metrics2.properties. In addition, it needs examples like the current hadoop-metrics.properties has. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
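For reference, a sketch of the kind of examples the issue asks for (FileSink is the stock Hadoop metrics2 file sink; the "hbase" prefix and the output filename below are assumptions, not settled naming):

```properties
# Sketch only: the sort of commented example hadoop-metrics.properties ships with.
# Route HBase metrics to a file sink named "file".
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# Sample all metrics sources every 10 seconds.
*.period=10
# Output file for the sink (assumed name for illustration).
hbase.sink.file.filename=hbase-metrics.out
```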
[jira] [Commented] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451947#comment-13451947 ] Michael Drzal commented on HBASE-6419: -- [~ted_yu] can we close this out? PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6749) Compact one expired HFile all the time
Jieshan Bean created HBASE-6749: --- Summary: Compact one expired HFile all the time Key: HBASE-6749 URL: https://issues.apache.org/jira/browse/HBASE-6749 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.1 Reporter: Jieshan Bean Assignee: Jieshan Bean It's an interesting issue. We found there was 1 HFile that kept changing its name all the time. After digging in more, we found one strange behavior in the compaction flow. Here's the problem (we set the TTL property in our table): there were 10 HFiles and only 1 expired HFile when this problem occurred: 2012-09-07 02:21:05,298 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/221f56905cbd4bf09bd4d5d9dceb113a.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=118730, majorCompaction=false 2012-09-07 02:21:05,309 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/297b45a6c5f541dca05105ab098dab8d, isReference=false, isBulkLoadResult=false, seqid=122018, majorCompaction=false 2012-09-07 02:21:05,326 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/4a4a4598bc0443c9be087812052d6796.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=119850, majorCompaction=false 2012-09-07 02:21:05,348 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/8c6d56c9bafb4b0eb0dd6e04e41ca5b7, isReference=false, isBulkLoadResult=false, seqid=123135, majorCompaction=false 2012-09-07 02:21:05,357 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a6c2873a646c425f87a6a8a271a9904e, isReference=false, isBulkLoadResult=false, seqid=76561, majorCompaction=false 2012-09-07 02:21:05,370 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a712a2f79bd247f48405d2c6a91757ab.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=120951, majorCompaction=false 2012-09-07 02:21:05,381 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a717c63214534cb0aaf8e695147fde46, isReference=false, isBulkLoadResult=false, seqid=122763, majorCompaction=false 2012-09-07 02:21:05,431 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/c456f3831b094ac3a0590678acbf27a5.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=120579, majorCompaction=false 2012-09-07 02:21:05,518 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/dde4e56b131a4ffdaec8f9574bffa5ab.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=121651, majorCompaction=false 2012-09-07 02:21:05,593 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/ee62844594f2474e88186dbde673c802.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=119478, majorCompaction=false Compaction was triggered during region opening. As we know, compaction should choose the expired HFiles to compact first, so the 1 expired HFile was chosen. Since no KeyValue was there, compaction deleted the old HFile and created a new one with minimumTimestamp = -1 and maximumTimestamp = -1. So after the first compaction, there were still 10 HFiles, and compaction was triggered again and again. 
2012-09-07 02:21:06,079 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting the expired store file by compaction: hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/72468dff1cd94c4fb9cf9196cc3183b7 whose maxTimeStamp is -1 while the max expired timestamp is 1344824466079 2012-09-07 02:21:06,080 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/72468dff1cd94c4fb9cf9196cc3183b7, keycount=0, bloomtype=NONE, size=558, encoding=NONE 2012-09-07 02:21:06,082 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file:hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/.tmp/8239019ff92f49bfab26b02ca43bc26a with permission:rwxrwxrwx -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
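The loop described above reduces to the expiry predicate: an empty HFile written with maximumTimestamp = -1 always looks "older than the TTL cutoff", so it is re-selected on every pass. A minimal sketch of that predicate (an assumption-laden simplification, not HBase's actual CompactSelection code):

```java
public class ExpiredFileSketch {

    /** True if every cell in a store file is older than the TTL cutoff. */
    static boolean isExpired(long maxTimestamp, long now, long ttlMs) {
        long oldestAllowed = now - ttlMs;
        // maxTimestamp == -1 (an HFile containing zero KeyValues) is always
        // below oldestAllowed, so the freshly written empty file is selected
        // for compaction again on the next pass -- the loop in HBASE-6749.
        return maxTimestamp < oldestAllowed;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long ttl = 3_600_000L; // 1 hour TTL

        // The empty output file (maxTimeStamp -1, as in the log) re-qualifies:
        System.out.println(isExpired(-1L, now, ttl));       // true
        // A file holding fresh data does not:
        System.out.println(isExpired(now - 10, now, ttl));  // false
    }
}
```

A fix along these lines would stop rewriting a store file that is already empty (keycount=0), or special-case the -1 sentinel, so the selection terminates after the first pass.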
[jira] [Updated] (HBASE-6749) Compact one expired HFile all the time
[ https://issues.apache.org/jira/browse/HBASE-6749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jieshan Bean updated HBASE-6749: Description: It's an interesting issue. We found there was 1 HFile that kept changing its name all the time. After digging in more, we found one strange behavior in the compaction flow. Here's the problem (we set the TTL property in our table): there were 10 HFiles and only 1 expired HFile when this problem occurred: {noformat} 2012-09-07 02:21:05,298 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/221f56905cbd4bf09bd4d5d9dceb113a.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=118730, majorCompaction=false 2012-09-07 02:21:05,309 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/297b45a6c5f541dca05105ab098dab8d, isReference=false, isBulkLoadResult=false, seqid=122018, majorCompaction=false 2012-09-07 02:21:05,326 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/4a4a4598bc0443c9be087812052d6796.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=119850, majorCompaction=false 2012-09-07 02:21:05,348 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/8c6d56c9bafb4b0eb0dd6e04e41ca5b7, isReference=false, isBulkLoadResult=false, seqid=123135, majorCompaction=false 2012-09-07 02:21:05,357 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a6c2873a646c425f87a6a8a271a9904e, isReference=false, isBulkLoadResult=false, seqid=76561, majorCompaction=false 2012-09-07 02:21:05,370 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded 
hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a712a2f79bd247f48405d2c6a91757ab.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=120951, majorCompaction=false 2012-09-07 02:21:05,381 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/a717c63214534cb0aaf8e695147fde46, isReference=false, isBulkLoadResult=false, seqid=122763, majorCompaction=false 2012-09-07 02:21:05,431 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/c456f3831b094ac3a0590678acbf27a5.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=120579, majorCompaction=false 2012-09-07 02:21:05,518 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/dde4e56b131a4ffdaec8f9574bffa5ab.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=121651, majorCompaction=false 2012-09-07 02:21:05,593 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/ee62844594f2474e88186dbde673c802.3ed2e43476ca2c614e33d0e1255c79a9, isReference=true, isBulkLoadResult=false, seqid=119478, majorCompaction=false {noformat} Compaction was triggered during region opening. As we know, compaction should choose the expired HFiles to compact first, so the 1 expired HFile was chosen. Since no KeyValue was there, compaction deleted the old HFile and created a new one with minimumTimestamp = -1 and maximumTimestamp = -1. So after the first compaction, there were still 10 HFiles, and compaction was triggered again and again. 
{noformat} 2012-09-07 02:21:06,079 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactSelection: Deleting the expired store file by compaction: hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/72468dff1cd94c4fb9cf9196cc3183b7 whose maxTimeStamp is -1 while the max expired timestamp is 1344824466079 2012-09-07 02:21:06,080 DEBUG org.apache.hadoop.hbase.regionserver.Store: Compacting hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/value/72468dff1cd94c4fb9cf9196cc3183b7, keycount=0, bloomtype=NONE, size=558, encoding=NONE 2012-09-07 02:21:06,082 DEBUG org.apache.hadoop.hbase.util.FSUtils: Creating file:hdfs://hacluster/hbase/fhtest/97ac7ea6f732b7c10fe504a54ab02441/.tmp/8239019ff92f49bfab26b02ca43bc26a with permission:rwxrwxrwx {noformat} was: It's a interesting issue. We found there's 1 HFile keeped changing its name all the time. After dig in more, we found one strange behavior in compaction flow. Here's the problem(We set the TTL property in our table): There were 10 HFiles and only 1 expired HFile when this problem occured: {noformat} 2012-09-07 02:21:05,298 DEBUG org.apache.hadoop.hbase.regionserver.Store: loaded
[jira] [Commented] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451950#comment-13451950 ] Michael Drzal commented on HBASE-6429: -- [~zhi...@ebaysf.com] can we close this out? Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Assignee: Jie Huang Fix For: 0.96.0 Attachments: hbase-6429_0_94_0.patch, hbase-6429-trunk.patch, hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch Currently, if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
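The check the issue asks to extend can be sketched as follows (a simplified stand-in: the real check lives in the region scanner and guards Scan.setBatch() against Filter#hasFilterRow(); the report is that only filterRow(List<KeyValue>) was covered, not filterRow()):

```java
public class FilterBatchCheckSketch {

    /**
     * @param batch        max KeyValues returned per next() call (0 = unlimited)
     * @param hasFilterRow true if the filter overrides filterRow() or
     *                     filterRow(List<KeyValue>)
     */
    static void validate(int batch, boolean hasFilterRow) {
        // With a batch limit, a row can reach the filter in pieces, so a
        // row-level filter would decide on partial rows. The combination is
        // rejected up front rather than silently misfiltering.
        if (batch > 0 && hasFilterRow) {
            throw new IllegalArgumentException(
                "IncompatibleFilterException: cannot combine a row-level "
                    + "filter with a scan batch limit");
        }
    }

    public static void main(String[] args) {
        validate(0, true);    // unlimited batch: fine
        validate(100, false); // batch limit but no row filter: fine
        try {
            validate(100, true); // the incompatible combination
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```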
[jira] [Updated] (HBASE-6430) Few modifications in section 2.4.2.1 of Apache HBase Reference Guide
[ https://issues.apache.org/jira/browse/HBASE-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Drzal updated HBASE-6430: - Labels: noob (was: ) Few modifications in section 2.4.2.1 of Apache HBase Reference Guide Key: HBASE-6430 URL: https://issues.apache.org/jira/browse/HBASE-6430 Project: HBase Issue Type: Improvement Reporter: Mohammad Tariq Iqbal Priority: Minor Labels: noob Attachments: HBASE-6430.txt Quite often, newbies face some issues while configuring HBase in pseudo-distributed mode. I was no exception. I would like to propose some solutions for these problems which worked for me. If the community finds it appropriate, I would like to apply the patch for the same. This is the first time I am trying to do something like this, so please pardon me if I have put it in an inappropriate manner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6750) Provide a variant of ValueFilter that only accepts the latest value (like SingleColumnValueFilter.setLatestVersionOnly)
David Witten created HBASE-6750: --- Summary: Provide a variant of ValueFilter that only accepts the latest value (like SingleColumnValueFilter.setLatestVersionOnly) Key: HBASE-6750 URL: https://issues.apache.org/jira/browse/HBASE-6750 Project: HBase Issue Type: New Feature Components: filters Affects Versions: 0.90.5 Environment: All Reporter: David Witten Currently ValueFilter will return an old value that matches if the latest value does not. I recommend providing an option on ValueFilter, like setLatestVersionOnly, or creating a subclass of ValueFilter that always has this behavior. Below is a custom filter that seems to work, though you may want to copy and frob ValueFilter to just return NEXT_COL where it returns SKIP:
{code}
package dummy.hbasesvr;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.filter.WritableByteArrayComparable;

/**
 * The same as {@link ValueFilter} except it will only look at the latest value
 * for a given column.
 */
public class LatestValueFilter extends ValueFilter {

  /**
   * Writable constructor, do not use.
   */
  public LatestValueFilter() {
  }

  /**
   * Constructor.
   * @param valueCompareOp the compare op for value matching
   * @param valueComparator the comparator for value matching
   */
  public LatestValueFilter(CompareOp valueCompareOp,
      WritableByteArrayComparable valueComparator) {
    super(valueCompareOp, valueComparator);
  }

  @Override
  public ReturnCode filterKeyValue(KeyValue v) {
    // This assumes that given several KeyValues with the same row+fam+qual+val
    // the one with the latest value will be given first.
    ReturnCode superReturnCode = super.filterKeyValue(v);
    if (superReturnCode == ReturnCode.SKIP) {
      return ReturnCode.NEXT_COL;
    }
    return superReturnCode;
  }
}
{code}
Note I am a novice HBase user. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
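Why mapping SKIP to NEXT_COL yields "latest version only" semantics can be shown with a self-contained toy (toy return codes and a newest-first version list, mirroring HBase's KeyValue ordering; not HBase code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LatestOnlySketch {
    enum ReturnCode { INCLUDE, SKIP, NEXT_COL }

    /**
     * Keeps versions equal to {@code wanted}. SKIP merely moves to the next
     * (older) version; NEXT_COL abandons the rest of the column entirely,
     * which is what the proposed LatestValueFilter substitutes.
     */
    static List<String> scanColumn(List<String> versionsNewestFirst,
                                   String wanted, boolean latestOnly) {
        List<String> out = new ArrayList<>();
        for (String v : versionsNewestFirst) {
            ReturnCode rc = v.equals(wanted) ? ReturnCode.INCLUDE
                          : latestOnly ? ReturnCode.NEXT_COL : ReturnCode.SKIP;
            if (rc == ReturnCode.INCLUDE) {
                out.add(v);
            } else if (rc == ReturnCode.NEXT_COL) {
                break; // first miss on the latest version drops the column
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> versions = Arrays.asList("new", "old-match", "older");
        // Plain ValueFilter behavior: an OLD matching version leaks through.
        System.out.println(scanColumn(versions, "old-match", false)); // [old-match]
        // LatestValueFilter behavior: latest didn't match, column is dropped.
        System.out.println(scanColumn(versions, "old-match", true));  // []
    }
}
```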
[jira] [Commented] (HBASE-6431) Some FilterList Constructors break addFilter
[ https://issues.apache.org/jira/browse/HBASE-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451956#comment-13451956 ] Michael Drzal commented on HBASE-6431: -- [~zhi...@ebaysf.com] can we close this out? Some FilterList Constructors break addFilter Key: HBASE-6431 URL: https://issues.apache.org/jira/browse/HBASE-6431 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.92.1, 0.94.0 Reporter: Alex Newman Assignee: Alex Newman Priority: Minor Fix For: 0.96.0 Attachments: 0001-HBASE-6431.-Some-FilterList-Constructors-break-addFi.patch, 6431-v2.txt Some of the constructors for FilterList set the internal list of filters to list types which don't support the add operation. As a result, FilterList(final List<Filter> rowFilters), FilterList(final Filter... rowFilters), FilterList(final Operator operator, final List<Filter> rowFilters), and FilterList(final Operator operator, final Filter... rowFilters) may init private List<Filter> filters = new ArrayList<Filter>(); incorrectly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
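The bug can be sketched with strings standing in for Filter objects (an assumption-laden simplification; the defensive-copy "fix" shown is the obvious remedy, not necessarily the committed patch):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FilterListSketch {

    /**
     * Buggy shape: the varargs constructor stored Arrays.asList(rowFilters)
     * directly. That list is fixed-size, so a later addFilter() throws
     * UnsupportedOperationException.
     */
    static List<String> buggyInternalList(String... rowFilters) {
        return Arrays.asList(rowFilters);
    }

    /** Fixed shape: copy the caller's filters into a mutable ArrayList. */
    static List<String> fixedInternalList(String... rowFilters) {
        return new ArrayList<>(Arrays.asList(rowFilters));
    }

    public static void main(String[] args) {
        List<String> good = fixedInternalList("rowFilter");
        good.add("columnFilter"); // works: mutable copy
        System.out.println(good);

        List<String> bad = buggyInternalList("rowFilter");
        try {
            bad.add("columnFilter"); // addFilter() would blow up here
        } catch (UnsupportedOperationException e) {
            System.out.println("addFilter broken: " + e);
        }
    }
}
```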
[jira] [Commented] (HBASE-6441) MasterFS doesn't set scheme for internal FileSystem
[ https://issues.apache.org/jira/browse/HBASE-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451964#comment-13451964 ] Michael Drzal commented on HBASE-6441: -- [~apurtell] and [~jesse_yates] any consensus on 0.94? MasterFS doesn't set scheme for internal FileSystem --- Key: HBASE-6441 URL: https://issues.apache.org/jira/browse/HBASE-6441 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.96.0, 0.94.2 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: java_HBASE-6441_v0.patch FSUtils.getRootDir() just takes a configuration object, which is used to: 1) Get the name of the root directory 2) Create a filesystem (based on the configured scheme) 3) Qualify the root onto the filesystem However, the FileSystem from the master filesystem won't generate the correctly qualified root directory under hadoop-2.0 (though it works fine on hadoop-1.0). Seems to be an issue with the configuration parameters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6431) Some FilterList Constructors break addFilter
[ https://issues.apache.org/jira/browse/HBASE-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6431: -- Resolution: Fixed Status: Resolved (was: Patch Available) Some FilterList Constructors break addFilter Key: HBASE-6431 URL: https://issues.apache.org/jira/browse/HBASE-6431 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.92.1, 0.94.0 Reporter: Alex Newman Assignee: Alex Newman Priority: Minor Fix For: 0.96.0 Attachments: 0001-HBASE-6431.-Some-FilterList-Constructors-break-addFi.patch, 6431-v2.txt Some of the constructors for FilterList set the internal list of filters to list types which don't support the add operation. As a result, FilterList(final List<Filter> rowFilters), FilterList(final Filter... rowFilters), FilterList(final Operator operator, final List<Filter> rowFilters), and FilterList(final Operator operator, final Filter... rowFilters) may init private List<Filter> filters = new ArrayList<Filter>(); incorrectly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6431) Some FilterList Constructors break addFilter
[ https://issues.apache.org/jira/browse/HBASE-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6431: -- Hadoop Flags: Reviewed Some FilterList Constructors break addFilter Key: HBASE-6431 URL: https://issues.apache.org/jira/browse/HBASE-6431 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.92.1, 0.94.0 Reporter: Alex Newman Assignee: Alex Newman Priority: Minor Fix For: 0.96.0 Attachments: 0001-HBASE-6431.-Some-FilterList-Constructors-break-addFi.patch, 6431-v2.txt Some of the constructors for FilterList set the internal list of filters to list types which don't support the add operation. As a result, FilterList(final List<Filter> rowFilters), FilterList(final Filter... rowFilters), FilterList(final Operator operator, final List<Filter> rowFilters), and FilterList(final Operator operator, final Filter... rowFilters) may init private List<Filter> filters = new ArrayList<Filter>(); incorrectly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6429) Filter with filterRow() returning true is incompatible with scan with limit
[ https://issues.apache.org/jira/browse/HBASE-6429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6429: -- Resolution: Fixed Status: Resolved (was: Patch Available) Filter with filterRow() returning true is incompatible with scan with limit --- Key: HBASE-6429 URL: https://issues.apache.org/jira/browse/HBASE-6429 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.96.0 Reporter: Jason Dai Assignee: Jie Huang Fix For: 0.96.0 Attachments: hbase-6429_0_94_0.patch, hbase-6429-trunk.patch, hbase-6429-trunk-v2.patch, hbase-6429-trunk-v3.patch, hbase-6429-trunk-v4.patch Currently, if we scan with both a limit and a Filter with filterRow(List<KeyValue>) implemented, an IncompatibleFilterException will be thrown. The same exception should also be thrown if the filter has its filterRow() implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6419) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220)
[ https://issues.apache.org/jira/browse/HBASE-6419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6419: -- Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) PersistentMetricsTimeVaryingRate gets used for non-time-based metrics (part2 of HBASE-6220) --- Key: HBASE-6419 URL: https://issues.apache.org/jira/browse/HBASE-6419 Project: HBase Issue Type: Improvement Reporter: stack Assignee: Paul Cavallaro Fix For: 0.96.0 Attachments: ServerMetrics_HBASE_6220_Flush_Metrics.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451996#comment-13451996 ] Ted Yu commented on HBASE-6698: --- From https://builds.apache.org/job/PreCommit-HBASE-Build/2837//testReport/org.apache.hadoop.hbase.master/TestAssignmentManager/testShutdownHandler/: {code} org.mockito.exceptions.misusing.WrongTypeOfReturnValue: CatalogTracker$$EnhancerByMockitoWithCGLIB$$5da9aeb6 cannot be returned by isStopped() isStopped() should return boolean {code} Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation -- Key: HBASE-6698 URL: https://issues.apache.org/jira/browse/HBASE-6698 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698.patch Currently the checkAndPut and checkAndDelete APIs internally call internalPut and internalDelete. Maybe we can just call doMiniBatchMutation instead. This will help in the future: if we have some hooks and the coprocessor handles certain cases in doMiniBatchMutation, the same can be done while doing a put through checkAndPut or a delete through checkAndDelete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6748) Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback
[ https://issues.apache.org/jira/browse/HBASE-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6748: -- Description: You can easily understand the problem from the below logs: {code} [2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=3 [2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=2 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=1 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=0 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 393] failed to create task node/hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 [2012-09-01 11:41:02,063] [WARN ] 
[MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 353] Error splitting /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 [2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775807 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775806 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775805 [2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775804 [2012-09-01 11:41:02,065] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for 
/hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775803 ... [2012-09-01 11:41:03,307] [ERROR] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.zookeeper.ClientCnxn 623] Caught unexpected throwable java.lang.StackOverflowError {code}
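The endless recursion in the logs above can be reproduced in miniature. The following is a toy sketch, not the actual SplitLogManager code; the class, method names, and depth limit are invented. It shows how an "async" retry whose failure callback is delivered synchronously on an expired session turns a retry budget of Long.MAX_VALUE into unbounded recursion, ending in the StackOverflowError seen in the logs:

```java
public class RecursiveRetrySketch {
    static long calls = 0;
    static final long MAX_DEPTH = 5_000; // stand-in for the real stack limit

    // Stand-in for the asynchronous ZK delete: with an expired session the
    // failure callback runs before the call returns, i.e. on the same stack.
    static void deleteAsync(long remainingRetries) {
        onDeleteFailed(remainingRetries); // rc=SESSIONEXPIRED delivered inline
    }

    static void onDeleteFailed(long remainingRetries) {
        calls++;
        if (calls > MAX_DEPTH) {
            // the real logs end here with java.lang.StackOverflowError
            throw new IllegalStateException("unbounded recursion detected");
        }
        if (remainingRetries > 0) {
            deleteAsync(remainingRetries - 1); // recurses, never unwinds the stack
        }
    }

    public static void main(String[] args) {
        try {
            deleteAsync(Long.MAX_VALUE); // retries=9223372036854775807, as logged
        } catch (IllegalStateException e) {
            System.out.println("overflowed after " + calls + " nested calls");
        }
    }
}
```

The decrementing retries=922337203685477580x counters in the log are the Long.MAX_VALUE budget going down one stack frame at a time.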
[jira] [Created] (HBASE-6751) Too many retries, leading to a delay to read the HLog after a datanode failure
nkeywal created HBASE-6751: -- Summary: Too many retries, leading to a delay to read the HLog after a datanode failure Key: HBASE-6751 URL: https://issues.apache.org/jira/browse/HBASE-6751 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.94.0, 0.96.0 Reporter: nkeywal When reading an HLog, we need to go to the last block to get the file size. In HDFS 1.0.3, this leads to HDFS-3701 / HBASE-6401. In HDFS branch-2, this bug is fixed, but we have two other issues. 1) For simple cases such as a single dead node, we don't have the effect of HDFS-3703, and the default location order leads us to try to connect to a dead datanode when we should not. This is not analysed yet. A specific JIRA will be created later. 2) If we are redirected to a wrong node, we experience a huge delay. The pseudo code in DFSInputStream#readBlockLength is: {noformat} for (DatanodeInfo datanode : locatedblock.getLocations()) { try { ClientDatanodeProtocol cdp = DFSUtil.createClientDatanodeProtocolProxy( datanode, dfsClient.conf, dfsClient.getConf().socketTimeout, dfsClient.getConf().connectToDnViaHostname, locatedblock); return cdp.getReplicaVisibleLength(locatedblock.getBlock()); } catch (IOException ioe) { // retry } } {noformat} However, with this code, the connection is created with a null RetryPolicy. It is then defaulted to 10 retries, with: {noformat} public static final String IPC_CLIENT_CONNECT_MAX_RETRIES_KEY = "ipc.client.connect.max.retries"; public static final int IPC_CLIENT_CONNECT_MAX_RETRIES_DEFAULT = 10; {noformat} So if the first datanode is bad, we will try it 10 times before trying the second. In the context of HBASE-6738, the split task is cancelled before we have opened the file to split. By nature, it's likely to be a pure HDFS issue, but maybe it can be solved in HBase with the right configuration of ipc.client.connect.max.retries. The ideal fix (in HDFS) would be to try the datanodes once each, and then loop 10 times. 
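The proposed ordering fix can be sketched as follows. This is a hypothetical illustration, not the actual DFSInputStream code; the Datanode interface and the method names are invented stand-ins for DatanodeInfo and the ClientDatanodeProtocol call:

```java
import java.util.Arrays;
import java.util.List;

public class ReadBlockLengthSketch {

    // Stand-in for DatanodeInfo plus the ClientDatanodeProtocol proxy call.
    interface Datanode {
        long getReplicaVisibleLength() throws Exception;
    }

    // One quick attempt per datanode per pass; only then retry the whole list,
    // instead of letting the IPC layer retry the first datanode 10 times.
    static long readBlockLength(List<Datanode> locations, int passes) throws Exception {
        Exception last = null;
        for (int pass = 0; pass < passes; pass++) {
            for (Datanode dn : locations) {
                try {
                    return dn.getReplicaVisibleLength();
                } catch (Exception e) {
                    last = e; // dead or unreachable node: move to the next one
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        List<Datanode> nodes = Arrays.asList(
            () -> { throw new Exception("dead datanode"); }, // first node is down
            () -> 42L);                                      // second one answers
        System.out.println("visible length: " + readBlockLength(nodes, 3));
    }
}
```

With this shape, a dead first datanode costs one failed attempt per pass rather than ipc.client.connect.max.retries (10) consecutive connection attempts before the second datanode is ever tried.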
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6752) On region server failure, serves writes and timeranged reads during the log split.
nkeywal created HBASE-6752: -- Summary: On region server failure, serves writes and timeranged reads during the log split. Key: HBASE-6752 URL: https://issues.apache.org/jira/browse/HBASE-6752 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: nkeywal Priority: Minor Opening for write on failure would mean: - Assign the region to a new regionserver. It marks the region as recovering -- a specific exception is returned to the client when we cannot serve. -- allows clients to know where they stand. The exception can include some time information (failure started at: ...) -- allows them to go immediately to the right regionserver, instead of retrying or calling the region holding meta to get the new address = saves network calls, lowers the load on meta. - Do the split as today. Priority is given to the region server holding the new regions -- helps to share the load balancing code: the split is done by a region server considered available for new regions -- helps locality (the recovered edits are available on the region server) = lowers the network usage - When the split is finished, we're done as of today - While the split is progressing, the region server can -- serve writes --- that's useful for all applications that need to write but not read immediately: --- anything that logs events to analyze them later --- opentsdb is a perfect example. -- serve reads if they have a compatible time range. For heavily used tables, it could help, because: --- we can expect to have a few minutes of data only (as it's loaded) --- the heaviest queries often accept a few (or more) minutes of delay. Some what-ifs: 1) the split fails = Retry until it works. As today. Just that we serve writes. We need to know (as today) that the region has not recovered if we fail again. 2) the regionserver fails during the split = As 1, and as of today. 3) the regionserver fails after the split but before the state change to fully available. = New assign. 
More logs to split (the ones already done and the new ones). 4) the assignment fails = Retry until it works. As today. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6748) Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback
[ https://issues.apache.org/jira/browse/HBASE-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452053#comment-13452053 ] Ted Yu commented on HBASE-6748: --- @Jieshan: Can you tell us under what circumstance the above recursion happened? Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback --- Key: HBASE-6748 URL: https://issues.apache.org/jira/browse/HBASE-6748 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.96.0, 0.94.1 Reporter: Jieshan Bean Priority: Critical
[jira] [Created] (HBASE-6753) Potential bug in Put object toString()
Walter Tietze created HBASE-6753: Summary: Potential bug in Put object toString() Key: HBASE-6753 URL: https://issues.apache.org/jira/browse/HBASE-6753 Project: HBase Issue Type: Bug Components: coprocessors Environment: Cloudera CDH 4.0.1 with hbase 0.92.1-cdh4.0.1 Reporter: Walter Tietze Priority: Minor I'm a newbie to HBase. I implemented a coprocessor, which works nicely with the Cloudera 4.0.1 version. Testing my coprocessor revealed a problem: every time I inserted logging into my prePut method, the Put object was no longer stored into HBase. I analyzed the code and reduced the problem to the fact that calling the toString method on the Put object alone is the reason for this behaviour. There seems to be a problem with the serialization of the object: serialization seems to modify the object, with the result that it is no longer inserted into HBase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6752: -- Summary: On region server failure, serve writes and timeranged reads during the log split (was: On region server failure, serves writes and timeranged reads during the log split.) On region server failure, serve writes and timeranged reads during the log split Key: HBASE-6752 URL: https://issues.apache.org/jira/browse/HBASE-6752 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: nkeywal Priority: Minor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6753) Potential bug in Put object toString()
[ https://issues.apache.org/jira/browse/HBASE-6753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452114#comment-13452114 ] Ted Yu commented on HBASE-6753: --- In Operation.toString(), I see: {code} try { return toJSON(maxCols); } catch (IOException ioe) { return toMap(maxCols).toString(); } {code} Is the jersey-json library on the classpath? If you can provide more logs, that would help us analyze the problem. Potential bug in Put object toString() -- Key: HBASE-6753 URL: https://issues.apache.org/jira/browse/HBASE-6753 Project: HBase Issue Type: Bug Components: coprocessors Environment: Cloudera CDH 4.0.1 with hbase 0.92.1-cdh4.0.1 Reporter: Walter Tietze Priority: Minor -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
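Whatever the root cause of HBASE-6753 turns out to be, the invariant at stake is that toString() must be side-effect free: rendering an object for a log line should never change whether the object can still be written. A minimal sketch of that pattern (illustrative only; SafePut is an invented class, not part of HBase):

```java
public class SafePut {
    private final byte[] row;
    private int size; // example of mutable state that toString() must not touch

    public SafePut(byte[] row) { this.row = row; }

    public void add(byte[] value) { size += value.length; }

    @Override
    public String toString() {
        // reads only; no serialization and no mutation, so logging a SafePut
        // cannot affect a later write of the same object
        return "SafePut{rowLen=" + row.length + ", size=" + size + "}";
    }
}
```

Calling toString() any number of times yields the same string and leaves the object unchanged.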
[jira] [Updated] (HBASE-6630) Port HBASE-6590 to trunk : Assign sequence number to bulk loaded files
[ https://issues.apache.org/jira/browse/HBASE-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-6630: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Port HBASE-6590 to trunk : Assign sequence number to bulk loaded files -- Key: HBASE-6630 URL: https://issues.apache.org/jira/browse/HBASE-6630 Project: HBase Issue Type: Sub-task Affects Versions: 0.94.1 Reporter: Amitanand Aiyer Assignee: Amitanand Aiyer Priority: Minor Fix For: 0.96.0 Attachments: 6590-seq-id-bulk-load.txt, 6630-v2.txt Currently bulk loaded files are not assigned a sequence number. Thus, they can only be used to import historical data dating to the past. There are cases where we want to bulk load current data, but the bulk load mechanism does not support this, as the bulk loaded files are always sorted behind the non-bulkloaded hfiles. Assigning a sequence id to bulk loaded files should solve this issue. StoreFiles within a store are sorted based on the sequenceId. SequenceId is a monotonically increasing number that accompanies every edit written to the WAL. For entries that update the same cell, we would like the latter edit to win. This comparison is accomplished using memstoreTS at the KV level, and sequenceId at the StoreFile level (to order scanners in the KeyValueHeap). BulkLoaded files are generated outside of HBase/RegionServer, so they do not have a sequenceId written in the file. This causes HBase to lose track of the point in time when the BulkLoaded file was imported to HBase, resulting in a behavior that *only* supports viewing bulkLoaded files as files back-filling data from the beginning of time. By assigning a sequence number to the file, we can allow the bulk loaded file to fit in where we want: either at the current time or the beginning of time. The latter is the default, to maintain backward compatibility. Design approach: Store files keep track of the sequence id in the trailer. 
Since we do not wish to edit/rewrite the bulk loaded file upon import, we will encode the assigned sequenceId into the fileName. The filename regex is updated for this purpose. If the sequenceId is encoded in the filename, it will be used as the sequenceId for the file. If none is found, the sequenceId will be considered 0 (as per the default, backward-compatible behavior). To enable clients to request the pre-existing behavior, the command line utility allows for 2 ways to import bulk loaded files: to assign or not assign a sequence number. If a sequence number is assigned, the imported file will be imported with the current sequence id. If the sequence number is not assigned, it will be as if it was backfilling old data from the beginning of time. Compaction behavior: With the current compaction algorithm, bulk loaded files that backfill data to the beginning of time can cause a compaction storm, converting every minor compaction to a major compaction. To address this, these files are excluded from minor compaction, based on a config param (enabled for the messages use case). Since bulk loaded files that are not back-filling data do not cause this issue, they will not be ignored during minor compactions based on the config parameter. This is also required to ensure that there are no holes in the set of files selected for compaction; this is necessary to preserve the order of KV comparison before and after compaction. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
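The filename-encoding idea can be sketched as follows. The _SeqId_ suffix pattern below is an assumption for illustration, not necessarily the exact regex HBase uses; the point is that encode and decode round-trip, and a file with no encoded id falls back to 0, the backward-compatible back-fill behavior:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BulkLoadSeqIdSketch {
    // Hypothetical suffix format; illustrative, not HBase's actual regex.
    private static final Pattern SEQ_ID = Pattern.compile("^(.+)_SeqId_(\\d+)_$");

    // Append the assigned sequence id to the file name instead of
    // rewriting the HFile trailer of an already-written bulk load file.
    static String encode(String fileName, long seqId) {
        return fileName + "_SeqId_" + seqId + "_";
    }

    // Parse the id back on open; no encoded id means 0, i.e. the file
    // sorts behind everything, back-filling from the beginning of time.
    static long decode(String fileName) {
        Matcher m = SEQ_ID.matcher(fileName);
        return m.matches() ? Long.parseLong(m.group(2)) : 0L;
    }

    public static void main(String[] args) {
        String named = encode("hfile-abc", 57);
        System.out.println(named + " -> seqId " + decode(named));
        System.out.println("hfile-abc -> seqId " + decode("hfile-abc"));
    }
}
```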
[jira] [Assigned] (HBASE-6745) Always flush region based on memstore size could hold hlog files from archiving
[ https://issues.apache.org/jira/browse/HBASE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang reassigned HBASE-6745: -- Assignee: Jimmy Xiang Always flush region based on memstore size could hold hlog files from archiving --- Key: HBASE-6745 URL: https://issues.apache.org/jira/browse/HBASE-6745 Project: HBase Issue Type: Bug Components: replication, wal Reporter: Jimmy Xiang Assignee: Jimmy Xiang Currently, the memstore flusher always chooses the region with the biggest memstore to flush. Suppose I have two tables: one is very actively updated, while the other is periodically updated. The active one has the biggest memstore all the time and is flushed all the time, but the inactive one never gets a chance to flush. Since it is not flushed, the hlog file can't be archived, although there are lots of hlog files. If the active table happens to have big updates all the time, the hlog files could cause huge disk space pressure. Besides memstore size, periodically flushing regions based on hlog roll time is helpful for hlog archiving/replication. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
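The starvation concern as reported can be illustrated with a toy model (invented names and fields, not HBase's actual flush policy). A size-based choice always picks the active region, while choosing by oldest unflushed sequence id picks the region whose old edits pin the oldest hlog:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class FlushChoiceSketch {
    static class Region {
        final String name;
        final long memstoreSize;
        final long oldestUnflushedSeqId;
        Region(String name, long memstoreSize, long oldestUnflushedSeqId) {
            this.name = name;
            this.memstoreSize = memstoreSize;
            this.oldestUnflushedSeqId = oldestUnflushedSeqId;
        }
    }

    // "biggest memstore" policy: the actively updated region always wins
    static Region bySize(List<Region> regions) {
        return regions.stream()
            .max(Comparator.comparingLong(r -> r.memstoreSize)).get();
    }

    // archiving-friendly policy: flush whatever pins the oldest WAL entry
    static Region byOldestEdit(List<Region> regions) {
        return regions.stream()
            .min(Comparator.comparingLong(r -> r.oldestUnflushedSeqId)).get();
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
            new Region("active", 100_000_000L, 900L), // big memstore, recent edits
            new Region("inactive", 1_000L, 10L));     // tiny memstore, old edits
        System.out.println(bySize(regions).name);       // the inactive region starves
        System.out.println(byOldestEdit(regions).name); // the pinned hlog gets released
    }
}
```

As the resolution below notes, HBase already schedules the proper regions to flush after each log roll; the sketch only illustrates why a size-only policy would be a problem.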
[jira] [Resolved] (HBASE-6745) Always flush region based on memstore size could hold hlog files from archiving
[ https://issues.apache.org/jira/browse/HBASE-6745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang resolved HBASE-6745. Resolution: Invalid Looked into the code, which looks fine to me. After each log roll, proper regions are scheduled to flush automatically. Biggest region is flushed only under global memory pressure. Always flush region based on memstore size could hold hlog files from archiving --- Key: HBASE-6745 URL: https://issues.apache.org/jira/browse/HBASE-6745 Project: HBase Issue Type: Bug Components: replication, wal Reporter: Jimmy Xiang Assignee: Jimmy Xiang -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452148#comment-13452148 ] ramkrishna.s.vasudevan commented on HBASE-6698: --- I tried running all the failed testcases. All of them passed. {code} 2012-09-10 22:38:21,013 INFO [main] hbase.HBaseTestingUtility(288): Created new mini-cluster data directory: D:\HBase\Trunk\hbase-server\target\test-data\1b34c2bc-9ae2-4c2c-a17e-a7a7ef73b1b6\dfscluster_1f11df5c-39e2-47e7-bf7b-634dbebf785a 2012-09-10 22:38:21,540 INFO [main] zookeeper.MiniZooKeeperCluster(196): Started MiniZK Cluster and connect 1 ZK server on client port: 50620 2012-09-10 22:38:21,549 ERROR [main] hbase.ResourceChecker(129): Bad configuration: the operating systems file handles maximum is 0 our is 1024 2012-09-10 22:38:21,971 INFO [main] hbase.ResourceChecker(144): before master.TestAssignmentManager#testShutdownHandler: 10 threads, 0 file descriptors 0 connections, 2012-09-10 22:38:22,371 DEBUG [main] zookeeper.ZKUtil(102): mockedServer opening connection to ZooKeeper with ensemble (localhost:50620) 2012-09-10 22:38:22,435 INFO [main] zookeeper.RecoverableZooKeeper(101): The identifier of this process is 4624@Ram 2012-09-10 22:38:22,569 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(261): mockedServer Received ZooKeeper Event, type=None, state=SyncConnected, path=null 2012-09-10 22:38:22,580 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher(338): mockedServer-0x139b127390f connected 2012-09-10 22:38:24,517 DEBUG [main] executor.ExecutorService(132): Starting executor service name=MASTER_OPEN_REGION-testShutdownHandler, corePoolSize=3, maxPoolSize=3 2012-09-10 22:38:24,518 DEBUG [main] executor.ExecutorService(132): Starting executor service name=MASTER_CLOSE_REGION-testShutdownHandler, corePoolSize=3, maxPoolSize=3 2012-09-10 22:38:24,518 DEBUG [main] executor.ExecutorService(132): Starting executor service name=MASTER_SERVER_OPERATIONS-testShutdownHandler, 
corePoolSize=3, maxPoolSize=3 2012-09-10 22:38:24,518 DEBUG [main] executor.ExecutorService(132): Starting executor service name=MASTER_META_SERVER_OPERATIONS-testShutdownHandler, corePoolSize=3, maxPoolSize=3 2012-09-10 22:38:26,200 INFO [main] handler.ServerShutdownHandler(181): Skipping log splitting for example.org,1234,5678 2012-09-10 22:38:26,461 DEBUG [main] client.ClientScanner(94): Creating scanner over .META. starting at key '' 2012-09-10 22:38:26,461 DEBUG [main] client.ClientScanner(205): Advancing internal scanner to startKey at '' 2012-09-10 22:38:26,781 DEBUG [main] client.ClientScanner(192): Finished with scanning at {NAME = 't,,1347296900866.db9424ce7e14acb58b9420b098b996ea.', STARTKEY = '', ENDKEY = '', ENCODED = db9424ce7e14acb58b9420b098b996ea,} 2012-09-10 22:38:26,801 INFO [main] handler.ServerShutdownHandler(282): Reassigning 1 region(s) that example.org,1234,5678 was carrying (skipping 0 regions(s) that are already in transition) 2012-09-10 22:38:26,801 INFO [main] handler.ServerShutdownHandler(378): The table t was deleted. Hence not proceeding. 
2012-09-10 22:38:26,801 INFO [main] master.AssignmentManager(1372): Quickly assigning 0 region(s) across 2 server(s) 2012-09-10 22:38:26,801 INFO [main] master.AssignmentManager(1377): Failed getting bulk plan, assigning region singly 2012-09-10 22:38:26,801 INFO [main] handler.ServerShutdownHandler(359): Finished processing of shutdown of example.org,1234,5678 2012-09-10 22:38:26,801 DEBUG [main] zookeeper.ZKAssign(538): mockedServer-0x139b127390f Deleting any existing unassigned nodes 2012-09-10 22:38:26,832 DEBUG [main] zookeeper.ZKAssign(538): mockedServer-0x139b127390f Deleting any existing unassigned nodes 2012-09-10 22:38:26,853 INFO [main] hbase.ResourceChecker(144): after master.TestAssignmentManager#testShutdownHandler: 12 threads (was 10), 0 file descriptors 1 connections, -thread leak?- 2012-09-10 22:38:27,863 INFO [main] zookeeper.MiniZooKeeperCluster(238): Shutdown MiniZK cluster with all ZK servers {code} Also, I could not find the error that occurred while running testShutDownHandler. Is the code the updated one? Am I missing something here? Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation -- Key: HBASE-6698 URL: https://issues.apache.org/jira/browse/HBASE-6698 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698.patch
[jira] [Commented] (HBASE-6725) Check and Put can fail when using locks
[ https://issues.apache.org/jira/browse/HBASE-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452235#comment-13452235 ] Nicolas Thiébaud commented on HBASE-6725: - The bug comes from the fact that checkAndMutate uses getLock(lockId, get.getRow(), true) and reuses the provided lock. Since several client threads use the same lock, many are allowed to mutate. I believe there are 2 solutions: a. Use some sort of internal lock, which requires a specific lock set for check-and-mutate. b. Consider CAP with locks as bad usage of locks and deprecate the feature. I'm willing to contribute a patch, but I'm not sure how to go on from here. Check and Put can fail when using locks --- Key: HBASE-6725 URL: https://issues.apache.org/jira/browse/HBASE-6725 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.4 Environment: tested on mac os x Reporter: Nicolas Thiébaud Attachments: CAPwithLocks.zip, TestCase_HBASE_6725.patch When multiple threads race using CAP with a lock on the same row, several instances may be allowed to update the cell with the new value (although the expected value is different). If all threads race with a wrong expected value and a lock, none will be able to update. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
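For option (a), the essential property is that the check and the mutation execute as one atomic step guarded internally, not by a lock the caller already holds and shares with other threads. A minimal sketch of that property (illustrative only, using an AtomicReference in place of HBase's row locks and cells):

```java
import java.util.concurrent.atomic.AtomicReference;

public class CheckAndPutSketch {
    private final AtomicReference<String> cell = new AtomicReference<>("old");

    // The check and the write happen as a single atomic step, guarded
    // internally, so no caller-provided lock can let two threads through.
    boolean checkAndPut(String expected, String update) {
        return cell.compareAndSet(expected, update);
    }

    String get() { return cell.get(); }

    public static void main(String[] args) {
        CheckAndPutSketch row = new CheckAndPutSketch();
        System.out.println(row.checkAndPut("old", "a")); // true: first CAS wins
        System.out.println(row.checkAndPut("old", "b")); // false: value is now "a"
        System.out.println(row.get());
    }
}
```

However many threads race on checkAndPut("old", ...), exactly one succeeds, which is the behavior the bug report says is violated when the shared row lock is reused.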
[jira] [Commented] (HBASE-6725) Check and Put can fail when using locks
[ https://issues.apache.org/jira/browse/HBASE-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452246#comment-13452246 ] Ted Yu commented on HBASE-6725: --- @Nicolas: Thanks for the finding. For point b, can you write a summary on the dev@ list to poll people's opinion? Check and Put can fail when using locks --- Key: HBASE-6725 URL: https://issues.apache.org/jira/browse/HBASE-6725 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.4 Environment: tested on mac os x Reporter: Nicolas Thiébaud Attachments: CAPwithLocks.zip, TestCase_HBASE_6725.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452248#comment-13452248 ] stack commented on HBASE-6698: -- I ran the tests individually and they passed for me. Let me retry the patch. Rather than '+Delete delete = new Delete(new byte[0]);', you could pass HConstants.EMPTY_BYTE_ARRAY (I could add that on commit)... that's minor. Let me do a more extensive review. This patch is great. It could be too good to be true. I just want to check Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation -- Key: HBASE-6698 URL: https://issues.apache.org/jira/browse/HBASE-6698 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698.patch Currently the checkAndPut and checkAndDelete APIs internally call internalPut and internalDelete. Maybe we can just call doMiniBatchMutation only. This will help in the future: if we have some hooks and the CP handles certain cases in doMiniBatchMutation, the same can be done while doing a put through checkAndPut or a delete through checkAndDelete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Attachment: HBASE-6698_6.patch Retry
[jira] [Commented] (HBASE-6441) MasterFS doesn't set scheme for internal FileSystem
[ https://issues.apache.org/jira/browse/HBASE-6441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452259#comment-13452259 ] Jesse Yates commented on HBASE-6441: I'm good with it going into 0.94.3 - I don't think this has been fixed yet (though there was some work Hsieh was doing around this, IIRC) MasterFS doesn't set scheme for internal FileSystem --- Key: HBASE-6441 URL: https://issues.apache.org/jira/browse/HBASE-6441 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.96.0, 0.94.2 Reporter: Jesse Yates Assignee: Jesse Yates Fix For: 0.96.0 Attachments: java_HBASE-6441_v0.patch FSUtils.getRootDir() just takes a configuration object, which is used to: 1) Get the name of the root directory 2) Create a filesystem (based on the configured scheme) 3) Qualify the root onto the filesystem However, the FileSystem from the master filesystem won't generate the correctly qualified root directory under hadoop-2.0 (though it works fine on hadoop-1.0). Seems to be an issue with the configuration parameters. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
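The three steps the description lists (read the root directory name, derive the filesystem from the configured scheme, qualify the root against it) can be sketched with plain `java.net.URI` handling; this is an illustrative stand-in, not the Hadoop `FileSystem` API.

```java
// Sketch of "qualifying" a root path: if the path already carries a scheme,
// keep it; otherwise borrow scheme and authority from the configured default
// filesystem. Plain URI arithmetic, not FSUtils/Hadoop code.
import java.net.URI;

public class QualifyRoot {
    static URI qualify(String rootDir, String defaultFsUri) {
        URI root = URI.create(rootDir);
        if (root.getScheme() != null) {
            return root; // already fully qualified
        }
        URI fs = URI.create(defaultFsUri);
        // missing scheme/authority come from the configured filesystem
        return URI.create(fs.getScheme() + "://" + fs.getAuthority() + root.getPath());
    }
}
```

The bug class described above amounts to step 2 or 3 silently producing an unqualified (scheme-less) root, which later comparisons against qualified paths then fail.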
[jira] [Commented] (HBASE-6431) Some FilterList Constructors break addFilter
[ https://issues.apache.org/jira/browse/HBASE-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452273#comment-13452273 ] Alex Newman commented on HBASE-6431: Fine by me Some FilterList Constructors break addFilter Key: HBASE-6431 URL: https://issues.apache.org/jira/browse/HBASE-6431 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.92.1, 0.94.0 Reporter: Alex Newman Assignee: Alex Newman Priority: Minor Fix For: 0.96.0 Attachments: 0001-HBASE-6431.-Some-FilterList-Constructors-break-addFi.patch, 6431-v2.txt Some of the constructors for FilterList set the internal list of filters to list types which don't support the add operation. As a result, FilterList(final List<Filter> rowFilters), FilterList(final Filter... rowFilters), FilterList(final Operator operator, final List<Filter> rowFilters), and FilterList(final Operator operator, final Filter... rowFilters) may init private List<Filter> filters = new ArrayList<Filter>(); incorrectly.
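The failure mode described here is a common Java pitfall and can be reproduced outside HBase: `Arrays.asList` returns a fixed-size list, so storing it directly makes a later `add()` throw `UnsupportedOperationException`; copying into an `ArrayList` is the usual fix. The sketch below is illustrative, not the FilterList code itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Demonstrates the bug class: a constructor that stores Arrays.asList(...)
// directly leaves the internal list fixed-size, so addFilter-style methods
// fail at runtime. Copying into a growable ArrayList restores add().
public class FixedSizeListBug {
    static List<String> broken(String... items) {
        return Arrays.asList(items); // fixed-size view: add() will throw
    }

    static List<String> fixed(String... items) {
        return new ArrayList<>(Arrays.asList(items)); // growable copy
    }
}
```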
[jira] [Updated] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6658: -- Attachment: HBASE-6658-v4.patch Hmm...builds for me with -Dhadoop.profile=2.0. Attaching same patch rebased against trunk. Rename WritableByteArrayComparable to something not mentioning Writable --- Key: HBASE-6658 URL: https://issues.apache.org/jira/browse/HBASE-6658 Project: HBase Issue Type: Bug Reporter: Gregory Chanan Assignee: Gregory Chanan Priority: Minor Fix For: 0.96.0 Attachments: HBASE-6658.patch, HBASE-6658-v3.patch, HBASE-6658-v4.patch After HBASE-6477, WritableByteArrayComparable will no longer be Writable, so should be renamed. Current idea is ByteArrayComparator (since all the derived classes are *Comparator not *Comparable), but I'm open to suggestions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452325#comment-13452325 ] Hadoop QA commented on HBASE-6698: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544495/HBASE-6698_6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2838//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2838//console This message is automatically generated.
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Attachment: HBASE-6698_6.patch
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Patch Available (was: Open) try again...
[jira] [Closed] (HBASE-6431) Some FilterList Constructors break addFilter
[ https://issues.apache.org/jira/browse/HBASE-6431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack closed HBASE-6431. Committed a good while back. Resolving.
[jira] [Commented] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452359#comment-13452359 ] Gregory Chanan commented on HBASE-6658: --- https://builds.apache.org/job/PreCommit-HBASE-Build/2839/ Failed again, I'll take a closer look.
[jira] [Updated] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6658: -- Attachment: HBASE-6658-v5.patch Forgot to add a file, this should work.
[jira] [Updated] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6658: -- Attachment: HBASE-6658-v6.patch Looks like the previous patch had slightly old generated proto files, reattaching.
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452405#comment-13452405 ] Hadoop QA commented on HBASE-6698: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544509/HBASE-6698_6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2840//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2840//console This message is automatically generated.
[jira] [Updated] (HBASE-6747) Enable server side limit default on the size of results returned (to prevent OOMs)
[ https://issues.apache.org/jira/browse/HBASE-6747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kannan Muthukkaruppan updated HBASE-6747: - Summary: Enable server side limit default on the size of results returned (to prevent OOMs) (was: Enable server side limit on the size of results returned.) Enable server side limit default on the size of results returned (to prevent OOMs) -- Key: HBASE-6747 URL: https://issues.apache.org/jira/browse/HBASE-6747 Project: HBase Issue Type: Improvement Reporter: Amitanand Aiyer Priority: Minor We have seen a couple of situations where clients fetching a large row can cause the whole server to go down, due to large GC pauses or out-of-memory errors. This should be easily avoidable if the client can use a Scan instead of a Get, and/or use batching to reduce the size, but it seems difficult to enforce this. Moreover, once in a while there may be genuine outliers/bad clients that cause such large requests. We need to handle such situations gracefully, and not have the RS reboot for things that can be prevented. The proposal here is to enforce a maximum response size at the server end, so we are not at the mercy of the client's good behavior to keep the server running. We already log large responses. But if the response is too large, it just kills the server. We don't have it logged, and the only way to find out is to go through the heap dump. More importantly, our availability/reliability numbers will go down because the whole region/regionserver fails instead of just the single bad request. I think it will be useful for the server to maintain a maximum request size that it will serve. Something large, like 2-3 GB, so normal operations do not need to be bothered. If a single get/scan operation exceeds the size, we will just throw an exception for the request.
This will a) stop the RS from going on until it hits out of memory, and b) give the clients, and us, a cleaner way to see what the problem is.
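The proposal above can be sketched as follows: accumulate the response size while collecting cells and fail only the offending request once a configured cap is crossed, instead of letting the whole server run out of memory. The names here (`maxResponseSize`, `ResponseTooLargeException`, `collect`) are hypothetical, not HBase API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: enforce a per-request response-size cap server-side. One oversized
// get/scan fails with a clear exception; the server itself keeps running.
public class ResponseSizeLimit {
    static class ResponseTooLargeException extends RuntimeException {
        ResponseTooLargeException(long size, long max) {
            super("response " + size + " bytes exceeds cap " + max);
        }
    }

    static List<byte[]> collect(List<byte[]> cells, long maxResponseSize) {
        long total = 0;
        List<byte[]> out = new ArrayList<>();
        for (byte[] cell : cells) {
            total += cell.length;
            if (total > maxResponseSize) {
                // fail only this request, with a loggable reason
                throw new ResponseTooLargeException(total, maxResponseSize);
            }
            out.add(cell);
        }
        return out;
    }
}
```

This matches the two benefits the description names: the server never accumulates an unbounded result, and the failure is an explicit, loggable exception rather than a heap dump.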
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452423#comment-13452423 ] stack commented on HBASE-6698: -- The first rerun above OOME'd. The second shows no obvious hang. Retry again.
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Attachment: HBASE-6698_6.patch
[jira] [Updated] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6698: - Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452432#comment-13452432 ] stack commented on HBASE-6698: -- I took a look at the first change: {code} -prepareDelete(delete); -internalDelete(delete, delete.getClusterId(), writeToWAL); +doBatchMutate(delete, lid); {code} If I look at doBatchMutate, it is missing special handling that prepareDelete does: e.g. the piece in prepareDelete where, if no column family is specified, we set a special cell with the current timestamp for each column family in the HTableDescriptor. My worry is that corner cases are not covered by this mass replace. Please convince me it's just my bad review not catching them. Thanks Priyadarshini.
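The corner case stack points at can be sketched abstractly: a Delete that names no column family should expand to a per-family delete marker at the current timestamp for every family in the table descriptor. This is a plain-Java stand-in, not the HRegion implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of the prepareDelete corner case: when the caller specified no
// families, expand the delete to cover every family in the table
// descriptor, each marked at the current timestamp.
public class PrepareDeleteSketch {
    static Map<String, Long> expandFamilies(Set<String> requested,
                                            Set<String> tableFamilies,
                                            long now) {
        Map<String, Long> markers = new HashMap<>();
        Set<String> targets = requested.isEmpty() ? tableFamilies : requested;
        for (String family : targets) {
            markers.put(family, now); // delete-family marker at current ts
        }
        return markers;
    }
}
```

The review concern, restated: if doBatchMutate is called without this expansion step, a family-less Delete silently deletes nothing instead of the whole row.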
[jira] [Commented] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452433#comment-13452433 ] Hadoop QA commented on HBASE-6658: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544520/HBASE-6658-v6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests. Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2841//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2841//console This message is automatically generated.
[jira] [Commented] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452440#comment-13452440 ] stack commented on HBASE-6658: -- +1 on patch. Thanks for the cleanup.
[jira] [Commented] (HBASE-6658) Rename WritableByteArrayComparable to something not mentioning Writable
[ https://issues.apache.org/jira/browse/HBASE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452458#comment-13452458 ] Hadoop QA commented on HBASE-6658: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544520/HBASE-6658-v6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 6 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2842//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2842//console This message is automatically generated.
[jira] [Commented] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452467#comment-13452467 ] stack commented on HBASE-6752: --
bq. specific exception returned to the client when we cannot serve
Who would return this? Not the server that just failed? Or is it during recovery? The region will be assigned its new location and meta gets updated w/ the new location, only the region is not fully online because it's still recovering? Or is this when the region is moved?
bq. Priority is given to region server holding the new regions
What does this mean? What kind of priority? I like being able to take writes sooner.

On region server failure, serve writes and timeranged reads during the log split
Key: HBASE-6752 URL: https://issues.apache.org/jira/browse/HBASE-6752 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.96.0 Reporter: nkeywal Priority: Minor

Opening for write on failure would mean:
- Assign the region to a new regionserver. It marks the region as recovering
-- a specific exception is returned to the client when we cannot serve
-- lets clients know where they stand; the exception can include some time information (failure started on: ...)
-- lets them go immediately to the right regionserver, instead of retrying or calling the region holding meta to get the new address = saves network calls and lowers the load on meta
- Do the split as today. Priority is given to the region server holding the new regions
-- helps share the load balancing code: the split is done by a region server considered available for new regions
-- helps locality (the recovered edits are available on the region server) = lower network usage
- When the split is finished, we're done, as of today
- While the split is progressing, the region server can
-- serve writes
--- useful for all applications that need to write but not read immediately: anything that logs events to analyze them later; opentsdb is a perfect example
-- serve reads if they have a compatible time range; for heavily used tables this could help, because:
--- we can expect to have only a few minutes of data (as it's loaded)
--- the heaviest queries often accept a few (or more) minutes of delay

Some what-ifs:
1) The split fails = retry until it works, as today; just that we serve writes. We need to know (as today) that the region has not recovered if we fail again.
2) The regionserver fails during the split = as 1, and as of today.
3) The regionserver fails after the split but before the state change to fully available = new assign, more logs to split (the ones already done and the new ones).
4) The assignment fails = retry until it works, as today.
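The recovering state with a client-facing exception carrying timing hints could look roughly like the sketch below. All names here are hypothetical, invented for illustration; this is not the HBase API:

```java
import java.io.IOException;

// Hypothetical sketch of the proposal, not HBase API: a region marked as
// recovering rejects incompatible reads with an exception that carries
// timing hints, so the client can back off intelligently.
public class RegionRecoveringSketch {
    static class RegionRecoveringException extends IOException {
        final long failureStartedMs;  // the "failure started on: ..." info
        final long retryAfterMs;      // e.g. "come back in 20 seconds"
        RegionRecoveringException(long failureStartedMs, long retryAfterMs) {
            super("region recovering; retry in " + retryAfterMs + " ms");
            this.failureStartedMs = failureStartedMs;
            this.retryAfterMs = retryAfterMs;
        }
    }

    // Client side: sleep for the server-supplied hint instead of following
    // a blind retry schedule, saving calls to meta and to the dead server.
    static long backoffMs(RegionRecoveringException e) {
        return Math.max(e.retryAfterMs, 0L);
    }

    public static void main(String[] args) {
        RegionRecoveringException e =
            new RegionRecoveringException(System.currentTimeMillis(), 20_000L);
        System.out.println(backoffMs(e)); // prints 20000
    }
}
```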
[jira] [Commented] (HBASE-6427) Pluggable compaction and scan policies via coprocessors
[ https://issues.apache.org/jira/browse/HBASE-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452487#comment-13452487 ] Lars Hofhansl commented on HBASE-6427: -- Not sure. BaseRegionObserver is public, whereas KeyValueScanner and InternalScanner are not. The fact that they are in the same class file does not matter for the annotation. Right? Pluggable compaction and scan policies via coprocessors --- Key: HBASE-6427 URL: https://issues.apache.org/jira/browse/HBASE-6427 Project: HBase Issue Type: New Feature Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.96.0, 0.94.2 Attachments: 6427-0.94-addendum.txt, 6427-0.94.txt, 6427-notReady.txt, 6427-v10.txt, 6427-v1.txt, 6427-v2.txt, 6427-v3.txt, 6427-v4.txt, 6427-v5.txt, 6427-v7.txt When implementing higher level stores on top of HBase it is necessary to allow dynamic control over how long KVs must be kept around. Semi-static config options for ColumnFamilies (# of versions or TTL) are not sufficient. This can be done with a few additional coprocessor hooks, or by making Store.ScanInfo pluggable. Was: The simplest way to achieve this is to have a pluggable class to determine the smallestReadpoint for Region. That way outside code can control what KVs to retain. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
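The idea of making Store.ScanInfo pluggable can be sketched as follows. This illustrates only the pattern, not the HBase coprocessor API; the interface and method names are invented:

```java
// Illustrative sketch only, not the HBase API: the per-family retention
// settings become an object a hook can replace per operation, instead of
// semi-static configuration (# of versions, TTL).
public class ScanPolicySketch {
    static final class ScanInfo {
        final long ttlMs;
        final int maxVersions;
        ScanInfo(long ttlMs, int maxVersions) {
            this.ttlMs = ttlMs;
            this.maxVersions = maxVersions;
        }
    }

    // A pluggable policy: given the configured ScanInfo, return the one to
    // use for this compaction (a coprocessor hook could expose this).
    interface CompactionScanPolicy {
        ScanInfo scanInfoForCompaction(ScanInfo configured);
    }

    public static void main(String[] args) {
        ScanInfo configured = new ScanInfo(86_400_000L, 1); // 1 day, 1 version
        // Example policy: a higher-level store keeps everything during
        // compaction and applies its own retention logic later.
        CompactionScanPolicy keepAll =
            cfg -> new ScanInfo(Long.MAX_VALUE, Integer.MAX_VALUE);
        System.out.println(keepAll.scanInfoForCompaction(configured).maxVersions);
    }
}
```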
[jira] [Commented] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452486#comment-13452486 ] nkeywal commented on HBASE-6752: --
bq. Who would return this? Not the server that just failed?
If we reassign immediately, the client will go to the new regionserver. So the region server will be able to tell it a real status (for example, on reads, we can estimate the recovery time left and the regionserver can say: come back in 20 seconds for this region).
bq. What does this mean? What kinda of priority?
Today, the split is performed by any available RS. If we preassign the regions, the split can be done by the regionserver which owns some of the data we're expecting to find in the hlog file...
[jira] [Created] (HBASE-6754) Import .META. table exported from 0.94
stack created HBASE-6754: Summary: Import .META. table exported from 0.94 Key: HBASE-6754 URL: https://issues.apache.org/jira/browse/HBASE-6754 Project: HBase Issue Type: New Feature Reporter: stack This issue is a copy of HBASE-6650 except it is not a subtask of HBASE-5305, the protobuffing issue. The below description is copied from HBASE-6650. HBASE-6052 converts .META. and ROOT table content to protobuf. This JIRA allows .META. table exported from 0.94 (see HBASE-3271) to be imported into live cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6650) Import .META. table exported from 0.94
[ https://issues.apache.org/jira/browse/HBASE-6650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6650. -- Resolution: Duplicate Resolving as duplicate of a new issue HBASE-6754 (I have to do this to get this issue out from under hbase-5305 as a subtask) Import .META. table exported from 0.94 -- Key: HBASE-6650 URL: https://issues.apache.org/jira/browse/HBASE-6650 Project: HBase Issue Type: Sub-task Reporter: Ted Yu HBASE-6052 converts .META. and ROOT table content to protobuf. This JIRA allows .META. table exported from 0.94 (see HBASE-3271) to be imported into live cluster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6713) Stopping META/ROOT RS may take 50mins when some region is splitting
[ https://issues.apache.org/jira/browse/HBASE-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6713: - Fix Version/s: (was: 0.94.3) 0.94.2
Stopping META/ROOT RS may take 50mins when some region is splitting
Key: HBASE-6713 URL: https://issues.apache.org/jira/browse/HBASE-6713 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.94.1 Reporter: chunhui shen Assignee: chunhui shen Fix For: 0.96.0, 0.94.2 Attachments: 6713.92-94, 6713v3.patch, HBASE-6713.patch, HBASE-6713v2.patch
When we stop the RS carrying ROOT/META while it is splitting some region, the whole stopping process may take 50 mins. The reason is:
1. The ROOT/META region is closed when stopping the regionserver.
2. The split transaction fails updating META and retries.
3. The retry num is 100, and the total time is about 50 mins by default; this configuration is set by HConnectionManager#setServerSideHConnectionRetries.
I think 50 mins is too long to be acceptable. My suggested solution is closing the meta-table regions after the compact/split thread is closed.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
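The 50-minute figure follows from the retry count. A back-of-envelope check, assuming an average pause of about 30 seconds per server-side retry (the 30 s is an assumption for illustration; the real schedule follows HBase's retry backoff table):

```java
// Back-of-envelope arithmetic for the ~50 min stall: 100 retries at an
// assumed average pause of ~30 s each. The 30 s figure is an assumption
// for illustration, not read from the HBase source.
public class RetryMath {
    static long totalMinutes(int retries, long avgPauseSeconds) {
        return retries * avgPauseSeconds / 60;
    }

    public static void main(String[] args) {
        System.out.println(totalMinutes(100, 30)); // prints 50
    }
}
```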
[jira] [Resolved] (HBASE-6713) Stopping META/ROOT RS may take 50mins when some region is splitting
[ https://issues.apache.org/jira/browse/HBASE-6713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl resolved HBASE-6713. -- Resolution: Fixed
[jira] [Commented] (HBASE-6165) Replication can overrun .META. scans on cluster re-start
[ https://issues.apache.org/jira/browse/HBASE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452491#comment-13452491 ] Jean-Daniel Cryans commented on HBASE-6165: --- [~whitingj], originally replication was using the normal handlers and was just deadlocking the clusters in a different way. ReplicationSink uses the HBase client, which can block for ungodly amounts of time, so it would fill up the handlers and the RS would stop serving requests. HBASE-6550 changed that a bit by setting low timeouts via replication-specific client-side configuration parameters (if it was using the normal client configurations it would also affect all the other clients). With HBASE-6165 it's even safer since replication is sandboxed. Replication can overrun .META. scans on cluster re-start Key: HBASE-6165 URL: https://issues.apache.org/jira/browse/HBASE-6165 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Himanshu Vashishtha Fix For: 0.96.0, 0.94.2 Attachments: 6165-v6.txt, HBase-6165-94-v1.patch, HBase-6165-94-v2.patch, HBase-6165-v1.patch, HBase-6165-v2.patch, HBase-6165-v3.patch, HBase-6165-v4.patch, HBase-6165-v5.patch When restarting a large set of regions on a reasonably small cluster, the replication from another cluster tied up every xceiver, meaning nothing could be onlined. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6398) Print a warning if there is no local datanode
[ https://issues.apache.org/jira/browse/HBASE-6398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452494#comment-13452494 ] Lars Hofhansl commented on HBASE-6398: -- Yeah, I added all of our folks a while back :) Print a warning if there is no local datanode - Key: HBASE-6398 URL: https://issues.apache.org/jira/browse/HBASE-6398 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Sameer Vaishampayan Labels: noob When starting up a RS HBase should print out a warning if there is no datanode locally. Lots of optimizations are only available if the data is machine local. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452496#comment-13452496 ] stack commented on HBASE-6752: -- Makes sense. Sounds great. How do we know what regionserver to give a log split to when the log has edits for all regions that were on a regionserver? You thinking we could give all regions on the crashed regionserver to a particular regionserver?
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452495#comment-13452495 ] Hadoop QA commented on HBASE-6698: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12544525/HBASE-6698_6.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2843//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2843//console This message is automatically generated. Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation -- Key: HBASE-6698 URL: https://issues.apache.org/jira/browse/HBASE-6698 Project: HBase Issue Type: Improvement Reporter: ramkrishna.s.vasudevan Fix For: 0.96.0 Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698.patch Currently the checkAndPut and checkAndDelete APIs internally call internalPut and internalDelete. Maybe we can just call doMiniBatchMutation only. This will help in the future: if we have some hooks and the CP handles certain cases in doMiniBatchMutation, the same can be done while doing a put through checkAndPut or a delete through checkAndDelete. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6066) some low hanging read path improvement ideas
[ https://issues.apache.org/jira/browse/HBASE-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452502#comment-13452502 ] Lars Hofhansl commented on HBASE-6066: -- Had a super brief look at the patch in phabricator. Looks fine; the issue I raised above is addressed by checking whether outResults is empty in RegionScannerImpl.next(...) and creating a temporary array if not. The change might introduce new performance issues, though. I'll comment on the diff.
some low hanging read path improvement ideas
Key: HBASE-6066 URL: https://issues.apache.org/jira/browse/HBASE-6066 Project: HBase Issue Type: Improvement Reporter: Kannan Muthukkaruppan Assignee: Michal Gregorczyk Priority: Critical Labels: noob Attachments: metric-stringbuilder-fix.patch
I was running some single threaded scan performance tests for a table with small sized rows that is fully cached. Some observations... We seem to be doing several wasteful iterations over and/or building of temporary lists.
1) One such is the following code in HRegionServer.next():
{code}
boolean moreRows = s.next(values, HRegion.METRIC_NEXTSIZE);
if (!values.isEmpty()) {
  for (KeyValue kv : values) { // wasteful in most cases
    currentScanResultSize += kv.heapSize();
  }
  results.add(new Result(values));
{code}
By default the maxScannerResultSize is Long.MAX_VALUE. In those cases, we can avoid the unnecessary iteration to compute currentScanResultSize.
2) An example of a wasteful temporary array is results in RegionScanner.next():
{code}
results.clear();
boolean returnResult = nextInternal(limit, metric);
outResults.addAll(results);
{code}
results then gets copied over to outResults via an addAll(). Not sure why we can not directly collect the results in outResults.
3) Another almost similar example of a wasteful array is results in StoreScanner.next(), which eventually also copies its results into outResults.
4) Reduce the overhead of the size metric maintained in StoreScanner.next():
{code}
if (metric != null) {
  HRegion.incrNumericMetric(this.metricNamePrefix + metric, copyKv.getLength());
}
results.add(copyKv);
{code}
A single call to next() might fetch a lot of KVs. We can first add up the size of those KVs in a local variable and then, in a finally clause, increment the metric in one shot, rather than updating AtomicLongs for each KV.
5) RegionScanner.next() calls a helper RegionScanner.next() on the same object. Both are synchronized methods. Synchronized methods calling nested synchronized methods on the same object are probably adding some small overhead. The inner next() calls isFilterDone(), which is also a synchronized method. We should factor the code to avoid these nested synchronized methods.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
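Item (4) can be sketched as follows. This is a minimal, self-contained illustration of batching the metric update, not the actual StoreScanner code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustration of item (4), not the actual StoreScanner code: accumulate
// KV sizes in a local variable and publish one atomic update per next()
// call, instead of hitting the shared AtomicLong once per KeyValue.
public class MetricBatchSketch {
    static final AtomicLong readSizeMetric = new AtomicLong();

    static void copyBatch(List<byte[]> kvs, List<byte[]> results) {
        long batchBytes = 0; // local accumulator, no contention
        try {
            for (byte[] kv : kvs) {
                batchBytes += kv.length;
                results.add(kv);
            }
        } finally {
            // one shared-counter update per batch, even on early exit
            readSizeMetric.addAndGet(batchBytes);
        }
    }

    public static void main(String[] args) {
        copyBatch(List.of(new byte[10], new byte[20]), new ArrayList<>());
        System.out.println(readSizeMetric.get()); // prints 30
    }
}
```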
[jira] [Commented] (HBASE-6725) Check and Put can fail when using locks
[ https://issues.apache.org/jira/browse/HBASE-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452513#comment-13452513 ] Nicolas ThiƩbaud commented on HBASE-6725: - Done. Check and Put can fail when using locks --- Key: HBASE-6725 URL: https://issues.apache.org/jira/browse/HBASE-6725 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.4 Environment: tested on mac os x Reporter: Nicolas ThiƩbaud Attachments: CAPwithLocks.zip, TestCase_HBASE_6725.patch When multiple threads race using checkAndPut (CAP) with a lock on the same row, several instances may be allowed to update the cell with the new value (although the expected value is different). If all threads race with a wrong expected value and a lock, none will be able to update. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6476) Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge equivalent
[ https://issues.apache.org/jira/browse/HBASE-6476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452516#comment-13452516 ] Lars Hofhansl commented on HBASE-6476: -- We can also add a new method to EnvEdge called WallClockTime (or something). That way all code will only refer to EnvEdge and will also be very explicit about whether wall clock time is needed (to time tests) or this is a controllable time. Replace all occurrances of System.currentTimeMillis() with EnvironmentEdge equivalent - Key: HBASE-6476 URL: https://issues.apache.org/jira/browse/HBASE-6476 Project: HBase Issue Type: Bug Reporter: Lars Hofhansl Assignee: Lars Hofhansl Priority: Minor Fix For: 0.96.0 Attachments: 6476.txt, 6476-v2.txt, 6476-v2.txt There are still some areas where System.currentTimeMillis() is used in HBase. In order to make all parts of the code base testable and (potentially) to be able to configure HBase's notion of time, this should generally be replaced with EnvironmentEdgeManager.currentTimeMillis(). How hard would it be to add a maven task that checks for that, so we do not introduce System.currentTimeMillis back in the future? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
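A minimal sketch of the pattern being discussed; the names are modeled on, but not copied from, the real EnvironmentEdge classes. Production code asks the injected edge for the time, and tests swap in a controllable edge:

```java
// Sketch of the EnvironmentEdge idea; the interface and method names are
// illustrative, modeled on the HBase classes but not copied from them.
public class EdgeSketch {
    interface EnvironmentEdge {
        long currentTimeMillis();
    }

    // The default edge delegates to the system clock.
    static class DefaultEdge implements EnvironmentEdge {
        @Override public long currentTimeMillis() { return System.currentTimeMillis(); }
    }

    // A test edge makes time fully controllable and deterministic.
    static class ManualEdge implements EnvironmentEdge {
        long now = 0;
        @Override public long currentTimeMillis() { return now; }
        void advance(long ms) { now += ms; }
    }

    public static void main(String[] args) {
        ManualEdge edge = new ManualEdge();
        edge.advance(42); // a test can move the clock forward at will
        System.out.println(edge.currentTimeMillis()); // prints 42
    }
}
```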
[jira] [Commented] (HBASE-3869) RegionServer metrics - add read and write byte-transfer statistics
[ https://issues.apache.org/jira/browse/HBASE-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452518#comment-13452518 ] Lars Hofhansl commented on HBASE-3869: -- There is a global metric that accumulates the size of all KVs read by a region server, so it's not what you meant, I guess. RegionServer metrics - add read and write byte-transfer statistics -- Key: HBASE-3869 URL: https://issues.apache.org/jira/browse/HBASE-3869 Project: HBase Issue Type: Improvement Reporter: Doug Meil Priority: Minor It would be beneficial to have the data transfer weight of reads and writes per region server. HBASE-3647 split out the read/write metric requests from the uber-request metric - which is great. But there isn't a notion of data transfer weight and this is why it's important: the read metrics are effectively RPC-based. Thus, with a scan caching of 500, there is 1 RPC call every 500 rows read (and 1 'read' metric increment). And this metric doesn't indicate how much data is being transferred (e.g., a read with 50 attributes will probably cost a lot more than a read with 5 attributes). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6533) [replication] replication will block if WAL compress set differently in master and slave configuration
[ https://issues.apache.org/jira/browse/HBASE-6533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452519#comment-13452519 ] Jean-Daniel Cryans commented on HBASE-6533: --- [~terry_zhang] your solution is not complete, see https://issues.apache.org/jira/browse/HBASE-5778?focusedCommentId=13253995&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13253995
[replication] replication will block if WAL compress set differently in master and slave configuration
Key: HBASE-6533 URL: https://issues.apache.org/jira/browse/HBASE-6533 Project: HBase Issue Type: Bug Components: replication Affects Versions: 0.94.0 Reporter: terry zhang Assignee: terry zhang Priority: Critical Fix For: 0.94.3 Attachments: hbase-6533.patch
As we know, in hbase 0.94.0 we have the configuration below:
<property>
  <name>hbase.regionserver.wal.enablecompression</name>
  <value>true</value>
</property>
If we enable it in the master cluster and disable it in the slave cluster, replication will not work. It will throw unwrapRemoteException again and again in the master cluster.
2012-08-09 12:49:55,892 WARN org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Can't replicate because of an error on the remote cluster:
java.io.IOException: IPC server unable to read call parameters: Error in readFields
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:635)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:365)
Caused by: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: Error in readFields
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:921)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:151)
    at $Proxy13.replicateLogEntries(Unknown Source)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.shipEdits(ReplicationSource.java:616)
    ... 1 more
This is because the slave cluster can not parse the hlog entry.
2012-08-09 14:46:05,891 WARN org.apache.hadoop.ipc.HBaseServer: Unable to read call parameters for client 10.232.98.89
java.io.IOException: Error in readFields
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:685)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:586)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:635)
    at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:125)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1292)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1207)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:735)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:524)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:499)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readFully(DataInputStream.java:180)
    at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:2254)
    at org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:146)
    at org.apache.hadoop.hbase.regionserver.wal.HLog$Entry.readFields(HLog.java:1767)
    at org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:682)
    ... 11 more
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see:
[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452520#comment-13452520 ] Lars Hofhansl commented on HBASE-6720:
--
Hmm... Maybe this is a non-issue now. I'm happy to close as Won't Fix.

Optionally limit number of regions balanced in each balancer run
Key: HBASE-6720
URL: https://issues.apache.org/jira/browse/HBASE-6720
Project: HBase
Issue Type: Bug
Reporter: Lars Hofhansl
Fix For: 0.96.0, 0.94.3
Attachments: 6720-0.96-v1.txt

See discussion on HBASE-3866
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452523#comment-13452523 ] Ted Yu commented on HBASE-6720:
--
That's fine.
[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452525#comment-13452525 ] Elliott Clark commented on HBASE-6720:
--
The above code won't fix the described issue: if the region moves happen quickly, the load balancer can still move a lot of regions.
[jira] [Updated] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6720:
--
Resolution: Won't Fix
Status: Resolved (was: Patch Available)

My bad. I thought this still needed fixing.
[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452544#comment-13452544 ] Elliott Clark commented on HBASE-6720:
--
[~lhofhansl] Sorry, what I wrote was unclear. I meant to say: the code Ted posted won't stop the load balancer from moving lots of regions if the region moves happen quickly. cutoffTime is some time far in the future assuming that things are going normally, and applying the plans is usually very quick, so all of the plans will be applied. If anything, I would say stopping the LoadBalancer after a certain number of regions is more useful than stopping it after a number of milliseconds: it's easier for users to know a good starting value that applies to their cluster. There's no way I could tell you right now how long a balance on our cluster takes when a RegionServer goes down, but I could tell you the max number of regions that I would want to move at a time.
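Elliott's suggestion above, capping the number of regions moved per balancer run instead of capping elapsed time, could be sketched as follows. This is an illustration only, not the attached patch; the names `capPlans` and `maxRegionsPerRun` are hypothetical:

```java
import java.util.List;

public class BalancerCap {
    // Hypothetical per-run cap on region moves: apply at most
    // maxRegionsPerRun plans now and leave the rest for the next run.
    public static <P> List<P> capPlans(List<P> plans, int maxRegionsPerRun) {
        if (plans.size() <= maxRegionsPerRun) {
            return plans;
        }
        // subList gives a view of the first maxRegionsPerRun plans.
        return plans.subList(0, maxRegionsPerRun);
    }
}
```

A region count is easier to tune than a time budget because it is independent of how fast individual moves complete.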
[jira] [Commented] (HBASE-6752) On region server failure, serve writes and timeranged reads during the log split
[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452549#comment-13452549 ] Kannan Muthukkaruppan commented on HBASE-6752:
--
There might be a bunch of nitty-gritties to be ironed out, but being able to take writes nearly all the time would be a very nice win. So a big +1 for exploring this effort. A few things that come to mind:
* We do want the old edits to come in the correct order of sequence ids (i.e. be considered older than the newer puts that arrive while the region is in recovery mode), correct? So we somehow need to cheaply find the correct sequence id to use for the new puts. It needs to be bigger than the sequence ids of all the edits for that region in the log files. So maybe all that's needed here is to open/recover the latest log file and scan it to find the last sequence id?
* Picking a winner among duplicates in two files relies on using the sequence id of the HFile as a tie-breaker. Therefore, today, compactions always pick a dense subrange of files ordered by sequence id. That is, if we have HFiles a, b, c, d, e sorted by sequence id, we might compact a,b,c or c,d,e but never, say, a,d,e. With this new scheme, we should take care that we don't violate this property. The old data should correctly be recovered into HFiles with the correct sequence ids, and even if newer data has been flushed before the recovery is complete, we shouldn't compact those newer files with older HFiles, given that some new files are supposed to come in between (after recovery).

On region server failure, serve writes and timeranged reads during the log split
Key: HBASE-6752
URL: https://issues.apache.org/jira/browse/HBASE-6752
Project: HBase
Issue Type: Improvement
Components: regionserver
Affects Versions: 0.96.0
Reporter: nkeywal
Priority: Minor

Opening for write on failure would mean:
- Assign the region to a new regionserver.
It marks the region as recovering:
-- a specific exception is returned to the client when we cannot serve
-- this allows clients to know where they stand; the exception can include some time information (failure started on: ...)
-- it allows them to go immediately to the right regionserver, instead of retrying or calling the region holding meta to get the new address => saves network calls, lowers the load on meta
- Do the split as today. Priority is given to the region server holding the new regions:
-- helps to share the load-balancing code: the split is done by a region server considered as available for new regions
-- helps locality (the recovered edits are available on the region server) => lowers network usage
- When the split is finished, we're done as of today.
- While the split is progressing, the region server can:
-- serve writes
--- useful for all applications that need to write but not read immediately: whatever logs events to analyze them later; opentsdb is a perfect example
-- serve reads if they have a compatible time range. For heavily used tables this could help, because:
--- we can expect to have only a few minutes of data (as it's loaded)
--- the heaviest queries often accept a few (or more) minutes of delay

Some what-ifs:
1) The split fails => retry until it works, as today. Just that we serve writes. We need to know (as today) that the region has not recovered if we fail again.
2) The regionserver fails during the split => as 1, and as of today.
3) The regionserver fails after the split but before the state change to fully available => new assign; more logs to split (the ones already done and the new ones).
4) The assignment fails => retry until it works, as today.
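Kannan's point about compactions — selections must form a dense subrange of the store files ordered by sequence id — can be expressed as a standalone validity check. The sketch below is illustrative only (the class and method names are made up, and the real logic lives in the store's compaction selection code):

```java
import java.util.List;

public class CompactionSelectionCheck {
    // Illustrative: given all store files' sequence ids sorted ascending,
    // verify a candidate selection is a contiguous run of that list
    // (e.g. {c,d,e} out of {a,b,c,d,e} is allowed; {a,d,e} is not).
    public static boolean isDenseSubrange(List<Long> allSeqIds, List<Long> selected) {
        if (selected.isEmpty()) return true;
        int start = allSeqIds.indexOf(selected.get(0));
        if (start < 0) return false;
        for (int i = 0; i < selected.size(); i++) {
            if (start + i >= allSeqIds.size()
                || !allSeqIds.get(start + i).equals(selected.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Violating this invariant would let the sequence-id tie-break pick the wrong winner among duplicate cells, which is exactly the hazard the comment warns about when recovered HFiles arrive after newer flushes.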
[jira] [Commented] (HBASE-6720) Optionally limit number of regions balanced in each balancer run
[ https://issues.apache.org/jira/browse/HBASE-6720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452551#comment-13452551 ] Lars Hofhansl commented on HBASE-6720:
--
Hey Elliott... We had comment crossing. :) OK, so I take it the patch is good after all, and we should apply it (with the move of the constant as Stack suggests). (I didn't spend much time actually looking at the code... I should probably do so.)
[jira] [Commented] (HBASE-6165) Replication can overrun .META. scans on cluster re-start
[ https://issues.apache.org/jira/browse/HBASE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452552#comment-13452552 ] Jeff Whiting commented on HBASE-6165: - @stack and @jdcryans Thanks for the explanation. I can see how it would deadlock on itself. I also found HBASE-3401 which talks about the deadlock. We patched our cdh4 cluster with HBASE-6724 and it has been running much smoother. Replication can overrun .META. scans on cluster re-start Key: HBASE-6165 URL: https://issues.apache.org/jira/browse/HBASE-6165 Project: HBase Issue Type: Bug Reporter: Elliott Clark Assignee: Himanshu Vashishtha Fix For: 0.96.0, 0.94.2 Attachments: 6165-v6.txt, HBase-6165-94-v1.patch, HBase-6165-94-v2.patch, HBase-6165-v1.patch, HBase-6165-v2.patch, HBase-6165-v3.patch, HBase-6165-v4.patch, HBase-6165-v5.patch When restarting a large set of regions on a reasonably small cluster the replication from another cluster tied up every xceiver meaning nothing could be onlined.
[jira] [Commented] (HBASE-6165) Replication can overrun .META. scans on cluster re-start
[ https://issues.apache.org/jira/browse/HBASE-6165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452557#comment-13452557 ] Himanshu Vashishtha commented on HBASE-6165: [~whitingj] Specifically, the replication-specific jira about deadlocking on normal handlers is HBASE-4280.
[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452581#comment-13452581 ] Jean-Daniel Cryans commented on HBASE-6649:
--
bq. This is because of multiple calls to reader.next within readAllEntriesToReplicateOrNextFile. If the second call (within the while loop) throws an exception (like EOFException), it basically destroys the work done up until then. Therefore, some rows would never be replicated.

The position in the log is updated in ZK only once the edits are replicated; hence, even if you fail on the second or hundredth edit, the next region server that will be in charge of that log will pick up where the previous RS was (even if that means re-reading some edits).

[0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
Key: HBASE-6649
URL: https://issues.apache.org/jira/browse/HBASE-6649
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.96.0, 0.92.3, 0.94.2
Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 6649-trunk.patch, 6649-trunk.patch, 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 #502 test - queueFailover [Jenkins].html

Have seen it twice in the recent past: http://bit.ly/MPCykB http://bit.ly/O79Dq7 .. Looking briefly at the logs hints at a pattern - in both the failed test instances, there was an RS crash while the test was running.
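The at-least-once behaviour J-D describes — the log position stored in ZK advances only after edits are shipped — can be modelled in a few lines. This is a toy sketch of the semantics, not the actual ReplicationSource code; all names are illustrative:

```java
public class ReplicationPositionSketch {
    // Illustrative model: the committed position stands in for the
    // znode where ReplicationSource records its progress.
    private long committedPosition = 0;

    // Ship a batch of edits ending at batchEndPosition; the position
    // is committed only on success, so a failure mid-batch means the
    // next reader re-reads (but never skips) those edits.
    public long shipBatch(long batchEndPosition, boolean shipSucceeded) {
        if (shipSucceeded) {
            committedPosition = batchEndPosition;
        }
        return committedPosition;
    }
}
```

Re-reading some edits after a failure is harmless here because replication applies edits idempotently at the same timestamps; losing edits would not be.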
[jira] [Updated] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
[ https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-6733:
--
Attachment: 6733-1.patch

Attached is a simple patch (for trunk) to take care of the first problem (described in the description of the jira). The patch resets the sleepMultiplier in every place where currentPath gets (re)assigned. Could we please commit this one? I am looking at the other issue, but it seems like it will take more time.

[0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
Key: HBASE-6733
URL: https://issues.apache.org/jira/browse/HBASE-6733
Project: HBase
Issue Type: Bug
Reporter: Devaraj Das
Fix For: 0.92.3
Attachments: 6733-1.patch

The failure is in TestReplication.queueFailover (fails due to unreplicated rows). I have come across two problems:
1. The sleepMultiplier is not properly reset when the currentPath is changed (in ReplicationSource.java).
2. The ReplicationExecutor sometimes removes files to replicate from the queue too early, resulting in the corresponding edits going missing. Here the problem is that the log-file length the replication executor finds is not the most up-to-date one, so it doesn't read anything from there; ultimately, when there is a log roll, the replication queue gets a new entry and the executor drops the old entry out of the queue.
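The first fix — reset the backoff multiplier whenever the source switches to a new log — can be sketched as below. This is a simplified model of the idea, not the 6733-1.patch code; the class and method names are made up:

```java
public class BackoffSketch {
    private String currentPath = null;
    private int sleepMultiplier = 1;

    // Called wherever the source (re)assigns the log it reads from;
    // a new path means stale backoff state must not carry over.
    public void setCurrentPath(String path) {
        if (!path.equals(currentPath)) {
            currentPath = path;
            sleepMultiplier = 1; // start backing off from scratch
        }
    }

    // Called when a read attempt finds nothing to replicate.
    public void onEmptyRead(int maxMultiplier) {
        if (sleepMultiplier < maxMultiplier) {
            sleepMultiplier++;
        }
    }

    public int getSleepMultiplier() {
        return sleepMultiplier;
    }
}
```

Without the reset, a multiplier inflated by a quiet old log makes the source sleep far too long on the fresh log, which is one way replicated rows arrive late enough to fail the test.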
[jira] [Commented] (HBASE-6241) HBaseCluster interface for interacting with the cluster from system tests
[ https://issues.apache.org/jira/browse/HBASE-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452610#comment-13452610 ] Enis Soztutar commented on HBASE-6241: -- Thanks Stack for the review. I'll put together smt to go over for tomorrow. Let's finalize the patch, and continue building upon it. HBaseCluster interface for interacting with the cluster from system tests -- Key: HBASE-6241 URL: https://issues.apache.org/jira/browse/HBASE-6241 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Enis Soztutar Attachments: HBASE-6241_v0.2.patch, HBASE-6241_v1.patch We need to abstract away the cluster interactions for system tests running on actual clusters. MiniHBaseCluster and RealHBaseCluster should both implement this interface, and system tests should work with both. I'll split Devaraj's patch in HBASE-6053 for the initial version. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6748) Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback
[ https://issues.apache.org/jira/browse/HBASE-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13452618#comment-13452618 ] Jieshan Bean commented on HBASE-6748:
--
From the logs, we can see a session-expired error (SESSIONEXPIRED) happened, so each deleteNode request was rejected. DeleteAsyncCallback does not handle the exceptions correctly.

Endless recursive of deleteNode happened in SplitLogManager#DeleteAsyncCallback
Key: HBASE-6748
URL: https://issues.apache.org/jira/browse/HBASE-6748
Project: HBase
Issue Type: Bug
Components: master
Affects Versions: 0.96.0, 0.94.1
Reporter: Jieshan Bean
Priority: Critical

You can easily understand the problem from the below logs:
{code}
[2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=3
[2012-09-01 11:41:02,062] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=2
[2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=1
[2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978] create rc =SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=0
[2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 393] failed to create task node /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
[2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager 353] Error splitting /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
[2012-09-01 11:41:02,063] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775807
[2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775806
[2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775805
[2012-09-01 11:41:02,064] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining retries=9223372036854775804
[2012-09-01 11:41:02,065] [WARN ] [MASTER_SERVER_OPERATIONS-xh03,2,1339549619270-1] [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052] delete rc=SESSIONEXPIRED for /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846 remaining
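The logs above show the delete callback decrementing from Long.MAX_VALUE (9223372036854775807) — effectively retrying forever on an error that can never clear. A hedged sketch of the kind of handling the comment implies: treat a session-expired result as terminal instead of retryable. The class and method are hypothetical; only the SESSIONEXPIRED code value follows ZooKeeper's KeeperException codes:

```java
public class DeleteRetrySketch {
    // ZooKeeper KeeperException.Code.SESSIONEXPIRED integer value.
    public static final int SESSIONEXPIRED = -112;

    // Illustrative: return the remaining retry budget, or -1 to signal
    // "stop retrying, the session is gone" so the caller can
    // re-establish the ZK session rather than loop endlessly.
    public static long handleDeleteResult(int rc, long remainingRetries) {
        if (rc == SESSIONEXPIRED) {
            return -1;
        }
        return remainingRetries - 1;
    }
}
```

Distinguishing recoverable result codes (e.g. connection loss) from fatal ones (session expiration) is the general fix: only the former deserve a decrement-and-retry.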
[jira] [Updated] (HBASE-6733) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-2]
[ https://issues.apache.org/jira/browse/HBASE-6733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HBASE-6733:
--
Assignee: Devaraj Das
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-6260) balancer state should be stored in ZK
[ https://issues.apache.org/jira/browse/HBASE-6260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gregory Chanan updated HBASE-6260:
--
Attachment: HBASE-6260.patch

* Attached HBASE-6260.patch
* I haven't run this through the test suite yet; let's see what HadoopQA says.
- Master reads balanceSwitch state from ZK
- Adds two test cases:
1. If the balancer is on and the master dies, and a new master takes over, the balancer is still running
2. If the balancer is off and the master dies, and a new master takes over, the balancer is still not running

balancer state should be stored in ZK
Key: HBASE-6260
URL: https://issues.apache.org/jira/browse/HBASE-6260
Project: HBase
Issue Type: Task
Components: master, zookeeper
Affects Versions: 0.96.0
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Blocker
Attachments: HBASE-6260.patch

See: https://issues.apache.org/jira/browse/HBASE-5953?focusedCommentId=13270200page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13270200
And: https://issues.apache.org/jira/browse/HBASE-5630?focusedCommentId=13399225page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13399225

In short, we need to move the balancer state to ZK so that it won't have to be restarted if the master dies.
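The behaviour the two test cases check — the balance switch surviving a master failover because it lives in ZK rather than in master memory — can be modelled with a toy stand-in for the znode. This is not the HBASE-6260.patch code; the map below merely simulates state that outlives any one master:

```java
import java.util.HashMap;
import java.util.Map;

public class BalancerStateSketch {
    // Toy stand-in for the ZK ensemble: survives master restarts.
    static final Map<String, Boolean> zk = new HashMap<>();
    static final String ZNODE = "/hbase/balancer"; // hypothetical path

    static class Master {
        boolean balancerOn;

        Master() {
            // On startup, read the persisted switch state instead of
            // defaulting to "on" in local memory.
            balancerOn = zk.getOrDefault(ZNODE, true);
        }

        void balanceSwitch(boolean on) {
            balancerOn = on;
            zk.put(ZNODE, on); // persist so a successor master sees it
        }
    }
}
```

Constructing a second Master after flipping the switch simulates failover: the new master inherits the persisted state rather than silently re-enabling the balancer.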