[jira] [Commented] (HBASE-12331) Shorten the mob snapshot unit tests

2014-11-07 Thread Li Jiajia (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201790#comment-14201790
 ] 

Li Jiajia commented on HBASE-12331:
---

Hi [~jmhsieh], HBASE-12332 includes some improvements to reading cloned cells, 
which cuts down the running time of these unit tests. We can write less data so the 
tests finish in less time; most of them (except the snapshot export case) now 
finish within 100 seconds. TestExportSnapshot still takes a long time, though, 
so do we still need to move these unit tests to integration tests? Please 
advise. Thanks.
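
For reference, the change is mostly about shrinking the data set each test writes; a 
minimal sketch of the idea, assuming the 0.98-era client API (the row count and the 
table/FAMILY/QUALIFIER fixtures are illustrative, not the actual test code):
{code}
// Sketch: write a small, fixed number of rows instead of a large data set so the
// snapshot tests set up and verify quickly. table/FAMILY/QUALIFIER come from the
// test fixture; the row count is an illustrative value.
int rowsPerRegion = 10; // was a much larger value
for (int i = 0; i < rowsPerRegion; i++) {
  Put put = new Put(Bytes.toBytes("row-" + i));
  put.add(FAMILY, QUALIFIER, Bytes.toBytes(i)); // 0.98-era Put API
  table.put(put);
}
table.flushCommits(); // HTableInterface in 0.98
{code}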

 Shorten the mob snapshot unit tests
 ---

 Key: HBASE-12331
 URL: https://issues.apache.org/jira/browse/HBASE-12331
 Project: HBase
  Issue Type: Sub-task
  Components: mob
Affects Versions: hbase-11339
Reporter: Jonathan Hsieh
 Fix For: hbase-11339

 Attachments: HBASE-12331-V1.diff


 The mob snapshot patch introduced a whole lot of tests that take a long time 
 to run and would be better as integration tests.
 {code}
 ---
  T E S T S
 ---
 Running 
 org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 394.803 sec - 
 in 
 org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClientWithRegionReplicas
 Running org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 212.377 sec - 
 in org.apache.hadoop.hbase.client.TestMobRestoreSnapshotFromClient
 Running 
 org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 49.463 sec - 
 in org.apache.hadoop.hbase.client.TestMobSnapshotFromClientWithRegionReplicas
 Running org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 46.724 sec - 
 in org.apache.hadoop.hbase.client.TestMobSnapshotFromClient
 Running org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 204.03 sec - 
 in org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient
 Running 
 org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
 Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 214.052 sec - 
 in 
 org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClientWithRegionReplicas
 Running org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
 Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 105.139 sec - 
 in org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence
 Running org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.42 sec - 
 in org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
 Running org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.136 sec - 
 in org.apache.hadoop.hbase.regionserver.TestDeleteMobTable
 Running org.apache.hadoop.hbase.regionserver.TestHMobStore
 Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.09 sec - in 
 org.apache.hadoop.hbase.regionserver.TestHMobStore
 Running org.apache.hadoop.hbase.regionserver.TestMobCompaction
 Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.629 sec - 
 in org.apache.hadoop.hbase.regionserver.TestMobCompaction
 Running org.apache.hadoop.hbase.mob.TestCachedMobFile
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.301 sec - 
 in org.apache.hadoop.hbase.mob.TestCachedMobFile
 Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
 Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.752 sec - 
 in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepJob
 Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.276 sec - 
 in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer
 Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.46 sec - 
 in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepMapper
 Running org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
 Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 173.05 sec - 
 in org.apache.hadoop.hbase.mob.mapreduce.TestMobSweeper
 Running org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.86 sec - 
 in org.apache.hadoop.hbase.mob.TestMobDataBlockEncoding
 Running 

[jira] [Updated] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated HBASE-12279:
-
Status: Open  (was: Patch Available)

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0, 0.98.0, 0.94.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch


 It turns out that the java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated HBASE-12279:
-
Attachment: HBASE-12279-2014-11-07-v2.patch

Because HBASE-12272 has been committed, this patch can now be created simply 
by running
{code}mvn generate-sources -Pcompile-thrift{code}
on a clean checkout of the source tree.

To ensure this doesn't break any existing tests I've attached the patch for the 
current master branch so Jenkins can do a verification run.

I don't think this patch file should be used for the actual commit; running the 
above command on each branch is much easier than applying the same patch to all of 
them.

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Niels Basjes (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niels Basjes updated HBASE-12279:
-
Status: Patch Available  (was: Open)

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0, 0.98.0, 0.94.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12440:
--
Attachment: HBASE-12440-0.98_v2.patch
HBASE-12440-branch-1.patch

Thanks for the review [~apurtell]
v2 removes the changes in ServerManager. I overthought the test before.
Also on branch-1, one of the tests started failing with v1 because the case where 
the table is disabled before SSH tries to do the assign was not handled. v2 
adds a check for that.
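
Roughly, the added guard looks like this (a sketch only; the table-state lookup 
differs between 0.98 and branch-1, so isTableDisabledOrDisabling() stands in for the 
real check):
{code}
// Sketch: before SSH (re)assigns a region carried by the dead server, skip it if
// its table was disabled or is being disabled in the meantime.
for (HRegionInfo hri : regionsFromDeadServer) {
  if (isTableDisabledOrDisabling(hri.getTable())) { // illustrative helper
    LOG.info("Skip assign of " + hri.getRegionNameAsString()
        + "; its table is disabled or disabling");
    continue;
  }
  assignmentManager.assign(hri, true);
}
{code}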

 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.1

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with zk assignment.
 On clean startup, while the master was doing a bulk assign, one of the region 
 servers died. The bulk assigner then tried to assign the region individually using 
 AssignCallable. AssignCallable does a forceStateToOffline() and skips 
 assigning because it wants the SSH to do the assignment:
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition:
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In zk-less assignment, both the bulk assigner (via AssignCallable) and the SSH 
 may try to assign the region. But since they go through a lock, only one will 
 succeed, so this doesn't seem to be an issue. 
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12440:
--
Component/s: Region Assignment

 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.1

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with zk assignment.
 On clean startup, while the master was doing a bulk assign, one of the region 
 servers died. The bulk assigner then tried to assign the region individually using 
 AssignCallable. AssignCallable does a forceStateToOffline() and skips 
 assigning because it wants the SSH to do the assignment:
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition:
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In zk-less assignment, both the bulk assigner (via AssignCallable) and the SSH 
 may try to assign the region. But since they go through a lock, only one will 
 succeed, so this doesn't seem to be an issue. 
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14201905#comment-14201905
 ] 

Hadoop QA commented on HBASE-12279:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12680115/HBASE-12279-2014-11-07-v2.patch
  against trunk revision .
  ATTACHMENT ID: 12680115

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 javac{color}.  The applied patch generated 108 javac compiler 
warnings (more than the trunk's current 102 warnings).

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+lastComparison = 
Boolean.valueOf(isSetAuthorizations()).compareTo(typedOther.isSetAuthorizations());
+lastComparison = 
Boolean.valueOf(isSetCellVisibility()).compareTo(typedOther.isSetCellVisibility());
+lastComparison = 
Boolean.valueOf(isSetAuthorizations()).compareTo(typedOther.isSetAuthorizations());

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11613//console

This message is automatically generated.

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the 

[jira] [Updated] (HBASE-10483) Provide API for retrieving info port when hbase.master.info.port is set to 0

2014-11-07 Thread Liu Shaohui (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liu Shaohui updated HBASE-10483:

Attachment: HBASE-10483-v3.diff

A new patch for the hbase master.
- Add an info port field to the master pb in zk
- Clients, RegionServers and Backup Masters get the active master's info port through 
MasterAddressTracker.

[~stack] [~tedyu] [~enis]
Please help to review this patch, thx.
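
As a rough sketch of the intended client-side usage (getMasterInfoPort() is the kind 
of accessor this patch proposes; treat the exact signature as hypothetical until the 
patch lands):
{code}
// Sketch: once the active master publishes its info port in its znode, clients can
// read it back via MasterAddressTracker instead of assuming a fixed
// hbase.master.info.port. zkWatcher and abortable come from the caller.
MasterAddressTracker tracker = new MasterAddressTracker(zkWatcher, abortable);
tracker.start();
ServerName master = tracker.getMasterAddress();
int infoPort = tracker.getMasterInfoPort(); // hypothetical accessor added by the patch
{code}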

 Provide API for retrieving info port when hbase.master.info.port is set to 0
 

 Key: HBASE-10483
 URL: https://issues.apache.org/jira/browse/HBASE-10483
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Liu Shaohui
 Attachments: HBASE-10483-trunk-v1.diff, HBASE-10483-trunk-v2.diff, 
 HBASE-10483-v3.diff


 When hbase.master.info.port is set to 0, the info port is dynamically determined.
 An API should be provided so that clients can retrieve the actual info port.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12443) After increasing the TTL value of a Hbase Table , table gets inaccessible. Scan table not working.

2014-11-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202175#comment-14202175
 ] 

Lars Hofhansl commented on HBASE-12443:
---

If this is not the same issue, feel free to reopen of course.

 After increasing the TTL value of a Hbase Table , table gets inaccessible. 
 Scan table not working.
 --

 Key: HBASE-12443
 URL: https://issues.apache.org/jira/browse/HBASE-12443
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Prabhu Joseph
Priority: Blocker
 Fix For: 2.0.0


 After increasing the TTL value of an HBase table, the table becomes inaccessible; 
 scanning the table does not work.
 A scan in the hbase shell throws:
 java.lang.IllegalStateException: Block index not loaded
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:131)
 at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3706)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409)
 at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
 STEPS to Reproduce:
  create 'debugger', {NAME => 'd', TTL => 15552000}
  put 'debugger','jdb','d:desc','Java debugger',1399699792000
  disable 'debugger'
 alter 'debugger', {NAME => 'd', TTL => 6912}
 enable 'debugger'
 scan 'debugger'
 Reason for the issue:
When inserting already expired data into the debugger table, hbase creates an 
 hfile with an empty data block and index block. On scanning the table, 
 StoreFile.Reader checks the TimeRangeTracker's maximum timestamp against the ttl 
 cutoff, and so it skips the empty file.
   But when the ttl is changed, that check no longer excludes the file, so 
 StoreFile.Reader tries to read the index block from the HFile, leading to 
 java.lang.IllegalStateException: Block index not loaded.
 SOLUTION:
 StoreFile.java 
boolean passesTimerangeFilter(Scan scan, long oldestUnexpiredTS) {
   if (timeRangeTracker == null) {
     return true;
   } else {
     return timeRangeTracker.includesTimeRange(scan.getTimeRange()) &&
         timeRangeTracker.getMaximumTimestamp() >= oldestUnexpiredTS;
   }
 }
 In the above method, by additionally checking (via the FixedFileTrailer block) 
 whether the hfile contains any entries, we can skip scanning the empty hfile.
 // changed code will solve the issue
  boolean passesTimerangeFilter(Scan scan, long oldestUnexpiredTS) {
   if (timeRangeTracker == null) {
     return true;
   } else {
     return timeRangeTracker.includesTimeRange(scan.getTimeRange()) &&
         timeRangeTracker.getMaximumTimestamp() >= oldestUnexpiredTS &&
         reader.getEntries() > 0;
   }
 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12443) After increasing the TTL value of a Hbase Table , table gets inaccessible. Scan table not working.

2014-11-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202508#comment-14202508
 ] 

Lars Hofhansl commented on HBASE-12443:
---

And if you have a patch please post it. :)


 After increasing the TTL value of a Hbase Table , table gets inaccessible. 
 Scan table not working.
 --

 Key: HBASE-12443
 URL: https://issues.apache.org/jira/browse/HBASE-12443
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Prabhu Joseph
Priority: Blocker
 Fix For: 2.0.0


 After increasing the TTL value of an HBase table, the table becomes inaccessible; 
 scanning the table does not work.
 A scan in the hbase shell throws:
 java.lang.IllegalStateException: Block index not loaded
 at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1.blockContainingKey(HFileReaderV1.java:181)
 at 
 org.apache.hadoop.hbase.io.hfile.HFileReaderV1$AbstractScannerV1.seekTo(HFileReaderV1.java:426)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:226)
 at 
 org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:145)
 at 
 org.apache.hadoop.hbase.regionserver.StoreScanner.init(StoreScanner.java:131)
 at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2015)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.init(HRegion.java:3706)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1761)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1753)
 at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1730)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2409)
 at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
 at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
 STEPS to Reproduce:
  create 'debugger', {NAME => 'd', TTL => 15552000}
  put 'debugger','jdb','d:desc','Java debugger',1399699792000
  disable 'debugger'
 alter 'debugger', {NAME => 'd', TTL => 6912}
 enable 'debugger'
 scan 'debugger'
 Reason for the issue:
When inserting already expired data into the debugger table, hbase creates an 
 hfile with an empty data block and index block. On scanning the table, 
 StoreFile.Reader checks the TimeRangeTracker's maximum timestamp against the ttl 
 cutoff, and so it skips the empty file.
   But when the ttl is changed, that check no longer excludes the file, so 
 StoreFile.Reader tries to read the index block from the HFile, leading to 
 java.lang.IllegalStateException: Block index not loaded.
 SOLUTION:
 StoreFile.java 
boolean passesTimerangeFilter(Scan scan, long oldestUnexpiredTS) {
   if (timeRangeTracker == null) {
     return true;
   } else {
     return timeRangeTracker.includesTimeRange(scan.getTimeRange()) &&
         timeRangeTracker.getMaximumTimestamp() >= oldestUnexpiredTS;
   }
 }
 In the above method, by additionally checking (via the FixedFileTrailer block) 
 whether the hfile contains any entries, we can skip scanning the empty hfile.
 // changed code will solve the issue
  boolean passesTimerangeFilter(Scan scan, long oldestUnexpiredTS) {
   if (timeRangeTracker == null) {
     return true;
   } else {
     return timeRangeTracker.includesTimeRange(scan.getTimeRange()) &&
         timeRangeTracker.getMaximumTimestamp() >= oldestUnexpiredTS &&
         reader.getEntries() > 0;
   }
 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12012) Improve cancellation for the scan RPCs

2014-11-07 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202514#comment-14202514
 ] 

Devaraj Das commented on HBASE-12012:
-

[~stack] this patch mostly refactors the scan code to use the (improved 
client-side) cancellation that is used in the other RPC parts of the code. The 
tests on HBASE-11564 were with this, yes. I need to update the patch a little to 
take into account some fixes I made after cluster testing; will post one soon.

 Improve cancellation for the scan RPCs
 --

 Key: HBASE-12012
 URL: https://issues.apache.org/jira/browse/HBASE-12012
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 2.0.0, 0.99.2

 Attachments: 12012-1.txt


 Similar to HBASE-11564 but for scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202520#comment-14202520
 ] 

Andrew Purtell commented on HBASE-12440:


The v2 patch lgtm. Let me check for a bit that it doesn't cause any tests to 
flap or anything like that, and then I will commit the latest 0.98 and branch-1 
patches on this issue. Thanks Virag.

 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.1

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with zk assignment.
 On clean startup, while the master was doing a bulk assign, one of the region 
 servers died. The bulk assigner then tried to assign the region individually using 
 AssignCallable. AssignCallable does a forceStateToOffline() and skips 
 assigning because it wants the SSH to do the assignment:
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition:
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In zk-less assignment, both the bulk assigner (via AssignCallable) and the SSH 
 may try to assign the region. But since they go through a lock, only one will 
 succeed, so this doesn't seem to be an issue. 
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12272) Generate Thrift code through maven

2014-11-07 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12272:
--
Fix Version/s: (was: 0.94.26)
   0.94.25

 Generate Thrift code through maven
 --

 Key: HBASE-12272
 URL: https://issues.apache.org/jira/browse/HBASE-12272
 Project: HBase
  Issue Type: Improvement
  Components: build, documentation, Thrift
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.25, 0.99.2

 Attachments: HBASE-12272-2014-10-15-v1-PREVIEW.patch, 
 HBASE-12272-2014-10-16-v2.patch, HBASE-12272-2014-10-16-v3.patch, 
 HBASE-12272-2014-10-16-v4.patch, HBASE-12272-2014-11-04-v5.patch, 
 HBASE-12272-2014-11-05-v5.patch, HBASE-12272-2014-11-05-v5.patch


 The generated thrift code is currently under source control, but the 
 instructions on rebuilding it are buried in package javadocs.
 We should have a simple maven command to rebuild them, similar to what we 
 have for protobufs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-5162) Basic client pushback mechanism

2014-11-07 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5162:
---
Status: Open  (was: Patch Available)

 Basic client pushback mechanism
 ---

 Key: HBASE-5162
 URL: https://issues.apache.org/jira/browse/HBASE-5162
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jesse Yates
 Fix For: 1.0.0

 Attachments: hbase-5162-trunk-v0.patch, hbase-5162-trunk-v1.patch, 
 hbase-5162-trunk-v2.patch, hbase-5162-trunk-v3.patch, 
 hbase-5162-trunk-v4.patch, hbase-5162-trunk-v5.patch, java_HBASE-5162.patch


 The current blocking we do when we are close to some limits (memstores over 
 the multiplier factor, too many store files, global memstore memory) is bad, 
 too coarse and confusing. After hitting HBASE-5161, it really becomes obvious 
 that we need something better.
 I did a little brainstorm with Stack, we came up quickly with two solutions:
  - Send some exception to the client, like OverloadedException, that's thrown 
 when some situation happens like getting past the low memory barrier. It 
 would be thrown when the client gets a handler and does some check while 
 putting or deleting. The client would treat this as a retryable exception but 
 ideally wouldn't check .META. for a new location. It could be fancy and have 
 multiple levels of pushback, like send the exception to 25% of the clients, 
 and then go up if the situation persists. Should be easy to implement but 
 we'll be using a lot more IO to send the payload over and over again (but at 
 least it wouldn't sit in the RS's memory).
  - Send a message alongside a successful put or delete to tell the client to 
 slow down a little, this way we don't have to do back and forth with the 
 payload between the client and the server. It's a cleaner (I think) but more 
 involved solution.
 In every case the RS should do very obvious things to notify the operators of 
 this situation, through logs, web UI, metrics, etc.
 Other ideas?
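
 A minimal sketch of the first option, assuming the check sits in the server-side 
 write path (the threshold and accounting names are illustrative, and 
 RegionTooBusyException is used only as an example of an already-retryable exception):
 {code}
 // Sketch of option 1: reject writes with a retryable exception once the server is
 // past its low-memory barrier, instead of blocking the handler.
 long globalMemstoreSize = regionServerAccounting.getGlobalMemstoreSize();
 if (globalMemstoreSize > globalMemStoreLowMark) {
   throw new RegionTooBusyException("Memstores above low-water mark ("
       + globalMemstoreSize + " > " + globalMemStoreLowMark
       + "), please back off and retry");
 }
 {code}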



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-5162) Basic client pushback mechanism

2014-11-07 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5162:
---
Attachment: hbase-5162-trunk-v6.patch

Updated patch on latest master (we were getting behind a little bit) and 
hopefully fixing the checkstyle and findbugs issues.

 Basic client pushback mechanism
 ---

 Key: HBASE-5162
 URL: https://issues.apache.org/jira/browse/HBASE-5162
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jesse Yates
 Fix For: 1.0.0

 Attachments: hbase-5162-trunk-v0.patch, hbase-5162-trunk-v1.patch, 
 hbase-5162-trunk-v2.patch, hbase-5162-trunk-v3.patch, 
 hbase-5162-trunk-v4.patch, hbase-5162-trunk-v5.patch, 
 hbase-5162-trunk-v6.patch, java_HBASE-5162.patch


 The current blocking we do when we are close to some limits (memstores over 
 the multiplier factor, too many store files, global memstore memory) is bad, 
 too coarse and confusing. After hitting HBASE-5161, it really becomes obvious 
 that we need something better.
 I did a little brainstorm with Stack, we came up quickly with two solutions:
  - Send some exception to the client, like OverloadedException, that's thrown 
 when some situation happens like getting past the low memory barrier. It 
 would be thrown when the client gets a handler and does some check while 
 putting or deleting. The client would treat this as a retryable exception but 
 ideally wouldn't check .META. for a new location. It could be fancy and have 
 multiple levels of pushback, like send the exception to 25% of the clients, 
 and then go up if the situation persists. Should be easy to implement but 
 we'll be using a lot more IO to send the payload over and over again (but at 
 least it wouldn't sit in the RS's memory).
  - Send a message alongside a successful put or delete to tell the client to 
 slow down a little, this way we don't have to do back and forth with the 
 payload between the client and the server. It's a cleaner (I think) but more 
 involved solution.
 In every case the RS should do very obvious things to notify the operators of 
 this situation, through logs, web UI, metrics, etc.
 Other ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-5162) Basic client pushback mechanism

2014-11-07 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5162:
---
Status: Patch Available  (was: Open)

 Basic client pushback mechanism
 ---

 Key: HBASE-5162
 URL: https://issues.apache.org/jira/browse/HBASE-5162
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jesse Yates
 Fix For: 1.0.0

 Attachments: hbase-5162-trunk-v0.patch, hbase-5162-trunk-v1.patch, 
 hbase-5162-trunk-v2.patch, hbase-5162-trunk-v3.patch, 
 hbase-5162-trunk-v4.patch, hbase-5162-trunk-v5.patch, 
 hbase-5162-trunk-v6.patch, java_HBASE-5162.patch


 The current blocking we do when we are close to some limits (memstores over 
 the multiplier factor, too many store files, global memstore memory) is bad, 
 too coarse and confusing. After hitting HBASE-5161, it really becomes obvious 
 that we need something better.
 I did a little brainstorm with Stack, we came up quickly with two solutions:
  - Send some exception to the client, like OverloadedException, that's thrown 
 when some situation happens like getting past the low memory barrier. It 
 would be thrown when the client gets a handler and does some check while 
 putting or deleting. The client would treat this as a retryable exception but 
 ideally wouldn't check .META. for a new location. It could be fancy and have 
 multiple levels of pushback, like send the exception to 25% of the clients, 
 and then go up if the situation persists. Should be easy to implement but 
 we'll be using a lot more IO to send the payload over and over again (but at 
 least it wouldn't sit in the RS's memory).
  - Send a message alongside a successful put or delete to tell the client to 
 slow down a little, this way we don't have to do back and forth with the 
 payload between the client and the server. It's a cleaner (I think) but more 
 involved solution.
 In every case the RS should do very obvious things to notify the operators of 
 this situation, through logs, web UI, metrics, etc.
 Other ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10201) Port 'Make flush decisions per column family' to trunk

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10201:
---
Fix Version/s: 0.98.9

 Port 'Make flush decisions per column family' to trunk
 --

 Key: HBASE-10201
 URL: https://issues.apache.org/jira/browse/HBASE-10201
 Project: HBase
  Issue Type: Improvement
  Components: wal
Reporter: Ted Yu
Assignee: zhangduo
Priority: Critical
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: 3149-trunk-v1.txt, HBASE-10201-0.98.patch, 
 HBASE-10201-0.98_1.patch, HBASE-10201-0.98_2.patch, HBASE-10201-0.99.patch, 
 HBASE-10201.patch, HBASE-10201_1.patch, HBASE-10201_2.patch, 
 HBASE-10201_3.patch, HBASE-10201_4.patch, HBASE-10201_5.patch


 Currently the flush decision is made using the aggregate size of all column 
 families. When large and small column families co-exist, this causes many 
 small flushes of the smaller CF. We need to make per-CF flush decisions.
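
 For illustration, a per-CF decision would look roughly like this (a sketch only; 
 the store collection and size accessors vary by branch, and perFamilyFlushThreshold 
 is an illustrative setting):
 {code}
 // Sketch: when a flush is triggered, select only the column families whose
 // memstores are actually large, rather than flushing every store in the region.
 // 'stores' is the region's collection of Store objects.
 List<Store> storesToFlush = new ArrayList<Store>();
 for (Store store : stores) {
   if (store.getMemStoreSize() > perFamilyFlushThreshold) {
     storesToFlush.add(store);
   }
 }
 if (storesToFlush.isEmpty()) {
   storesToFlush.addAll(stores); // fall back to flushing everything
 }
 {code}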



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202582#comment-14202582
 ] 

Andrew Purtell commented on HBASE-12346:


Yes, a new SLG. The proposed change to EnforcingScanLabelGenerator would remove 
its essential feature. 

It doesn't have to be complex to configure from the user's perspective; we could 
provide canned shortcut configuration strings that expand into SLG stacks. 
Documentation would be good. 
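
A bare-bones sketch of such an SLG, just to make the shape concrete (only the 
ScanLabelGenerator interface is taken from the existing code; fetchAdministeredAuths() 
is a hypothetical stand-in for the real label lookup, and imports are omitted):
{code}
public class DefaultAuthsScanLabelGenerator implements ScanLabelGenerator {
  private Configuration conf;

  @Override
  public void setConf(Configuration conf) { this.conf = conf; }

  @Override
  public Configuration getConf() { return conf; }

  @Override
  public List<String> getLabels(User user, Authorizations authorizations) {
    if (authorizations != null && !authorizations.getLabels().isEmpty()) {
      return authorizations.getLabels(); // the scan asked for specific auths
    }
    // No explicit auths on the scan: fall back to everything administered to the user.
    return fetchAdministeredAuths(user); // hypothetical helper
  }

  private List<String> fetchAdministeredAuths(User user) {
    // In a real implementation this would consult the visibility label
    // service/cache for the labels granted to this user.
    return Collections.emptyList();
  }
}
{code}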

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12346-master-v2.patch, 
 HBASE-12346-master-v3.patch, HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) are administered and 
 associated with a user.
 During a scan, a user can normally only see cell data that is part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicate that it wants to use those auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS => ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting.  Every scan needs to call 
 setAuthorizations before the scan can get any cells, even when the cells are under 
 labels the requesting user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This shows no cells.
 2. scan 'table1', AUTHORIZATIONS => ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, more backward compatible (for client applications), and less 
 surprising default behavior should probably look like this:
 A scan's default auths, if its Authorizations attribute is not set 
 explicitly, should be all the auths the requesting user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filters the auths 
 during the scan: use the input auths minus whatever is not in the user's label set 
 on the server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12424) Finer grained logging and metrics for split transactions

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12424:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to 0.98+. Thanks for the review [~jesse_yates]

 Finer grained logging and metrics for split transactions
 

 Key: HBASE-12424
 URL: https://issues.apache.org/jira/browse/HBASE-12424
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: 
 0001-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0002-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0003-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 HBASE-12424-0.98.patch, HBASE-12424.patch, HBASE-12424.patch, 
 HBASE-12424.patch, HowHBaseRegionSplitsareImplemented.pdf


 A split transaction is a complex orchestration of activity between the 
 RegionServer, Master, ZooKeeper, and HDFS NameNode. We have some visibility 
 into the time taken by various phases of the split transaction in the logs. 
 We will see "Starting split of region $PARENT" before the transaction begins, 
 before the parent is offlined. Later we will see "Opening $DAUGHTER" as one 
 of the last steps in the transaction; this is after the parent has been 
 flushed, offlined, and closed. Finally "Region split, hbase:meta updated, 
 and report to master ... Split took $TIME" is logged after all steps are complete 
 and reports the total running time of the transaction. 
 For debugging the cause(s) of long running split transactions it would be 
 useful to know the distribution of time spent in all of the phases of the 
 split transaction. 
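
 A sketch of the kind of per-phase instrumentation this implies (phase names and the 
 helper are illustrative, not the committed change):
 {code}
 // Sketch: record the elapsed time of each named phase of the split transaction so
 // the distribution can be logged and fed into metrics.
 class PhaseTimer {
   private final Map<String, Long> durations = new LinkedHashMap<String, Long>();
   private long last = System.currentTimeMillis();

   void mark(String phase) {
     long now = System.currentTimeMillis();
     durations.put(phase, now - last);
     last = now;
   }

   public String toString() { return durations.toString(); }
 }
 // usage inside the split transaction (illustrative):
 //   timer.mark("close parent"); ... timer.mark("open daughters");
 //   LOG.debug("Split timings for " + parent + ": " + timer);
 {code}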



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12336) RegionServer failed to shutdown for NodeFailoverWorker thread

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12336:
---
Fix Version/s: (was: 0.98.9)
   0.98.8

 RegionServer failed to shutdown for NodeFailoverWorker thread
 -

 Key: HBASE-12336
 URL: https://issues.apache.org/jira/browse/HBASE-12336
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.11
Reporter: Liu Shaohui
Assignee: Liu Shaohui
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.94.25, 0.99.2

 Attachments: HBASE-12336-trunk-v1.diff, stack


 After enabling hbase.zookeeper.useMulti in an hbase cluster, we found that a 
 regionserver failed to shut down. All other threads had exited except a 
 NodeFailoverWorker thread.
 {code}
 ReplicationExecutor-0 prio=10 tid=0x7f0d40195ad0 nid=0x73a in 
 Object.wait() [0x7f0dc8fe6000]
java.lang.Thread.State: WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:485)
 at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1309)
 - locked 0x0005a16df080 (a 
 org.apache.zookeeper.ClientCnxn$Packet)
 at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:930)
 at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:912)
 at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:531)
 at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1518)
 at 
 org.apache.hadoop.hbase.replication.ReplicationZookeeper.copyQueuesFromRSUsingMulti(ReplicationZookeeper.java:804)
 at 
 org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager$NodeFailoverWorker.run(ReplicationSourceManager.java:612)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The shutdown method of the executor is definitely called in 
 ReplicationSourceManager#join.
  
 I am looking for the root cause; suggestions are welcome. Thanks
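
 One generic way to keep a stuck worker from blocking shutdown is to bound the wait 
 and then interrupt it; a sketch only, not the eventual fix:
 {code}
 // Sketch: shutdown() alone does not help when a task is blocked inside
 // ZooKeeper.multi(); bound the wait and then interrupt the worker thread.
 executor.shutdown();
 try {
   if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
     executor.shutdownNow(); // interrupts the thread stuck in the zk call
   }
 } catch (InterruptedException e) {
   executor.shutdownNow();
   Thread.currentThread().interrupt();
 }
 {code}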



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12381) Add maven enforcer rules for build assumptions

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12381:
---
Fix Version/s: (was: 0.98.9)
   0.98.8

 Add maven enforcer rules for build assumptions
 --

 Key: HBASE-12381
 URL: https://issues.apache.org/jira/browse/HBASE-12381
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.94.25, 0.99.2

 Attachments: HBASE-12381.1.patch.txt


 our ref guide says that you need maven 3 to build. add an enforcer rule so 
 that people find out early that they have the wrong maven version, rather 
 than through however things fall over if someone tries to build with maven 2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12424) Finer grained logging and metrics for split transactions

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202590#comment-14202590
 ] 

Hudson commented on HBASE-12424:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #630 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/630/])
HBASE-12424 Finer grained logging and metrics for split transactions (apurtell: 
rev 60fb3530364364202235b3c40bdf55ff1ea459a8)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java


 Finer grained logging and metrics for split transactions
 

 Key: HBASE-12424
 URL: https://issues.apache.org/jira/browse/HBASE-12424
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: 
 0001-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0002-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0003-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 HBASE-12424-0.98.patch, HBASE-12424.patch, HBASE-12424.patch, 
 HBASE-12424.patch, HowHBaseRegionSplitsareImplemented.pdf


 A split transaction is a complex orchestration of activity between the 
 RegionServer, Master, ZooKeeper, and HDFS NameNode. We have some visibility 
 into the time taken by various phases of the split transaction in the logs. 
 We will see "Starting split of region $PARENT" before the transaction begins, 
 before the parent is offlined. Later we will see "Opening $DAUGHTER" as one 
 of the last steps in the transaction; this is after the parent has been 
 flushed, offlined, and closed. Finally "Region split, hbase:meta updated, 
 and report to master ... Split took $TIME" is logged after all steps are complete 
 and reports the total running time of the transaction. 
 For debugging the cause(s) of long running split transactions it would be 
 useful to know the distribution of time spent in all of the phases of the 
 split transaction. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12375) LoadIncrementalHFiles fails to load data in table when CF name starts with '_'

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12375:
---
Fix Version/s: (was: 0.98.9)
   0.98.8

 LoadIncrementalHFiles fails to load data in table when CF name starts with '_'
 --

 Key: HBASE-12375
 URL: https://issues.apache.org/jira/browse/HBASE-12375
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.5
Reporter: Ashish Singhi
Assignee: Ashish Singhi
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12375-0.98.patch, HBASE-12375-v2.patch, 
 HBASE-12375.patch


 We do not restrict users from creating a table with a column family name starting 
 with '_'.
 So when a user creates such a table, LoadIncrementalHFiles will skip loading 
 that family's data into the table.
 {code}
 // Skip _logs, etc
 if (familyDir.getName().startsWith("_")) continue;
 {code}
 I think we should remove that check as I do not see any _logs directory being 
 created by the bulkload tool in the output directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12376) HBaseAdmin leaks ZK connections if failure starting watchers (ConnectionLossException)

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12376:
---
Fix Version/s: (was: 0.98.9)
   0.98.8

 HBaseAdmin leaks ZK connections if failure starting watchers 
 (ConnectionLossException)
 --

 Key: HBASE-12376
 URL: https://issues.apache.org/jira/browse/HBASE-12376
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.98.7, 0.94.24
Reporter: stack
Assignee: stack
Priority: Critical
 Fix For: 0.98.8, 0.94.25

 Attachments: 
 0001-12376-HBaseAdmin-leaks-ZK-connections-if-failure-sta.patch, 
 0001-12376-HBaseAdmin-leaks-ZK-connections-if-failure-sta.version2.patch


 This is a 0.98 issue that some users have been running into, mostly while running 
 the Canary: for whatever reason, setup of the zk connection fails, usually with a 
 ConnectionLossException.  The end result is an ugly leak of zk connections.  The 
 created ZKWatcher instances are just left hanging around.
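
 The general shape of the fix is to close the watcher on any failure after it has 
 been constructed; a sketch under that assumption (not the attached patch):
 {code}
 // Sketch: if anything throws after the ZooKeeperWatcher is created (for example
 // while starting trackers/watchers), close it before rethrowing so the zk
 // connection is not leaked.
 ZooKeeperWatcher zkw = new ZooKeeperWatcher(conf, "HBaseAdmin", null);
 try {
   // ... set up trackers / watchers; may fail with ConnectionLossException ...
 } catch (Exception e) {
   zkw.close();
   throw new IOException("Failed setting up watchers, closed zk connection", e);
 }
 {code}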



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202596#comment-14202596
 ] 

Andrew Purtell commented on HBASE-12432:


I'm going to commit this momentarily

 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12432:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Pushed to 0.98+

 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202614#comment-14202614
 ] 

Nick Dimiduk commented on HBASE-12432:
--

Ran out of time yesterday and I'm still catching up with today. Thanks Andrew.

 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Nick Dimiduk (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202626#comment-14202626
 ] 

Nick Dimiduk commented on HBASE-12432:
--

While I'm at it, both this and AsyncProcess should be emitting this log at 
debug, not info. We should fold that into Sean's work.

 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12441) Export and CopyTable need to be able to keep tags/labels in cells

2014-11-07 Thread Jerry He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jerry He updated HBASE-12441:
-
Issue Type: Improvement  (was: Bug)

 Export and CopyTable need to be able to keep tags/labels in cells
 -

 Key: HBASE-12441
 URL: https://issues.apache.org/jira/browse/HBASE-12441
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce, security
Affects Versions: 0.98.7, 0.99.3
Reporter: Jerry He

 Export and CopyTable (and possibly other MR tools) currently do not carry 
 over tags/labels in cells.
 These tools should be able to keep tags/labels in cells when they back up the 
 table cells.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12439) Procedure V2

2014-11-07 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202666#comment-14202666
 ] 

stack commented on HBASE-12439:
---

Doc is great. When you have a chance, a few examples would help.

 Procedure V2
 

 Key: HBASE-12439
 URL: https://issues.apache.org/jira/browse/HBASE-12439
 Project: HBase
  Issue Type: New Feature
  Components: master
Affects Versions: 2.0.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: ProcedureV2.pdf


 Procedure v2 (aka Notification Bus) aims to provide a unified way to build:
 * multi-steps procedure with a rollback/rollforward ability in case of 
 failure (e.g. create/delete table)
 ** HBASE-12070
 * notifications across multiple machines (e.g. ACLs/Labels/Quotas cache 
 updates)
 ** Make sure that every machine has the grant/revoke/label
 ** Enforce space limit quota across the namespace
 ** HBASE-10295 eliminate permanent replication zk node
 * procedures across multiple machines (e.g. Snapshots)
 * coordinated long-running procedures (e.g. compactions, splits, ...)
 * Synchronous calls, with the ability to see the state/result in case of 
 failure.
 ** HBASE-11608 sync split
 still work in progress/initial prototype: https://reviews.apache.org/r/27703/
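 As a purely illustrative toy model of the multi-step/rollback idea above (this is not the prototype's actual API; see the review board link for that), a procedure can be thought of as an ordered list of steps, each with an undo action replayed in reverse on failure:
 {code}
 import java.util.ArrayDeque;
 import java.util.Deque;
 import java.util.List;

 final class ProcedureSketch {
   interface Step {
     void execute() throws Exception;
     void rollback();              // undo whatever execute() did
   }

   static void run(List<Step> steps) throws Exception {
     Deque<Step> done = new ArrayDeque<>();
     try {
       for (Step step : steps) {
         step.execute();
         done.push(step);          // remember completed steps for rollback
       }
     } catch (Exception e) {
       while (!done.isEmpty()) {
         done.pop().rollback();    // roll back in reverse order
       }
       throw e;
     }
   }
 }
 {code}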



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202680#comment-14202680
 ] 

Andrew Purtell commented on HBASE-12432:


Flushing pending work for 0.98 RC tonight 

 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-5162) Basic client pushback mechanism

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202694#comment-14202694
 ] 

Hadoop QA commented on HBASE-5162:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12680247/hbase-5162-trunk-v6.patch
  against trunk revision .
  ATTACHMENT ID: 12680247

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 7 new 
or modified tests.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.hadoop.hbase.http.TestHttpServerLifecycle.testStartedServerIsAlive(TestHttpServerLifecycle.java:71)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-rest.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-annotations.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11615//console

This message is automatically generated.

 Basic client pushback mechanism
 ---

 Key: HBASE-5162
 URL: https://issues.apache.org/jira/browse/HBASE-5162
 Project: HBase
  Issue Type: New Feature
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: Jesse Yates
 Fix For: 1.0.0

 Attachments: hbase-5162-trunk-v0.patch, hbase-5162-trunk-v1.patch, 
 hbase-5162-trunk-v2.patch, hbase-5162-trunk-v3.patch, 
 hbase-5162-trunk-v4.patch, hbase-5162-trunk-v5.patch, 
 hbase-5162-trunk-v6.patch, java_HBASE-5162.patch


 The current blocking we do when we are close to some limits (memstores over 
 the multiplier factor, too many store files, global memstore memory) is bad, 
 too coarse and confusing. After hitting HBASE-5161, it really becomes obvious 
 that we need something better.
 I did a little brainstorm with Stack, we came up quickly with two solutions:
  - Send some exception to the client, like OverloadedException, that's thrown 
 when some situation happens like getting past the low memory barrier. It 
 would be thrown when the client gets a handler and does some check while 
 putting or deleting. The client would treat this a retryable exception but 
 

[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202703#comment-14202703
 ] 

Hudson commented on HBASE-12432:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #631 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/631/])
HBASE-12432 RpcRetryingCaller should log after fixed number of retries like 
AsyncProcess (apurtell: rev 2d9bb9d340eeef468f74500209ea2324d5988bb8)
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java


 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12424) Finer grained logging and metrics for split transactions

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202725#comment-14202725
 ] 

Hudson commented on HBASE-12424:


FAILURE: Integrated in HBase-1.0 #444 (See 
[https://builds.apache.org/job/HBase-1.0/444/])
HBASE-12424 Finer grained logging and metrics for split transactions (apurtell: 
rev 3eed03268ff73fb67a674bcab6102d3224d44316)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java


 Finer grained logging and metrics for split transactions
 

 Key: HBASE-12424
 URL: https://issues.apache.org/jira/browse/HBASE-12424
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: 
 0001-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0002-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0003-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 HBASE-12424-0.98.patch, HBASE-12424.patch, HBASE-12424.patch, 
 HBASE-12424.patch, HowHBaseRegionSplitsareImplemented.pdf


 A split transaction is a complex orchestration of activity between the 
 RegionServer, Master, ZooKeeper, and HDFS NameNode. We have some visibility 
 into the time taken by various phases of the split transaction in the logs. 
 We will see "Starting split of region $PARENT" before the transaction begins, 
 before the parent is offlined. Later we will see "Opening $DAUGHTER" as one of 
 the last steps in the transaction; this is after the parent has been flushed, 
 offlined, and closed. Finally, "Region split, hbase:meta updated, and report 
 to master ... Split took $TIME" appears after all steps are complete and 
 includes the total running time of the transaction. 
 For debugging the cause(s) of long-running split transactions it would be 
 useful to know the distribution of time spent in each of the phases of the 
 split transaction. 
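 A hedged sketch of what per-phase timing could look like; the phase names and the reporting call are hypothetical, not the actual SplitTransaction code:
 {code}
 import java.util.EnumMap;
 import java.util.Map;
 import java.util.concurrent.Callable;

 final class SplitPhaseTimerSketch {
   enum Phase { PREPARE, CLOSE_PARENT, CREATE_DAUGHTERS, OPEN_DAUGHTERS, UPDATE_META }

   private final Map<Phase, Long> elapsedMs = new EnumMap<>(Phase.class);

   <T> T time(Phase phase, Callable<T> work) throws Exception {
     long start = System.currentTimeMillis();
     try {
       return work.call();
     } finally {
       elapsedMs.put(phase, System.currentTimeMillis() - start);
     }
   }

   Map<Phase, Long> report() {
     return elapsedMs;            // would be logged and/or exported as per-phase metrics
   }
 }
 {code}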



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12424) Finer grained logging and metrics for split transactions

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202785#comment-14202785
 ] 

Hudson commented on HBASE-12424:


FAILURE: Integrated in HBase-0.98 #661 (See 
[https://builds.apache.org/job/HBase-0.98/661/])
HBASE-12424 Finer grained logging and metrics for split transactions (apurtell: 
rev 60fb3530364364202235b3c40bdf55ff1ea459a8)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java


 Finer grained logging and metrics for split transactions
 

 Key: HBASE-12424
 URL: https://issues.apache.org/jira/browse/HBASE-12424
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: 
 0001-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0002-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0003-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 HBASE-12424-0.98.patch, HBASE-12424.patch, HBASE-12424.patch, 
 HBASE-12424.patch, HowHBaseRegionSplitsareImplemented.pdf


 A split transaction is a complex orchestration of activity between the 
 RegionServer, Master, ZooKeeper, and HDFS NameNode. We have some visibility 
 into the time taken by various phases of the split transaction in the logs. 
 We will see "Starting split of region $PARENT" before the transaction begins, 
 before the parent is offlined. Later we will see "Opening $DAUGHTER" as one of 
 the last steps in the transaction; this is after the parent has been flushed, 
 offlined, and closed. Finally, "Region split, hbase:meta updated, and report 
 to master ... Split took $TIME" appears after all steps are complete and 
 includes the total running time of the transaction. 
 For debugging the cause(s) of long-running split transactions it would be 
 useful to know the distribution of time spent in each of the phases of the 
 split transaction. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT

2014-11-07 Thread sri (JIRA)
sri created HBASE-12445:
---

 Summary: hbase is removing all remaining cells immediately after 
the cell marked with marker = KeyValue.Type.DeleteColumn via PUT
 Key: HBASE-12445
 URL: https://issues.apache.org/jira/browse/HBASE-12445
 Project: HBase
  Issue Type: Bug
Reporter: sri


Code executed:
{code}
@Test
public void testHbasePutDeleteCell() throws Exception {
  TableName tableName = TableName.valueOf("my_test");
  Configuration configuration = HBaseConfiguration.create();
  HTableInterface table = new HTable(configuration, tableName);
  final byte[] rowKey = Bytes.toBytes("12345");
  final byte[] family = Bytes.toBytes("default");
  // put one row
  Put put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
  put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d"));
  table.put(put);
  // get the row back and assert the values
  Get get = new Get(rowKey);
  Result result = table.get(get);
  assertTrue("Column A value should be a",
      Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a"));
  assertTrue("Column B value should be b",
      Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b"));
  assertTrue("Column C value should be c",
      Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c"));
  assertTrue("Column D value should be d",
      Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d"));
  // put the same row again with the C column deleted
  put = new Put(rowKey);
  put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
  put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
  KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"),
      HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn);
  put.add(marker);
  put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
  table.put(put);
  // get the row back and assert the values
  get = new Get(rowKey);
  result = table.get(get);
  assertTrue("Column A value should be a1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1"));
  assertTrue("Column B value should be b1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1"));
  assertTrue("Column C should not exist",
      result.getValue(family, Bytes.toBytes("C")) == null);
  assertTrue("Column D value should be d1",
      Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1"));
}
{code}

This assertion fails: the D cell is also deleted.
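For comparison, a hedged sketch of the more conventional way to drop column C while updating the others, reusing rowKey/family/table from the test above and the 0.98-era Delete API; whether the Put-carrying-a-DeleteColumn-marker path above should behave the same way is exactly what this issue is asking:
{code}
// Sketch only: update A, B, D via Put and remove C with a separate Delete.
Put update = new Put(rowKey);
update.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
update.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
update.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
table.put(update);

Delete delete = new Delete(rowKey);
delete.deleteColumns(family, Bytes.toBytes("C")); // all versions of C only
table.delete(delete);
{code}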




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202818#comment-14202818
 ] 

Hudson commented on HBASE-12432:


SUCCESS: Integrated in HBase-TRUNK #5754 (See 
[https://builds.apache.org/job/HBase-TRUNK/5754/])
HBASE-12432 RpcRetryingCaller should log after fixed number of retries like 
AsyncProcess (apurtell: rev fb1af86ee1700ca1e6817c0c988ec9d5da1215d2)
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestFastFailWithoutTestUtil.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java


 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12424) Finer grained logging and metrics for split transactions

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202819#comment-14202819
 ] 

Hudson commented on HBASE-12424:


SUCCESS: Integrated in HBase-TRUNK #5754 (See 
[https://builds.apache.org/job/HBase-TRUNK/5754/])
HBASE-12424 Finer grained logging and metrics for split transactions (apurtell: 
rev 7718390703fb3c193ea58d5287250e14002e9852)
* 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSource.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServer.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionServerSourceImpl.java


 Finer grained logging and metrics for split transactions
 

 Key: HBASE-12424
 URL: https://issues.apache.org/jira/browse/HBASE-12424
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: 
 0001-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0002-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 0003-HBASE-12424-Finer-grained-logging-and-metrics-for-sp.patch, 
 HBASE-12424-0.98.patch, HBASE-12424.patch, HBASE-12424.patch, 
 HBASE-12424.patch, HowHBaseRegionSplitsareImplemented.pdf


 A split transaction is a complex orchestration of activity between the 
 RegionServer, Master, ZooKeeper, and HDFS NameNode. We have some visibility 
 into the time taken by various phases of the split transaction in the logs. 
 We will see "Starting split of region $PARENT" before the transaction begins, 
 before the parent is offlined. Later we will see "Opening $DAUGHTER" as one of 
 the last steps in the transaction; this is after the parent has been flushed, 
 offlined, and closed. Finally, "Region split, hbase:meta updated, and report 
 to master ... Split took $TIME" appears after all steps are complete and 
 includes the total running time of the transaction. 
 For debugging the cause(s) of long-running split transactions it would be 
 useful to know the distribution of time spent in each of the phases of the 
 split transaction. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT

2014-11-07 Thread sri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sri updated HBASE-12445:

Attachment: TestPutAfterDeleteColumn.java

 hbase is removing all remaining cells immediately after the cell marked with 
 marker = KeyValue.Type.DeleteColumn via PUT
 

 Key: HBASE-12445
 URL: https://issues.apache.org/jira/browse/HBASE-12445
 Project: HBase
  Issue Type: Bug
Reporter: sri
 Attachments: TestPutAfterDeleteColumn.java


 Code executed:
 {code}
 @Test
 public void testHbasePutDeleteCell() throws Exception {
   TableName tableName = TableName.valueOf("my_test");
   Configuration configuration = HBaseConfiguration.create();
   HTableInterface table = new HTable(configuration, tableName);
   final byte[] rowKey = Bytes.toBytes("12345");
   final byte[] family = Bytes.toBytes("default");
   // put one row
   Put put = new Put(rowKey);
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
   put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
   put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d"));
   table.put(put);
   // get the row back and assert the values
   Get get = new Get(rowKey);
   Result result = table.get(get);
   assertTrue("Column A value should be a",
       Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a"));
   assertTrue("Column B value should be b",
       Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b"));
   assertTrue("Column C value should be c",
       Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c"));
   assertTrue("Column D value should be d",
       Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d"));
   // put the same row again with the C column deleted
   put = new Put(rowKey);
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
   KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"),
       HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn);
   put.add(marker);
   put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
   table.put(put);
   // get the row back and assert the values
   get = new Get(rowKey);
   result = table.get(get);
   assertTrue("Column A value should be a1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1"));
   assertTrue("Column B value should be b1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1"));
   assertTrue("Column C should not exist",
       result.getValue(family, Bytes.toBytes("C")) == null);
   assertTrue("Column D value should be d1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1"));
 }
 {code}
 This assertion fails: the D cell is also deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11788) hbase is not deleting the cell when a Put with a KeyValue, KeyValue.Type.Delete is submitted

2014-11-07 Thread sri (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202823#comment-14202823
 ] 

sri commented on HBASE-11788:
-

Andrew,
Created a new issue for this and linked to this bug.
HBASE-12445 hbase is removing all remaining cells immediately after the cell 
marked with marker = KeyValue.Type.DeleteColumn via PUT.

Thanks
Sri Bora

 hbase is not deleting the cell when a Put with a KeyValue, 
 KeyValue.Type.Delete is submitted
 

 Key: HBASE-11788
 URL: https://issues.apache.org/jira/browse/HBASE-11788
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.99.0, 0.96.1.1, 0.98.5, 2.0.0
 Environment: Cloudera CDH 5.1.x
Reporter: Cristian Armaselu
Assignee: Srikanth Srungarapu
 Fix For: 0.99.0, 2.0.0, 0.98.6

 Attachments: HBASE-11788-master.patch, HBASE-11788-master_v2.patch, 
 TestPutAfterDeleteColumn.java, TestPutWithDelete.java


 Code executed:
 {code}
 @Test
 public void testHbasePutDeleteCell() throws Exception {
   TableName tableName = TableName.valueOf("my_test");
   Configuration configuration = HBaseConfiguration.create();
   HTableInterface table = new HTable(configuration, tableName);
   final String rowKey = "12345";
   final byte[] family = Bytes.toBytes("default");
   // put one row
   Put put = new Put(Bytes.toBytes(rowKey));
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
   put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
   table.put(put);
   // get the row back and assert the values
   Get get = new Get(Bytes.toBytes(rowKey));
   Result result = table.get(get);
   Assert.isTrue(Bytes.toString(result.getValue(family,
       Bytes.toBytes("A"))).equals("a"), "Column A value should be a");
   Assert.isTrue(Bytes.toString(result.getValue(family,
       Bytes.toBytes("B"))).equals("b"), "Column B value should be b");
   Assert.isTrue(Bytes.toString(result.getValue(family,
       Bytes.toBytes("C"))).equals("c"), "Column C value should be c");
   // put the same row again with the C column deleted
   put = new Put(Bytes.toBytes(rowKey));
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
   put.add(new KeyValue(Bytes.toBytes(rowKey), family,
       Bytes.toBytes("C"), HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn));
   table.put(put);
   // get the row back and assert the values
   get = new Get(Bytes.toBytes(rowKey));
   result = table.get(get);
   Assert.isTrue(Bytes.toString(result.getValue(family,
       Bytes.toBytes("A"))).equals("a"), "Column A value should be a");
   Assert.isTrue(Bytes.toString(result.getValue(family,
       Bytes.toBytes("B"))).equals("b"), "Column B value should be b");
   Assert.isTrue(result.getValue(family, Bytes.toBytes("C")) == null,
       "Column C should not exist");
 }
 {code}
 This assertion fails; the cell is not deleted but rather the value is empty:
 {code}
 hbase(main):029:0> scan 'my_test'
 ROW       COLUMN+CELL
  12345    column=default:A, timestamp=1408473082290, value=a
  12345    column=default:B, timestamp=1408473082290, value=b
  12345    column=default:C, timestamp=1408473082290, value=
 {code}
 This behavior is different from the previous 4.8.x Cloudera version and is 
 currently corrupting all Hive queries involving "is null" or "is not null" 
 operators on the columns mapped to HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202825#comment-14202825
 ] 

Niels Basjes commented on HBASE-12279:
--

Ehhh, does it or does it not increase the javac compiler warnings? A hiccup in 
Jenkins? 
{code}-1 javac. The applied patch generated 108 javac compiler warnings (more 
than the trunk's current 102 warnings).
+1 javac. The applied patch does not increase the total number of javac 
compiler warnings.{code}

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the Java code generated from the Thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12438) Add -Dsurefire.rerunFailingTestsCount=2 to patch build runs so flakies get rerun

2014-11-07 Thread Manukranth Kolloju (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202845#comment-14202845
 ] 

Manukranth Kolloju commented on HBASE-12438:


Does Hudson report that it has rerun the tests, or does it just give us a blue 
build without any hint of failing tests?

 Add  -Dsurefire.rerunFailingTestsCount=2 to patch build runs so flakies get 
 rerun
 -

 Key: HBASE-12438
 URL: https://issues.apache.org/jira/browse/HBASE-12438
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: stack
Assignee: stack
 Fix For: 2.0.0

 Attachments: 12438.txt


 Tripped over this config today:
  -Dsurefire.rerunFailingTestsCount=
 I made a test fail, then pass, and I got this output:
 {code}
  Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Flakes: 1
 {code}
 Notice the 'Flakes' addition on the far-right.
 Let me enable this on hadoopqa builds. Hopefully this will help make it so new 
 contributors are not frightened off by flakies, thinking their patch is the cause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12438) Add -Dsurefire.rerunFailingTestsCount=2 to patch build runs so flakies get rerun

2014-11-07 Thread Dima Spivak (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202849#comment-14202849
 ] 

Dima Spivak commented on HBASE-12438:
-

Out of the box, Jenkins doesn't treat flakey tests in a special way, but [there 
is a 
plugin|https://wiki.jenkins-ci.org/display/JENKINS/Flaky+Test+Handler+Plugin] 
that can change this. Perhaps worth checking with our friends at b.o.a. to get 
this set up, [~stack]?

 Add  -Dsurefire.rerunFailingTestsCount=2 to patch build runs so flakies get 
 rerun
 -

 Key: HBASE-12438
 URL: https://issues.apache.org/jira/browse/HBASE-12438
 Project: HBase
  Issue Type: Task
  Components: test
Reporter: stack
Assignee: stack
 Fix For: 2.0.0

 Attachments: 12438.txt


 Tripped over this config today:
  -Dsurefire.rerunFailingTestsCount=
 I made a test fail, then pass, and I got this output:
 {code}
  Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Flakes: 1
 {code}
 Notice the 'Flakes' addition on the far-right.
 Let me enable this on hadoopqa builds. Hopefully this will help make it so new 
 contributors are not frightened off by flakies, thinking their patch is the cause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12446) [list | abort] Compactions

2014-11-07 Thread Manukranth Kolloju (JIRA)
Manukranth Kolloju created HBASE-12446:
--

 Summary: [list | abort] Compactions
 Key: HBASE-12446
 URL: https://issues.apache.org/jira/browse/HBASE-12446
 Project: HBase
  Issue Type: New Feature
Affects Versions: 1.0.0
Reporter: Manukranth Kolloju
 Fix For: 1.0.0


In some cases, we need to be able to quickly reduce load on a server without 
killing it. Compaction is one of the critical processes that takes up a lot of 
CPU and disk IOPS. We should have a way to list the compactions running on a 
given regionserver, abort a compaction given a regionserver and a compaction 
id, and additionally abort all compactions.

Pardon me if there is already a similar Jira; I'd be happy to merge this there.

The current code handles interrupts. We should be able to interrupt the thread 
that is performing the compaction and abort it from either the UI or the 
command line. This Jira is targeted at exposing an admin function to perform 
such a task.
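A purely hypothetical sketch of the kind of admin surface this proposal implies; none of these methods exist today and the names are made up:
{code}
import java.io.IOException;
import java.util.List;

interface CompactionAdminSketch {
  /** Identifiers of compactions currently running on the given regionserver. */
  List<String> listCompactions(String regionServerName) throws IOException;

  /** Interrupt a single running compaction by id. */
  void abortCompaction(String regionServerName, String compactionId) throws IOException;

  /** Interrupt every compaction running on the given regionserver. */
  void abortAllCompactions(String regionServerName) throws IOException;
}
{code}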



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12445) hbase is removing all remaining cells immediately after the cell marked with marker = KeyValue.Type.DeleteColumn via PUT

2014-11-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202870#comment-14202870
 ] 

Ted Yu commented on HBASE-12445:


Can you formulate the new test as a patch ?

 hbase is removing all remaining cells immediately after the cell marked with 
 marker = KeyValue.Type.DeleteColumn via PUT
 

 Key: HBASE-12445
 URL: https://issues.apache.org/jira/browse/HBASE-12445
 Project: HBase
  Issue Type: Bug
Reporter: sri
 Attachments: TestPutAfterDeleteColumn.java


 Code executed:
 {code}
 @Test
 public void testHbasePutDeleteCell() throws Exception {
   TableName tableName = TableName.valueOf("my_test");
   Configuration configuration = HBaseConfiguration.create();
   HTableInterface table = new HTable(configuration, tableName);
   final byte[] rowKey = Bytes.toBytes("12345");
   final byte[] family = Bytes.toBytes("default");
   // put one row
   Put put = new Put(rowKey);
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b"));
   put.add(family, Bytes.toBytes("C"), Bytes.toBytes("c"));
   put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d"));
   table.put(put);
   // get the row back and assert the values
   Get get = new Get(rowKey);
   Result result = table.get(get);
   assertTrue("Column A value should be a",
       Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a"));
   assertTrue("Column B value should be b",
       Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b"));
   assertTrue("Column C value should be c",
       Bytes.toString(result.getValue(family, Bytes.toBytes("C"))).equals("c"));
   assertTrue("Column D value should be d",
       Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d"));
   // put the same row again with the C column deleted
   put = new Put(rowKey);
   put.add(family, Bytes.toBytes("A"), Bytes.toBytes("a1"));
   put.add(family, Bytes.toBytes("B"), Bytes.toBytes("b1"));
   KeyValue marker = new KeyValue(rowKey, family, Bytes.toBytes("C"),
       HConstants.LATEST_TIMESTAMP, KeyValue.Type.DeleteColumn);
   put.add(marker);
   put.add(family, Bytes.toBytes("D"), Bytes.toBytes("d1"));
   table.put(put);
   // get the row back and assert the values
   get = new Get(rowKey);
   result = table.get(get);
   assertTrue("Column A value should be a1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("A"))).equals("a1"));
   assertTrue("Column B value should be b1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("B"))).equals("b1"));
   assertTrue("Column C should not exist",
       result.getValue(family, Bytes.toBytes("C")) == null);
   assertTrue("Column D value should be d1",
       Bytes.toString(result.getValue(family, Bytes.toBytes("D"))).equals("d1"));
 }
 {code}
 This assertion fails: the D cell is also deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202878#comment-14202878
 ] 

Hudson commented on HBASE-12432:


SUCCESS: Integrated in HBase-1.0 #445 (See 
[https://builds.apache.org/job/HBase-1.0/445/])
HBASE-12432 RpcRetryingCaller should log after fixed number of retries like 
AsyncProcess (apurtell: rev df3ba6ea4b33962145803678d369c476b6ba5817)
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestFastFailWithoutTestUtil.java
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java


 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12447) Add support for setTimeRange for CopyTable, RowCounter and CellCounter

2014-11-07 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12447:
-

 Summary: Add support for setTimeRange for CopyTable, RowCounter 
and CellCounter
 Key: HBASE-12447
 URL: https://issues.apache.org/jira/browse/HBASE-12447
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HBASE-12447) Add support for setTimeRange for CopyTable, RowCounter and CellCounter

2014-11-07 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez reassigned HBASE-12447:
-

Assignee: Esteban Gutierrez

 Add support for setTimeRange for CopyTable, RowCounter and CellCounter
 --

 Key: HBASE-12447
 URL: https://issues.apache.org/jira/browse/HBASE-12447
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor

 It would be nice to copy a subset of data to a remote cluster based on time 
 range or just count the rows/cells also for a time range.
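 Since RowCounter and CellCounter are Scan-based MapReduce jobs, the change presumably amounts to constraining the Scan they build; a minimal sketch, assuming hypothetical startTime/endTime command-line arguments and using the existing Scan.setTimeRange client API:
 {code}
 import java.io.IOException;
 import org.apache.hadoop.hbase.client.Scan;

 final class TimeRangeScanSketch {
   // startTime/endTime would come from new command-line arguments (hypothetical);
   // Scan.setTimeRange takes [minStamp, maxStamp) in milliseconds.
   static Scan buildScan(long startTime, long endTime) throws IOException {
     Scan scan = new Scan();
     scan.setTimeRange(startTime, endTime);
     return scan;
   }
 }
 {code}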



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-12448:
--

 Summary: Fix rate reporting in compaction progress DEBUG logging
 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2


HBASE-11702 introduced rate reporting at DEBUG level for long running 
compactions but failed to align bytesWritten with the reporting interval. 
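A hedged sketch of the rate computation implied here (not the actual compactor code): divide the bytes written since the last progress report by the length of that interval, rather than dividing the running total by it:
{code}
final class CompactionRateSketch {
  private long lastReportBytes = 0;
  private long lastReportMillis = System.currentTimeMillis();

  // totalBytesWritten is the compaction's running total at the time of the report.
  String progressLine(long totalBytesWritten) {
    long now = System.currentTimeMillis();
    long deltaBytes = totalBytesWritten - lastReportBytes;
    double seconds = Math.max(1, now - lastReportMillis) / 1000.0;
    lastReportBytes = totalBytesWritten;
    lastReportMillis = now;
    return String.format("wrote %d bytes total, %.2f MB/sec over the last interval",
        totalBytesWritten, (deltaBytes / (1024.0 * 1024.0)) / seconds);
  }
}
{code}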



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202916#comment-14202916
 ] 

Andrew Purtell commented on HBASE-12440:


All o.a.h.h.master.** and o.a.h.h.regionserver.** tests pass on 0.98 and 
branch-1. TestAssignmentManagerOnCluster passes 10 out of 10 times on 0.98 and 
branch-1.

Going to push this to both branches shortly



 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.1

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with ZK assignment.
 On clean startup, while the master was doing a bulk assign, one of the region 
 servers died. The bulk assigner then tried to assign the region individually 
 using AssignCallable. AssignCallable does a forceStateToOffline() and skips 
 assigning, as it wants the SSH (ServerShutdownHandler) to do the assignment:
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition:
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In ZK-less assignment, both the bulk assigner (invoking AssignCallable) and 
 the SSH may try to assign the region. But since they go through a lock, only 
 one will succeed, so this doesn't seem to be an issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12447) Add support for setTimeRange for RowCounter and CellCounter

2014-11-07 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez updated HBASE-12447:
--
Description: It would be nice to count the rows/cells also for a time 
range. Copy Table already supports that.  (was: It would be nice to copy a 
subset of data to a remote cluster based on time range or just count the 
rows/cells also for a time range.)
Summary: Add support for setTimeRange for RowCounter and CellCounter  
(was: Add support for setTimeRange for CopyTable, RowCounter and CellCounter)

 Add support for setTimeRange for RowCounter and CellCounter
 ---

 Key: HBASE-12447
 URL: https://issues.apache.org/jira/browse/HBASE-12447
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor

 It would be nice to count the rows/cells also for a time range. CopyTable 
 already supports that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12447) Add support for setTimeRange for CopyTable, RowCounter and CellCounter

2014-11-07 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez updated HBASE-12447:
--
Description: It would be nice to copy a subset of data to a remote cluster 
based on time range or just count the rows/cells also for a time range.

 Add support for setTimeRange for CopyTable, RowCounter and CellCounter
 --

 Key: HBASE-12447
 URL: https://issues.apache.org/jira/browse/HBASE-12447
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Priority: Minor

 It would be nice to copy a subset of data to a remote cluster based on time 
 range or just count the rows/cells also for a time range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12432) RpcRetryingCaller should log after fixed number of retries like AsyncProcess

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202944#comment-14202944
 ] 

Hudson commented on HBASE-12432:


FAILURE: Integrated in HBase-0.98 #662 (See 
[https://builds.apache.org/job/HBase-0.98/662/])
HBASE-12432 RpcRetryingCaller should log after fixed number of retries like 
AsyncProcess (apurtell: rev 2d9bb9d340eeef468f74500209ea2324d5988bb8)
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCaller.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java


 RpcRetryingCaller should log after fixed number of retries like AsyncProcess
 

 Key: HBASE-12432
 URL: https://issues.apache.org/jira/browse/HBASE-12432
 Project: HBase
  Issue Type: Improvement
  Components: Client
Reporter: Nick Dimiduk
Assignee: Nick Dimiduk
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12432.00-0.98.patch, HBASE-12432.00.patch, 
 HBASE-12432.01-0.98.patch, HBASE-12432.01.patch


 Scanner retry is handled by RpcRetryingCaller. This is different from multi, 
 which is handled by AsyncProcess. AsyncProcess will start logging operation 
 status after hbase.client.start.log.errors.counter retries have been 
 attempted. Let's bring the same functionality over to Scanner path.
 Noticed this while debugging IntegrationTestMTTR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-12440.

   Resolution: Fixed
Fix Version/s: (was: 0.99.1)
   0.99.2
 Hadoop Flags: Reviewed

 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with ZK assignment.
 On clean startup, while the master was doing a bulk assign, one of the region 
 servers died. The bulk assigner then tried to assign the region individually 
 using AssignCallable. AssignCallable does a forceStateToOffline() and skips 
 assigning, as it wants the SSH (ServerShutdownHandler) to do the assignment:
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition:
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In ZK-less assignment, both the bulk assigner (invoking AssignCallable) and 
 the SSH may try to assign the region. But since they go through a lock, only 
 one will succeed, so this doesn't seem to be an issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12449) Use the max timestamp of current or old cell's timestamp in HRegion.append()

2014-11-07 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-12449:
-

 Summary: Use the max timestamp of current or old cell's timestamp 
in HRegion.append()
 Key: HBASE-12449
 URL: https://issues.apache.org/jira/browse/HBASE-12449
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 0.98.8, 0.99.2


We have observed an issue on SLES clusters where the system timestamp regularly 
goes back in time. This happens frequently enough to cause test failures when 
LTT (LoadTestTool) is used with the updater.

Every time a mutation is performed, the updater creates a string of the form 
#column:mutation_type and appends it to the mutate_info column.

It seems that when the test fails, it is always the case that the information 
for the particular column reported is missing from the mutate_info column. 
However, according to the MultiThreadedUpdater source code, if a row gets 
updated, all of its columns are mutated. So if a row contains 15 columns, all 
15 should appear in mutate_info.

When the test fails though, we get an exception like: 
{code}
2014-11-02 04:31:12,018 ERROR [HBaseReaderThread_7] util.MultiThreadedAction: 
Error checking data for key [b0485292cde20d8a76cca37410a9f115-23787], column 
family [test_cf], column [8], mutation [null]; value of length 818
{code}

For the same row, the mutate info DOES NOT contain columns 8 (and 9) while it 
should: 
{code}
 test_cf:mutate_info timestamp=1414902651388, 
value=#increment:1#0:0#1:0#10:3#11:0#12:3#13:0#14:0#15:0#16:2#2:3#3:0#4:2#5:3#6:0#7:0
 
{code}

Further debugging led to the root cause: it seems that on SUSE, 
System.currentTimeMillis() can go back in time freely (especially when run in a 
virtualized env like EC2), and this actually happens very frequently. 

This is from a debug log that was put in place: 
{code}
2014-11-04 01:16:05,025 INFO  
[B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765025/Put/mvcc=8239/#increment:1
2014-11-04 01:16:05,038 INFO  
[B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765038/Put/mvcc=8255/#increment:1#0:3
2014-11-04 01:16:05,047 INFO  
[B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765047/Put/mvcc=8265/#increment:1#0:3#1:3
2014-11-04 01:16:05,057 INFO  
[B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765056/Put/mvcc=8274/#increment:1#0:3#1:3#10:2
2014-11-04 01:16:05,061 INFO  [B.DefaultRpcServer.handler=6,queue=0,port=60020] 
regionserver.MemStore: upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765061/Put/mvcc=8278/#increment:1#0:3#1:3#10:2#11:0
2014-11-04 01:16:05,070 INFO  
[B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765070/Put/mvcc=8285/#increment:1#0:3#1:3#10:2#11:0#12:3
2014-11-04 01:16:05,076 INFO  [B.DefaultRpcServer.handler=3,queue=0,port=60020] 
regionserver.MemStore: upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765076/Put/mvcc=8289/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0
2014-11-04 01:16:05,084 INFO  [B.DefaultRpcServer.handler=2,queue=2,port=60020] 
regionserver.MemStore: upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765084/Put/mvcc=8293/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0
2014-11-04 01:16:05,090 INFO  [B.DefaultRpcServer.handler=7,queue=1,port=60020] 
regionserver.MemStore: upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765090/Put/mvcc=8297/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0
2014-11-04 01:16:05,097 INFO  [B.DefaultRpcServer.handler=0,queue=0,port=60020] 
regionserver.MemStore: upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765097/Put/mvcc=8301/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0
2014-11-04 01:16:05,100 INFO  
[B.DefaultRpcServer.handler=14,queue=2,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765100/Put/mvcc=8303/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0
2014-11-04 01:16:05,103 INFO  
[B.DefaultRpcServer.handler=11,queue=2,port=60020] regionserver.MemStore: 
upserting: 
193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765103/Put/mvcc=8305/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0#18:0
2014-11-04 01:16:05,110 INFO  

[jira] [Updated] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12448:
---
Attachment: HBASE-12448-0.98.patch

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12448:
---
Attachment: HBASE-12448.patch

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12448:
---
Status: Patch Available  (was: Open)

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202964#comment-14202964
 ] 

Andrew Purtell commented on HBASE-12448:


I was trying to save a local variable before but messed up. Just add one for 
tracking bytes written for the compaction progress report, if DEBUG logging is 
enabled. Also use EnvironmentEdgeManager#currentTime instead of 
System#currentTimeMillis.
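Roughly, the idea described above could look like this (bytesWrittenInPeriod, 
lastMillis and REPORT_INTERVAL_MS are illustrative names, not the attached patch):
{code}
long now = EnvironmentEdgeManager.currentTime(); // instead of System.currentTimeMillis()
bytesWrittenInPeriod += len;                     // len: bytes just written by this compaction step
if (LOG.isDebugEnabled() && now - lastMillis >= REPORT_INTERVAL_MS) {
  double seconds = (now - lastMillis) / 1000.0;
  LOG.debug("Compaction progress: " + progress + ", rate="
      + (bytesWrittenInPeriod / 1024.0 / seconds) + " kB/sec");
  bytesWrittenInPeriod = 0; // reset the per-interval counter so the rate stays aligned
  lastMillis = now;
}
{code}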

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)
Virag Kothari created HBASE-12450:
-

 Summary: Unbalance chaos monkey might kill all region servers 
without starting them back
 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor


UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
then starts the killed servers again. But if the balance fails, an exception is 
thrown and the region servers are never started. For me, the balance always 
kept failing with a socket timeout (default 1 min) because the master runs one 
balance iteration for 5 mins (default config). Eventually all servers are 
killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12449) Use the max timestamp of current or old cell's timestamp in HRegion.append()

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12449:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Use the max timestamp of current or old cell's timestamp in HRegion.append()
 

 Key: HBASE-12449
 URL: https://issues.apache.org/jira/browse/HBASE-12449
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 0.98.9, 0.99.2


 We have observed an issue in SLES clusters where the system timestamp 
 regularly goes back in time. This happens frequently enough to cause test 
 failures when LTT is used with the updater. 
 Every time a mutation is performed, the updater creates a string in the form 
 #column:mutation_type and appends it to the column mutate_info. 
 It seems that when the test fails, it is always the case that the mutation 
 information for the reported column is missing from the mutate_info column. 
 However, according to the MultiThreadedUpdater source code, if a row gets 
 updated, all the columns will be mutated. So if a row contains 15 columns, 
 all 15 should appear in mutate_info. 
 When the test fails though, we get an exception like: 
 {code}
 2014-11-02 04:31:12,018 ERROR [HBaseReaderThread_7] util.MultiThreadedAction: 
 Error checking data for key [b0485292cde20d8a76cca37410a9f115-23787], column 
 family [test_cf], column [8], mutation [null]; value of length 818
 {code}
 For the same row, the mutate info DOES NOT contain columns 8 (and 9) while it 
 should: 
 {code}
  test_cf:mutate_info timestamp=1414902651388, 
 value=#increment:1#0:0#1:0#10:3#11:0#12:3#13:0#14:0#15:0#16:2#2:3#3:0#4:2#5:3#6:0#7:0
  
 {code}
 Further debugging led to the root cause: it seems that on SUSE, 
 System.currentTimeMillis() can go back in time freely (especially when run in 
 a virtualized env like EC2), and this actually happens very frequently. 
 This is from a debug log that was put in place: 
 {code}
 2014-11-04 01:16:05,025 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765025/Put/mvcc=8239/#increment:1
 2014-11-04 01:16:05,038 INFO  
 [B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765038/Put/mvcc=8255/#increment:1#0:3
 2014-11-04 01:16:05,047 INFO  
 [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765047/Put/mvcc=8265/#increment:1#0:3#1:3
 2014-11-04 01:16:05,057 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765056/Put/mvcc=8274/#increment:1#0:3#1:3#10:2
 2014-11-04 01:16:05,061 INFO  
 [B.DefaultRpcServer.handler=6,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765061/Put/mvcc=8278/#increment:1#0:3#1:3#10:2#11:0
 2014-11-04 01:16:05,070 INFO  
 [B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765070/Put/mvcc=8285/#increment:1#0:3#1:3#10:2#11:0#12:3
 2014-11-04 01:16:05,076 INFO  
 [B.DefaultRpcServer.handler=3,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765076/Put/mvcc=8289/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0
 2014-11-04 01:16:05,084 INFO  
 [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765084/Put/mvcc=8293/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0
 2014-11-04 01:16:05,090 INFO  
 [B.DefaultRpcServer.handler=7,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765090/Put/mvcc=8297/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0
 2014-11-04 01:16:05,097 INFO  
 [B.DefaultRpcServer.handler=0,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765097/Put/mvcc=8301/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0
 2014-11-04 01:16:05,100 INFO  
 [B.DefaultRpcServer.handler=14,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765100/Put/mvcc=8303/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0#17:0
 2014-11-04 

[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12450:
--
Attachment: HBASE-12450.patch

Attached is a patch for master which just logs a warning if the balance fails.
It also includes one unrelated log statement change.
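For illustration, the warn-and-continue behavior could look roughly like this 
(startKilledServers() is a hypothetical helper, not the attached patch):
{code}
try {
  admin.balancer(); // HBaseAdmin#balancer() can time out while the master is still balancing
} catch (Exception e) {
  LOG.warn("Balance failed, continuing so the killed region servers get restarted", e);
}
startKilledServers(); // hypothetical helper for the start step of the chaos action
{code}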

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12450:
--
Fix Version/s: 0.99.2
   0.98.8

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202982#comment-14202982
 ] 

Andrew Purtell commented on HBASE-12450:


+1

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12450:
---
Fix Version/s: 2.0.0

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12346) Scan's default auths behavior under Visibility labels

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12346:
---
Fix Version/s: (was: 0.98.8)
   0.98.9
   2.0.0

 Scan's default auths behavior under Visibility labels
 -

 Key: HBASE-12346
 URL: https://issues.apache.org/jira/browse/HBASE-12346
 Project: HBase
  Issue Type: Bug
  Components: API, security
Affects Versions: 0.98.7, 0.99.1
Reporter: Jerry He
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12346-master-v2.patch, 
 HBASE-12346-master-v3.patch, HBASE-12346-master.patch


 In Visibility Labels security, a set of labels (auths) is administered and 
 associated with a user.
 A user can normally only see cells during a scan whose labels are part of the 
 user's label set (auths).
 Scan uses setAuthorizations to indicate it wants to use those auths to access 
 the cells.
 Similarly in the shell:
 {code}
 scan 'table1', AUTHORIZATIONS => ['private']
 {code}
 But it is a surprise to find that setAuthorizations seems to be 'mandatory' 
 in the default visibility label security setting. Every scan needs to call 
 setAuthorizations before it can get any cells, even when the cells are under 
 labels the requesting user is part of.
 The following steps will illustrate the issue:
 Run as superuser.
 {code}
 1. create a visibility label called 'private'
 2. create 'table1'
 3. put into 'table1' data and label the data as 'private'
 4. set_auths 'user1', 'private'
 5. grant 'user1', 'RW', 'table1'
 {code}
 Run as 'user1':
 {code}
 1. scan 'table1'
 This shows no cells.
 2. scan 'table1', AUTHORIZATIONS => ['private']
 This will show all the data.
 {code}
 I am not sure if this is expected by design or a bug.
 But a more reasonable, less surprising default behavior that is also more 
 backward compatible for client applications should probably look like this:
 A scan's default auths, if its Authorizations attribute is not set 
 explicitly, should be all the auths the requesting user is administered and 
 allowed on the server.
 If scan.setAuthorizations is used, then the server further filters the auths 
 during the scan: it uses the input auths minus whatever is not in the user's 
 label set on the server.
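 A minimal sketch of the proposed default, with illustrative names 
 (requestedAuths, getUserAuths); this is not existing VisibilityController code:
 {code}
 // requestedAuths: what the client passed to scan.setAuthorizations(), or null if unset
 List<String> userAuths = getUserAuths(requestUser); // labels administered for the user
 List<String> effective;
 if (requestedAuths == null) {
   effective = userAuths;                            // default: everything the user may see
 } else {
   effective = new ArrayList<String>(requestedAuths);
   effective.retainAll(userAuths);                   // drop anything outside the user's label set
 }
 {code}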



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12431) Use of getColumnLatestCell(byte[], int, int, byte[], int, int) is Not Thread Safe

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12431:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Use of getColumnLatestCell(byte[], int, int, byte[], int, int) is Not Thread 
 Safe
 -

 Key: HBASE-12431
 URL: https://issues.apache.org/jira/browse/HBASE-12431
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.98.1
Reporter: Jonathan Jarvis
Assignee: Jingcheng Du
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12431-V2.diff, HBASE-12431-V3.diff, 
 HBASE-12431.diff


 Result declares that it is NOT THREAD SAFE at the top of the source code, but 
 one would assume that refers to many different threads accessing the same 
 Result object. I've run into an issue where several different threads, each 
 accessing their own Result object, still collide because of the use of a 
 common static member variable.
 I noticed the problem when I switched from
 getColumnLatestCell(byte[], byte[]) to
 getColumnLatestCell(byte[], int, int, byte[], int, int).
 These methods call different binarySearch methods, the latter invoking:
 protected int binarySearch(final Cell[] kvs, final byte[] family, 
 final int foffset, final int flength, final byte[] qualifier, 
 final int qoffset, final int qlength)
 This method utilizes a private static member variable called buffer.
 If more than one thread is utilizing buffer, you'll see unpredictable 
 behavior unless you synchronize on Result.class.
 If buffer is to remain a static variable, I would recommend changing it to a 
 ThreadLocal<byte[]> instead.
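 A minimal sketch of that ThreadLocal alternative; the sizing constant is an 
 assumption, not the actual Result internals:
 {code}
 private static final ThreadLocal<byte[]> localBuffer = new ThreadLocal<byte[]>() {
   @Override
   protected byte[] initialValue() {
     return new byte[MAX_KEY_BUFFER_SIZE]; // hypothetical sizing constant
   }
 };
 // binarySearch would then call localBuffer.get(), so each thread builds its
 // search key in its own buffer and concurrent Result instances no longer race.
 {code}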



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12425) Document the phases of the split transaction

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12425:
---
Fix Version/s: (was: 0.99.2)
   (was: 0.98.8)

 Document the phases of the split transaction
 

 Key: HBASE-12425
 URL: https://issues.apache.org/jira/browse/HBASE-12425
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Reporter: Andrew Purtell
Assignee: Misty Stanley-Jones
 Fix For: 2.0.0


 See PDF document attached to parent issue



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12319) Inconsistencies during region recovery due to close/open of a region during recovery

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12319:
---
Fix Version/s: (was: 0.98.8)
   0.98.9
   2.0.0

 Inconsistencies during region recovery due to close/open of a region during 
 recovery
 

 Key: HBASE-12319
 URL: https://issues.apache.org/jira/browse/HBASE-12319
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.7, 0.99.1
Reporter: Devaraj Das
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12319.patch


 In one of my test runs, I saw the following:
 {noformat}
 2014-10-14 13:45:30,782 DEBUG 
 [StoreOpener-51af4bd23dc32a940ad2dd5435f00e1d-1] regionserver.HStore: loaded 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/test_cf/d6df5cfe15ca41d68c619489fbde4d04,
  isReference=false, isBulkLoadResult=false, seqid=141197, majorCompaction=true
 2014-10-14 13:45:30,788 DEBUG [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Found 3 recovered edits file(s) under 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d
 .
 .
 2014-10-14 13:45:31,916 WARN  [RS_OPEN_REGION-hor9n01:60020-1] 
 regionserver.HRegion: Null or non-existent edits file: 
 hdfs://hor9n01.gq1.ygridcore.net:8020/apps/hbase/data/data/default/IntegrationTestIngest/51af4bd23dc32a940ad2dd5435f00e1d/recovered.edits/0198080
 {noformat}
 The above log is from a regionserver, say RS2. From the initial analysis it 
 seemed like the master asked a certain regionserver to open the region (let's 
 say RS1) and for some reason asked it to close soon after. The open was still 
 proceeding on RS1 but the master reassigned the region to RS2. This also 
 started recovery on RS2, but it ended up seeing an inconsistent view of the 
 recovered-edits files (it reports missing files, as per the log above) since 
 the first regionserver (RS1) deleted some files after it completed its 
 recovery. When RS2 really opens the region, it might not see the recent data 
 that was written by flushes on hor9n10 during the recovery process. Reads of 
 that data would be inconsistent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202990#comment-14202990
 ] 

Lars Hofhansl commented on HBASE-12448:
---

Can we write more than 2 GB in one minute?
That'd be just 35.8 MB/s, so I guess the answer is yes. So 
bytesWrittenInProgress should be a long.
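A quick check of that arithmetic (Integer.MAX_VALUE bytes over a one-minute 
interval), for illustration:
{code}
long bytes = Integer.MAX_VALUE;                 // ~2 GB, where an int counter overflows
double mbPerSec = bytes / 60.0 / (1000 * 1000); // ~35.8 MB/s, easily sustained by a compaction
{code}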


 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12450:
--
Status: Patch Available  (was: Open)

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12450:
--
Attachment: HBASE-12450-0.98.patch

Thanks for the quick review, Andrew.
Attached is the patch for 0.98. The patch for master applies cleanly to branch-1.


 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202993#comment-14202993
 ] 

Andrew Purtell commented on HBASE-12279:


Running the commands listed by [~nielsbasjes] above and committing to 0.98+ now.

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the Java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12279:
---
Assignee: Niels Basjes

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the Java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202994#comment-14202994
 ] 

Hudson commented on HBASE-12440:


FAILURE: Integrated in HBase-0.98-on-Hadoop-1.1 #632 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/632/])
HBASE-12440 Region may remain offline on clean startup under certain race 
condition (Virag Kothari) (apurtell: rev 
d2eb3cf3fa4897333f08dc87e6b830cca5d375ad)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java


 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with zk assignment.
 On clean startup, the master was doing a bulk assign when one of the region 
 servers died. The bulk assigner then tried to assign the affected region 
 individually using AssignCallable. AssignCallable does a forceStateToOffline() 
 and skips the assignment because it wants the SSH to do it.
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH won't assign it, as the region is offline but not in transition
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In zk-less assignment, both the bulk assigner (via AssignCallable) and the SSH 
 may try to assign the region. But since they go through a lock, only one will 
 succeed, so this does not seem to be an issue there. 
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203001#comment-14203001
 ] 

Hadoop QA commented on HBASE-12450:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12680323/HBASE-12450-0.98.patch
  against trunk revision .
  ATTACHMENT ID: 12680323

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/11617//console

This message is automatically generated.

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14202993#comment-14202993
 ] 

Andrew Purtell edited comment on HBASE-12279 at 11/8/14 12:22 AM:
--

Running the commands listed by [~nielsbasjes] above and committing to 0.94+ now.


was (Author: apurtell):
Running the commands listed by [~nielsbasjes] above and committing to 0.98+ now.

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the Java code generated from the thrift files has been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12448:
---
Status: Open  (was: Patch Available)

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12448) Fix rate reporting in compaction progress DEBUG logging

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12448:
---
Attachment: HBASE-12448.patch
HBASE-12448-0.98.patch

Updated patches.

Made 'bytesWritten' a long too.

 Fix rate reporting in compaction progress DEBUG logging
 ---

 Key: HBASE-12448
 URL: https://issues.apache.org/jira/browse/HBASE-12448
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12448-0.98.patch, HBASE-12448-0.98.patch, 
 HBASE-12448.patch, HBASE-12448.patch


 HBASE-11702 introduced rate reporting at DEBUG level for long running 
 compactions but failed to align bytesWritten with the reporting interval. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12449) Use the max timestamp of current or old cell's timestamp in HRegion.append()

2014-11-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-12449:
--
Attachment: hbase-12449.patch
hbase-12449-0.98.patch

Here is a simple patch which ensures that on append() the new cell's ts is the 
max of the current time and the old cell's time. If they are equal, the new 
cell will always sort first due to its seqId being higher. 
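The rule boils down to something like the following sketch (illustrative names, 
not the attached patch):
{code}
// Never let an appended cell's timestamp go backwards relative to the existing cell.
private static long chooseAppendTimestamp(long now, Cell oldCell) {
  return oldCell == null ? now : Math.max(now, oldCell.getTimestamp());
}
// If the chosen timestamp equals the old one, the new cell still sorts first
// because it carries a higher sequence id.
{code}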

 Use the max timestamp of current or old cell's timestamp in HRegion.append()
 

 Key: HBASE-12449
 URL: https://issues.apache.org/jira/browse/HBASE-12449
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: hbase-12449-0.98.patch, hbase-12449.patch


 We have observed an issue in SLES clusters where the system timestamp 
 regularly goes back in time. This happens frequently enough to cause test 
 failures when LTT is used with the updater. 
 Every time a mutation is performed, the updater creates a string in the form 
 #column:mutation_type and appends it to the column mutate_info. 
 It seems that when the test fails, it is always the case that the mutation 
 information for the reported column is missing from the mutate_info column. 
 However, according to the MultiThreadedUpdater source code, if a row gets 
 updated, all the columns will be mutated. So if a row contains 15 columns, 
 all 15 should appear in mutate_info. 
 When the test fails though, we get an exception like: 
 {code}
 2014-11-02 04:31:12,018 ERROR [HBaseReaderThread_7] util.MultiThreadedAction: 
 Error checking data for key [b0485292cde20d8a76cca37410a9f115-23787], column 
 family [test_cf], column [8], mutation [null]; value of length 818
 {code}
 For the same row, the mutate info DOES NOT contain columns 8 (and 9) while it 
 should: 
 {code}
  test_cf:mutate_info timestamp=1414902651388, 
 value=#increment:1#0:0#1:0#10:3#11:0#12:3#13:0#14:0#15:0#16:2#2:3#3:0#4:2#5:3#6:0#7:0
  
 {code}
 Further debugging led to the root cause: it seems that on SUSE, 
 System.currentTimeMillis() can go back in time freely (especially when run in 
 a virtualized env like EC2), and this actually happens very frequently. 
 This is from a debug log that was put in place: 
 {code}
 2014-11-04 01:16:05,025 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765025/Put/mvcc=8239/#increment:1
 2014-11-04 01:16:05,038 INFO  
 [B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765038/Put/mvcc=8255/#increment:1#0:3
 2014-11-04 01:16:05,047 INFO  
 [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765047/Put/mvcc=8265/#increment:1#0:3#1:3
 2014-11-04 01:16:05,057 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765056/Put/mvcc=8274/#increment:1#0:3#1:3#10:2
 2014-11-04 01:16:05,061 INFO  
 [B.DefaultRpcServer.handler=6,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765061/Put/mvcc=8278/#increment:1#0:3#1:3#10:2#11:0
 2014-11-04 01:16:05,070 INFO  
 [B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765070/Put/mvcc=8285/#increment:1#0:3#1:3#10:2#11:0#12:3
 2014-11-04 01:16:05,076 INFO  
 [B.DefaultRpcServer.handler=3,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765076/Put/mvcc=8289/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0
 2014-11-04 01:16:05,084 INFO  
 [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765084/Put/mvcc=8293/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0
 2014-11-04 01:16:05,090 INFO  
 [B.DefaultRpcServer.handler=7,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765090/Put/mvcc=8297/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0
 2014-11-04 01:16:05,097 INFO  
 [B.DefaultRpcServer.handler=0,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765097/Put/mvcc=8301/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0
 

[jira] [Updated] (HBASE-12449) Use the max timestamp of current or old cell's timestamp in HRegion.append()

2014-11-07 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-12449:
--
Status: Patch Available  (was: Open)

 Use the max timestamp of current or old cell's timestamp in HRegion.append()
 

 Key: HBASE-12449
 URL: https://issues.apache.org/jira/browse/HBASE-12449
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: hbase-12449-0.98.patch, hbase-12449.patch


 We have observed an issue in SLES clusters where the system timestamp 
 regularly goes back in time. This happens frequently enough to cause test 
 failures when LTT is used with the updater. 
 Every time a mutation is performed, the updater creates a string in the form 
 #column:mutation_type and appends it to the column mutate_info. 
 It seems that when the test fails, it is always the case that the mutation 
 information for the reported column is missing from the mutate_info column. 
 However, according to the MultiThreadedUpdater source code, if a row gets 
 updated, all the columns will be mutated. So if a row contains 15 columns, 
 all 15 should appear in mutate_info. 
 When the test fails though, we get an exception like: 
 {code}
 2014-11-02 04:31:12,018 ERROR [HBaseReaderThread_7] util.MultiThreadedAction: 
 Error checking data for key [b0485292cde20d8a76cca37410a9f115-23787], column 
 family [test_cf], column [8], mutation [null]; value of length 818
 {code}
 For the same row, the mutate info DOES NOT contain columns 8 (and 9) while it 
 should: 
 {code}
  test_cf:mutate_info timestamp=1414902651388, 
 value=#increment:1#0:0#1:0#10:3#11:0#12:3#13:0#14:0#15:0#16:2#2:3#3:0#4:2#5:3#6:0#7:0
  
 {code}
 Further debugging led to the root cause: it seems that on SUSE, 
 System.currentTimeMillis() can go back in time freely (especially when run in 
 a virtualized env like EC2), and this actually happens very frequently. 
 This is from a debug log that was put in place: 
 {code}
 2014-11-04 01:16:05,025 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765025/Put/mvcc=8239/#increment:1
 2014-11-04 01:16:05,038 INFO  
 [B.DefaultRpcServer.handler=19,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765038/Put/mvcc=8255/#increment:1#0:3
 2014-11-04 01:16:05,047 INFO  
 [B.DefaultRpcServer.handler=21,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765047/Put/mvcc=8265/#increment:1#0:3#1:3
 2014-11-04 01:16:05,057 INFO  
 [B.DefaultRpcServer.handler=27,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765056/Put/mvcc=8274/#increment:1#0:3#1:3#10:2
 2014-11-04 01:16:05,061 INFO  
 [B.DefaultRpcServer.handler=6,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765061/Put/mvcc=8278/#increment:1#0:3#1:3#10:2#11:0
 2014-11-04 01:16:05,070 INFO  
 [B.DefaultRpcServer.handler=20,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765070/Put/mvcc=8285/#increment:1#0:3#1:3#10:2#11:0#12:3
 2014-11-04 01:16:05,076 INFO  
 [B.DefaultRpcServer.handler=3,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765076/Put/mvcc=8289/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0
 2014-11-04 01:16:05,084 INFO  
 [B.DefaultRpcServer.handler=2,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765084/Put/mvcc=8293/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0
 2014-11-04 01:16:05,090 INFO  
 [B.DefaultRpcServer.handler=7,queue=1,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765090/Put/mvcc=8297/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0
 2014-11-04 01:16:05,097 INFO  
 [B.DefaultRpcServer.handler=0,queue=0,port=60020] regionserver.MemStore: 
 upserting: 
 193002e668758ea9762904da1a22337c-1268/test_cf:mutate_info/1415063765097/Put/mvcc=8301/#increment:1#0:3#1:3#10:2#11:0#12:3#13:0#14:0#15:0#16:0
 2014-11-04 01:16:05,100 INFO  
 [B.DefaultRpcServer.handler=14,queue=2,port=60020] regionserver.MemStore: 
 upserting: 
 

[jira] [Updated] (HBASE-12254) Document limitations related to pluggable replication endpoint feature usage in 0.98

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12254:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Document limitations related to pluggable replication endpoint feature usage 
 in 0.98
 

 Key: HBASE-12254
 URL: https://issues.apache.org/jira/browse/HBASE-12254
 Project: HBase
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 0.98.7
Reporter: ramkrishna.s.vasudevan
 Fix For: 0.98.9


 The pluggable replication endpoint in 0.98 will need documentation on how 
 exactly it can be used, because of limitations we may have due to mixed-version 
 compatibility, where the peers may be on an older version of 0.98 that does 
 not have the pluggable replication endpoint. 
 Also, this feature adds some more data to the znodes, like the name of the 
 endpoint implementation, its data and the replication config. A peer cluster 
 with the older version will not be able to read this data, particularly when a 
 custom replication is configured. This JIRA aims at documenting such cases for 
 the ease of the user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12450) Unbalance chaos monkey might kill all region servers without starting them back

2014-11-07 Thread Virag Kothari (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virag Kothari updated HBASE-12450:
--
Attachment: HBASE-12450.patch

Reattaching the master patch as precommit ran against the 0.98 patch.

 Unbalance chaos monkey might kill all region servers without starting them 
 back
 ---

 Key: HBASE-12450
 URL: https://issues.apache.org/jira/browse/HBASE-12450
 Project: HBase
  Issue Type: Bug
Reporter: Virag Kothari
Assignee: Virag Kothari
Priority: Minor
 Fix For: 2.0.0, 0.98.8, 0.99.2

 Attachments: HBASE-12450-0.98.patch, HBASE-12450.patch, 
 HBASE-12450.patch


 UnbalanceKillAndRebalanceAction kills region servers, runs the balancer, and 
 then starts the killed servers again. But if the balance fails, an exception 
 is thrown and the region servers are never started. For me, the balance always 
 kept failing with a socket timeout (default 1 min) because the master runs one 
 balance iteration for 5 mins (default config). Eventually all servers are 
 killed but never started again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12168) Document Rest gateway SPNEGO-based authentication for client

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12168:
---
Fix Version/s: (was: 0.99.2)
   (was: 0.98.8)
   2.0.0

 Document Rest gateway SPNEGO-based authentication for client
 

 Key: HBASE-12168
 URL: https://issues.apache.org/jira/browse/HBASE-12168
 Project: HBase
  Issue Type: Task
  Components: documentation, REST, security
Reporter: Jerry He
 Fix For: 2.0.0


 After HBASE-5050, we seem to support SPNEGO-based authentication from the 
 client on the REST gateway. But I had a tough time finding the info.
 The support is not mentioned in the security book, where we still 
 have:
 bq. It should be possible for clients to authenticate with the HBase cluster 
 through the REST gateway in a pass-through manner via SPEGNO HTTP 
 authentication. This is future work.
 The release note in HBASE-5050 seems to be obsolete as well; e.g. the 
 hbase.rest.kerberos.spnego.principal setting appears to be obsolete.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11979) Compaction progress reporting is wrong

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11979:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Compaction progress reporting is wrong
 --

 Key: HBASE-11979
 URL: https://issues.apache.org/jira/browse/HBASE-11979
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell
Assignee: Esteban Gutierrez
Priority: Minor
 Fix For: 2.0.0, 0.98.9, 0.99.2


 This is a long-standing problem that could previously be observed in 
 regionserver metrics, but we recently added logging for long-running 
 compactions, and this has exposed the issue in a new way, e.g.
 {noformat}
 2014-09-15 14:20:59,450 DEBUG 
 [regionserver8120-largeCompactions-1410813534627]
 compactions.Compactor: Compaction progress: 22683625/6808179 (333.18%), 
 rate=162.08 kB/sec
 {noformat}
 The 'rate' reported in such logging is consistent and what we were really 
 after, but the progress indication is clearly broken and should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12148) Remove TimeRangeTracker as point of contention when many threads writing a Store

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12148:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Remove TimeRangeTracker as point of contention when many threads writing a 
 Store
 

 Key: HBASE-12148
 URL: https://issues.apache.org/jira/browse/HBASE-12148
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Affects Versions: 2.0.0, 0.99.1
Reporter: stack
Assignee: stack
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: 
 0001-In-AtomicUtils-change-updateMin-and-updateMax-to-ret.patch, 
 12148.addendum.txt, 12148.txt, 12148.txt, 12148v2.txt, 12148v2.txt, Screen 
 Shot 2014-10-01 at 3.39.46 PM.png, Screen Shot 2014-10-01 at 3.41.07 PM.png






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12442) Bring KeyValue#createFirstOnRow() back to branch-1 as deprecated methods

2014-11-07 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203059#comment-14203059
 ] 

Enis Soztutar commented on HBASE-12442:
---

+1 if Phoenix does not compile with 0.99. 

 Bring KeyValue#createFirstOnRow() back to branch-1 as deprecated methods
 

 Key: HBASE-12442
 URL: https://issues.apache.org/jira/browse/HBASE-12442
 Project: HBase
  Issue Type: Task
Affects Versions: 0.99.0
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.99.2

 Attachments: 12442-v1.patch, 12442-v2.patch


 KeyValue.createFirstOnRow() methods are used by downstream projects such as 
 Phoenix.
 They haven't been deprecated in the 0.98 branch.
 This JIRA brings KeyValue.createFirstOnRow() back to branch-1 as deprecated 
 methods. They have been removed in the master branch (hbase 2.0).
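 For illustration, bringing one such method back as a deprecated shim could look 
 roughly like this (a sketch, not the attached patch):
 {code}
 /**
  * @deprecated Kept for downstream users such as Phoenix; removed again in 2.0.
  */
 @Deprecated
 public static KeyValue createFirstOnRow(final byte[] row) {
   // A cell that sorts before any real cell on the given row.
   return new KeyValue(row, null, null, HConstants.LATEST_TIMESTAMP, KeyValue.Type.Maximum);
 }
 {code}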



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-10919) [VisibilityController] ScanLabelGenerator using LDAP

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-10919:
---
Fix Version/s: (was: 0.98.8)
   0.98.9
   2.0.0

 [VisibilityController] ScanLabelGenerator using LDAP
 

 Key: HBASE-10919
 URL: https://issues.apache.org/jira/browse/HBASE-10919
 Project: HBase
  Issue Type: New Feature
Reporter: Andrew Purtell
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: slides-10919.pdf


 A ScanLabelGenerator that queries an external service, using the LDAP 
 protocol, for a set of attributes corresponding to the principal represented 
 by the request UGI, and converts any returned in the response to additional 
 auths in the effective set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11639) [Visibility controller] Replicate the visibility of Cells as strings

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11639:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 [Visibility controller] Replicate the visibility of Cells as strings
 

 Key: HBASE-11639
 URL: https://issues.apache.org/jira/browse/HBASE-11639
 Project: HBase
  Issue Type: Improvement
  Components: Replication, security
Affects Versions: 0.98.4
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
  Labels: VisibilityLabels
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-11639_v2.patch, HBASE-11639_v2.patch, 
 HBASE-11639_v3.patch, HBASE-11639_v3.patch, HBASE-11639_v5.patch


 This issue is aimed at persisting the visibility labels as strings in the WAL 
 rather than Label ordinals.  This would help in replicating the label 
 ordinals to the replication cluster as strings directly and also that after 
 HBASE-11553 would help because the replication cluster could have an 
 implementation as string based visibility labels.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12128) Cache configuration and RpcController selection for Table in Connection

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12128:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Cache configuration and RpcController selection for Table in Connection
 ---

 Key: HBASE-12128
 URL: https://issues.apache.org/jira/browse/HBASE-12128
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
 Fix For: 2.0.0, 0.98.9, 0.99.2


 Creating Table instances should be lightweight. Apps that manage their own 
 Connections are expected to create Tables on demand for each interaction. 
 However we look up values from Hadoop Configuration when constructing Table 
 objects for storing to some of its fields. Configuration is a heavyweight 
 registry that does a lot of string operations and regex matching. Method 
 calls into Configuration account for 48.25% of CPU time when creating the 
 HTable object in 0.98. Another ~48% of CPU is spent constructing the desired 
 RpcController object via reflection in 0.98. Together this can account for 
 ~20% of total on-CPU time of the client. See parent issue for more detail.
 We are using Connection like a factory for Table. We should cache 
 configuration for Table in Connection. We should also create by reflection 
 once and cache the desired RpcController object, and clone it for new Tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12279) Generated thrift files were generated with the wrong parameters

2014-11-07 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203080#comment-14203080
 ] 

Andrew Purtell commented on HBASE-12279:


'mvn generate-sources -Pcompile-thrift' works for 0.98 and higher. We are 
missing this for 0.94. I regenerated files for 0.94 using version 0.8.0 of the 
compiler by hand. 

Regenerated Thrift for 0.98+ with compiler Thrift version 0.9.0

*0.98 tests*

{noformat}
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.372 sec
Running org.apache.hadoop.hbase.thrift.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 58.219 sec
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.955 sec
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.931 sec

Results :

Tests run: 59, Failures: 0, Errors: 0, Skipped: 0
{noformat}

*branch-1 tests*

{noformat}
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 60.84 sec - in 
org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Running org.apache.hadoop.hbase.thrift.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 51.037 sec - in 
org.apache.hadoop.hbase.thrift.TestThriftServer
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.799 sec - 
in org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.776 sec - in 
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels

Results :

Tests run: 60, Failures: 0, Errors: 0, Skipped: 0
{noformat}

*master tests*

{noformat}
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.82 sec - in 
org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Running org.apache.hadoop.hbase.thrift.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 53.724 sec - in 
org.apache.hadoop.hbase.thrift.TestThriftServer
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.98 sec - in 
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.261 sec - in 
org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandlerWithLabels

Results :

Tests run: 60, Failures: 0, Errors: 0, Skipped: 0

{noformat}

Regenerated Thrift for 0.94 with compiler Thrift version 0.8.0. Built this 
version of the compiler from the Thrift 0.8.0 distribution tarball downloaded 
from archive.apache.org. 

*0.94 tests*

{noformat}
Running org.apache.hadoop.hbase.thrift.TestThriftServerCmdLine
Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 317.191 sec
Running org.apache.hadoop.hbase.thrift.TestThriftServer
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.331 sec
Running org.apache.hadoop.hbase.thrift2.TestThriftHBaseServiceHandler
Tests run: 19, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.159 sec

Results :

Tests run: 40, Failures: 0, Errors: 0, Skipped: 0
{noformat}

Going to commit 0.98+ shortly unless objection.

Going to commit 0.94 over the weekend probably, ping [~lhofhansl]

 Generated thrift files were generated with the wrong parameters
 ---

 Key: HBASE-12279
 URL: https://issues.apache.org/jira/browse/HBASE-12279
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0, 0.98.0, 0.99.0
Reporter: Niels Basjes
Assignee: Niels Basjes
 Fix For: 2.0.0, 0.98.8, 0.94.26, 0.99.2

 Attachments: HBASE-12279-2014-10-16-v1.patch, 
 HBASE-12279-2014-11-07-v2.patch


 It turns out that the java code generated from the thrift files have been 
 generated with the wrong settings.
 Instead of the documented 
 ([thrift|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift/package-summary.html],
  
 [thrift2|http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/thrift2/package-summary.html])
  
 {code}
 thrift -strict --gen java:hashcode 
 {code}
 the current files seem to be generated instead with
 {code}
 thrift -strict --gen java
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11819) Unit test for CoprocessorHConnection

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11819:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Unit test for CoprocessorHConnection 
 -

 Key: HBASE-11819
 URL: https://issues.apache.org/jira/browse/HBASE-11819
 Project: HBase
  Issue Type: Test
Reporter: Andrew Purtell
Assignee: Talat UYARER
Priority: Minor
  Labels: newbie++
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-11819.patch, HBASE-11819v2.patch, 
 HBASE-11819v3.patch, HBASE-11819v4-0.98.patch, HBASE-11819v4-branch-1.patch, 
 HBASE-11819v4-master.patch, HBASE-11819v4-master.patch


 Add a unit test to hbase-server that exercises CoprocessorHConnection . 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12173) Backport: [PE] Allow random value size

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12173:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Backport: [PE] Allow random value size
 --

 Key: HBASE-12173
 URL: https://issues.apache.org/jira/browse/HBASE-12173
 Project: HBase
  Issue Type: Sub-task
  Components: Performance
Reporter: Lars Hofhansl
 Fix For: 0.94.26, 0.98.9






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12053) SecurityBulkLoadEndPoint set 777 permission on input data files

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12053:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 SecurityBulkLoadEndPoint set 777 permission on input data files 
 

 Key: HBASE-12053
 URL: https://issues.apache.org/jira/browse/HBASE-12053
 Project: HBase
  Issue Type: Bug
Reporter: Jeffrey Zhong
Assignee: Jeffrey Zhong
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-12053.patch


 We have code in SecureBulkLoadEndpoint#secureBulkLoadHFiles
 {code}
   LOG.trace(Setting permission for:  + p);
   fs.setPermission(p, PERM_ALL_ACCESS);
 {code}
 This is against the point we use staging folder for secure bulk load. 
 Currently we create a hidden staging folder which has ALL_ACCESS permission 
 and we  use doAs to move input files into staging folder. Therefore, we 
 should not set 777 permission on the original input data files but files in 
 staging folder after move. 
 This may comprise security setting especially when there is an error  we 
 move the file with 777 permission back. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-12440) Region may remain offline on clean startup under certain race condition

2014-11-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14203084#comment-14203084
 ] 

Hudson commented on HBASE-12440:


SUCCESS: Integrated in HBase-1.0 #446 (See 
[https://builds.apache.org/job/HBase-1.0/446/])
HBASE-12440 Region may remain offline on clean startup under certain race 
condition (Virag Kothari) (apurtell: rev 
87fb974765f4241026ef23f2abf4622ba372ffa9)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManagerOnCluster.java


 Region may remain offline on clean startup under certain race condition
 ---

 Key: HBASE-12440
 URL: https://issues.apache.org/jira/browse/HBASE-12440
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Reporter: Virag Kothari
Assignee: Virag Kothari
 Fix For: 0.98.8, 0.99.2

 Attachments: HBASE-12440-0.98.patch, HBASE-12440-0.98_v2.patch, 
 HBASE-12440-branch-1.patch


 Saw this in prod some time back with zk assignment
 On clean startup, while master was doing bulk assign while one of the region 
 servers dies. The bulk assigner then tried to assign it individually using 
 AssignCallable. The AssignCallable does a forceStateToOffline() and skips 
 assigning as it wants the SSH to do the assignment
 {code}
 2014-10-16 16:05:23,593 DEBUG master.AssignmentManager [AM.-pool1-t1] : 
 Offline 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  no need to unassign since it's on a dead server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 2014-10-16 16:05:23,593  INFO master.RegionStates [AM.-pool1-t1] : Transition 
 {1f1620174d2542fe7d5b034f3311c3a8 state=PENDING_OPEN, ts=1413475519482, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016} to 
 {1f1620174d2542fe7d5b034f3311c3a8 state=OFFLINE, ts=1413475523593, 
 server=gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016}
 2014-10-16 16:05:23,598  INFO master.AssignmentManager [AM.-pool1-t1] : Skip 
 assigning 
 sieve_main:inlinks,com.cbslocal.seattle/photo-galleries/category/consumer///:http\x09com.cbslocal.seattle/photo-galleries/category/tailgate-fan///:http,1413464068567.1f1620174d2542fe7d5b034f3311c3a8.,
  it is on a dead but not processed yet server: 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 But the SSH wont assign as the region is offline but not in transition
 {code}
 2014-10-16 16:05:24,606  INFO handler.ServerShutdownHandler 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Reassigning 0 region(s) that 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016 was carrying (and 0 
 regions(s) that were opening on this server)
 2014-10-16 16:05:24,606 DEBUG master.DeadServer 
 [MASTER_SERVER_OPERATIONS-hbbl874n38:50510-0] : Finished processing 
 gsbl872n06.blue.ygrid.yahoo.com,50511,1413475494016
 {code}
 In zk-less assignment, the bulk assigner invoking AssignCallable and the SSH 
 may try to assign the region. But as they go through lock, only one will 
 succeed and doesn't seem to be an issue. 
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12163) Move test annotation classes to the same package as in master

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12163:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Move test annotation classes to the same package as in master
 -

 Key: HBASE-12163
 URL: https://issues.apache.org/jira/browse/HBASE-12163
 Project: HBase
  Issue Type: Bug
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Priority: Trivial
 Fix For: 0.98.9, 0.99.2


 Test classe annotations (SmallTests, etc) are in different packages in master 
 vs 0.98 and branch-1 making backporting difficult. 
 Lets move them to the same package. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-12223) MultiTableInputFormatBase.getSplits is too slow

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-12223:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 MultiTableInputFormatBase.getSplits is too slow
 ---

 Key: HBASE-12223
 URL: https://issues.apache.org/jira/browse/HBASE-12223
 Project: HBase
  Issue Type: Improvement
  Components: Client
Affects Versions: 0.94.15
Reporter: shanwen
Assignee: YuanBo Peng
Priority: Minor
 Fix For: 2.0.0, 0.94.26, 0.98.9, 0.99.2

 Attachments: HBASE-12223.patch


 when use Multiple scan,getSplits is too slow,800 scans take five minutes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-11996) Add Table Creator to the HTD

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-11996:
---
Fix Version/s: (was: 0.98.8)
   0.98.9

 Add Table Creator to the HTD
 --

 Key: HBASE-11996
 URL: https://issues.apache.org/jira/browse/HBASE-11996
 Project: HBase
  Issue Type: New Feature
  Components: Admin, master, Operability
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Trivial
 Fix For: 2.0.0, 0.98.9, 0.99.2


 It will be nice storing the user who created the table. It is useful in 
 situations where you want to remove a table but you don't know who asking to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-11962) Port HBASE-11897 Add append and remove peer table-cfs cmds for replication to 0.98

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-11962.

   Resolution: Not a Problem
Fix Version/s: (was: 0.98.8)

No progress, resolving as NaP

 Port HBASE-11897 Add append and remove peer table-cfs cmds for replication 
 to 0.98
 

 Key: HBASE-11962
 URL: https://issues.apache.org/jira/browse/HBASE-11962
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Priority: Minor

 This issue is to backport the commands for appending and removing peer 
 table-cfs for replication to 0.98
 Two new commands, append_peer_tableCFs and remove_peer_tableCFs, are added to 
 do the operation of adding and removing a table/table-column family.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-9531) a command line (hbase shell) interface to retreive the replication metrics and show replication lag

2014-11-07 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-9531:
--
Fix Version/s: (was: 0.98.8)
   0.98.9

 a command line (hbase shell) interface to retreive the replication metrics 
 and show replication lag
 ---

 Key: HBASE-9531
 URL: https://issues.apache.org/jira/browse/HBASE-9531
 Project: HBase
  Issue Type: New Feature
  Components: Replication
Affects Versions: 0.99.0
Reporter: Demai Ni
Assignee: Demai Ni
 Fix For: 2.0.0, 0.98.9, 0.99.2

 Attachments: HBASE-9531-master-v1.patch, HBASE-9531-master-v1.patch, 
 HBASE-9531-master-v1.patch, HBASE-9531-master-v2.patch, 
 HBASE-9531-master-v3.patch, HBASE-9531-master-v4.patch, 
 HBASE-9531-trunk-v0.patch, HBASE-9531-trunk-v0.patch


 This jira is to provide a command line (hbase shell) interface to retreive 
 the replication metrics info such as:ageOfLastShippedOp, 
 timeStampsOfLastShippedOp, sizeOfLogQueue ageOfLastAppliedOp, and 
 timeStampsOfLastAppliedOp. And also to provide a point of time info of the 
 lag of replication(source only)
 Understand that hbase is using Hadoop 
 metrics(http://hbase.apache.org/metrics.html), which is a common way to 
 monitor metric info. This Jira is to serve as a light-weight client 
 interface, comparing to a completed(certainly better, but heavier)GUI 
 monitoring package. I made the code works on 0.94.9 now, and like to use this 
 jira to get opinions about whether the feature is valuable to other 
 users/workshop. If so, I will build a trunk patch. 
 All inputs are greatly appreciated. Thank you!
 The overall design is to reuse the existing logic which supports hbase shell 
 command 'status', and invent a new module, called ReplicationLoad.  In 
 HRegionServer.buildServerLoad() , use the local replication service objects 
 to get their loads  which could be wrapped in a ReplicationLoad object and 
 then simply pass it to the ServerLoad. In ReplicationSourceMetrics and 
 ReplicationSinkMetrics, a few getters and setters will be created, and ask 
 Replication to build a ReplicationLoad.  (many thanks to Jean-Daniel for 
 his kindly suggestions through dev email list)
 the replication lag will be calculated for source only, and use this formula: 
 {code:title=Replication lag|borderStyle=solid}
   if sizeOfLogQueue != 0 then max(ageOfLastShippedOp, (current time - 
 timeStampsOfLastShippedOp)) //err on the large side
   else if (current time - timeStampsOfLastShippedOp)  2* 
 ageOfLastShippedOp then lag = ageOfLastShippedOp // last shipped happen 
 recently 
 else lag = 0 // last shipped may happens last night, so NO real lag 
 although ageOfLastShippedOp is non-zero
 {code}
 External will look something like:
 {code:title=status 'replication'|borderStyle=solid}
 hbase(main):001:0 status 'replication'
 version 0.94.9
 3 live servers
     hdtest017.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=14, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:49:48 PDT 2013
     SINK  :AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:48:48 PDT 2013
     hdtest018.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013
     SINK  :AgeOfLastAppliedOp=14, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:50:59 PDT 2013
     hdtest015.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013
     SINK  :AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:48:48 PDT 2013
 hbase(main):002:0 status 'replication','source'
 version 0.94.9
 3 live servers
     hdtest017.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=14, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:49:48 PDT 2013
     hdtest018.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013
     hdtest015.svl.ibm.com:
     SOURCE:PeerID=1, ageOfLastShippedOp=0, sizeOfLogQueue=0, 
 timeStampsOfLastShippedOp=Wed Sep 04 14:48:48 PDT 2013
 hbase(main):003:0 status 'replication','sink'
 version 0.94.9
 3 live servers
     hdtest017.svl.ibm.com:
     SINK  :AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:48:48 PDT 2013
     hdtest018.svl.ibm.com:
     SINK  :AgeOfLastAppliedOp=14, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:50:59 PDT 2013
     hdtest015.svl.ibm.com:
     SINK  :AgeOfLastAppliedOp=0, TimeStampsOfLastAppliedOp=Wed Sep 04 
 14:48:48 PDT 2013
 hbase(main):003:0 status 'replication','lag' 
 version 0.94.9
 3 live servers
     hdtest017.svl.ibm.com: lag = 0
     

  1   2   >