date:20180213

[jira] [Updated] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-13 Thread Gali Sheffi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gali Sheffi updated HBASE-19930:

Attachment: HBASE-19930-V06.patch

> fix ImmutableMemStoreLAB#forceCopyOfBigCellInto
> ---
>
> Key: HBASE-19930
> URL: https://issues.apache.org/jira/browse/HBASE-19930
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-beta-1
>Reporter: Gali Sheffi
>Assignee: Gali Sheffi
>Priority: Major
> Attachments: HBASE-19930-V01.patch, HBASE-19930-V02.patch, 
> HBASE-19930-V03.patch, HBASE-19930-V04.patch, HBASE-19930-V05.patch, 
> HBASE-19930-V06.patch
>
>
> This issue is about fixing ImmutableMemStoreLAB#forceCopyOfBigCellInto.
> A comment in HBASE-19133 pointed out a bug in 
> ImmutableMemStoreLAB#forceCopyOfBigCellInto: the method assumed it is never 
> called for an ImmutableMemStoreLAB and simply threw an 
> IllegalStateException whenever it was called. With this fix, the 
> forceCopyOfBigCellInto method performs the copy of big cells on the first 
> MSLABImpl in its mslabs linked list.
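
The delegation described above can be sketched roughly as follows. This is a simplified model, not the patch itself: the `Lab`, `CopyingLab` and `ImmutableLab` types are hypothetical stand-ins for the HBase classes, and cells are modeled as plain byte arrays.

```java
import java.util.ArrayList;
import java.util.List;

class ImmutableLabSketch {
    /** Hypothetical, simplified stand-in for a MemStoreLAB; a cell is just a byte[]. */
    interface Lab {
        byte[] forceCopyOfBigCellInto(byte[] cell);
    }

    /** Hypothetical stand-in for a mutable MSLAB implementation that really copies. */
    static final class CopyingLab implements Lab {
        final List<byte[]> copies = new ArrayList<>();

        @Override
        public byte[] forceCopyOfBigCellInto(byte[] cell) {
            byte[] copy = cell.clone();
            copies.add(copy);
            return copy;
        }
    }

    /** Hypothetical stand-in for the immutable wrapper around a list of labs. */
    static final class ImmutableLab implements Lab {
        private final List<Lab> mslabs;

        ImmutableLab(List<Lab> mslabs) {
            this.mslabs = mslabs;
        }

        @Override
        public byte[] forceCopyOfBigCellInto(byte[] cell) {
            // Old behaviour (per the description): throw IllegalStateException.
            // New behaviour: delegate the copy to the first lab in the list.
            return mslabs.get(0).forceCopyOfBigCellInto(cell);
        }
    }
}
```

Only the first lab in the list receives the copy; the remaining labs are left untouched.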



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363587#comment-16363587
 ] 

Hadoop QA commented on HBASE-19116:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 9s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| +1 | mvninstall | 3m 30s | branch-2 passed |
| +1 | compile | 0m 41s | branch-2 passed |
| +1 | checkstyle | 1m 4s | branch-2 passed |
| +1 | shadedjars | 4m 59s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 28s | branch-2 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 24s | the patch passed |
| +1 | compile | 0m 42s | the patch passed |
| +1 | javac | 0m 42s | the patch passed |
| -1 | checkstyle | 1m 5s | hbase-server: The patch generated 1 new + 17 unchanged - 3 fixed = 18 total (was 20) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 4s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 14m 24s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | javadoc | 0m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 108m 17s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 24s | The patch does not generate ASF License warnings. |
| | | 138m 45s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19116 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910505/HBASE-19116.branch-2.004.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 5f273f20e126 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 1f3c131371 |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11517/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11517/testReport/ |
| Max. process+thread count | 5012 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11517/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |


This

[jira] [Updated] (HBASE-19863) java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter is used

2018-02-13 Thread Sergey Soldatov (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Soldatov updated HBASE-19863:

Attachment: HBASE-19863-branch-2.patch

> java.lang.IllegalStateException: isDelete failed when SingleColumnValueFilter 
> is used
> -
>
> Key: HBASE-19863
> URL: https://issues.apache.org/jira/browse/HBASE-19863
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 1.4.1
>Reporter: Sergey Soldatov
>Assignee: Sergey Soldatov
>Priority: Major
> Attachments: HBASE-19863-branch-2.patch, HBASE-19863-branch1.patch, 
> HBASE-19863-test.patch
>
>
> Under some circumstances, a scan with SingleColumnValueFilter may fail with 
> an exception:
> {noformat} 
> java.lang.IllegalStateException: isDelete failed: deleteBuffer=C3, qualifier=C2, timestamp=1516433595543, comparison result: 1 
>   at org.apache.hadoop.hbase.regionserver.ScanDeleteTracker.isDeleted(ScanDeleteTracker.java:149)
>   at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:386)
>   at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:545)
>   at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5876)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:6027)
>   at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5814)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2552)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32385)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2150)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:187)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:167)
> {noformat}
> Conditions:
> Table T has a single column family 0 that uses a ROWCOL bloom filter 
> (important) and column qualifiers C1, C2, C3, C4, C5. 
> When we fill the table, we put a deleted cell for C3 in every row.
> The table has a single region with two HStore:
> A: start row: 0, stop row: 99 
> B: start row: 10, stop row: 99
> B has newer versions of rows 10-99. Store files have several blocks each 
> (important). 
> Store A is the result of major compaction,  so it doesn't have any deleted 
> cells (important).
> So, we are running a scan like:
> {noformat}
> scan 'T', { COLUMNS => ['0:C3','0:C5'], FILTER => "SingleColumnValueFilter 
> ('0','C5',=,'binary:whatever')"}
> {noformat}  
> How the scan performs:
> First, we iterate A for rows 0 and 1 without any problems. 
> Next, we start to iterate A for row 10, so we read the first cell and set the 
> hfs scanner to A: 10:0/C1/0/Put/x. We then find that B has a newer version of 
> the cell, 10:0/C1/1/Put/x, so we make B our current store scanner. Since we 
> are looking for the particular columns C3 and C5, we perform the optimization 
> StoreScanner.seekOrSkipToNextColumn, which runs reseek for all store scanners.
> For store A, the following magic happens in requestSeek:
>   1. The bloom filter check passesGeneralBloomFilter sets haveToSeek to 
> false because row 10 doesn't have the C3 qualifier in store A.  
>   2. Since we don't have to seek, we just create a fake row 
> 10:0/C3/OLDEST_TIMESTAMP/Maximum, an optimization that is quite important for 
> us and is commented with:
> {noformat}
> // Multi-column Bloom filter optimization.
> // Create a fake key/value, so that this scanner only bubbles up to the top
> // of the KeyValueHeap in StoreScanner after we scanned this row/column in
> // all other store files. The query matcher will then just skip this fake
> // key/value and the store scanner will progress to the next column. This
> // is obviously not a "real real" seek, but unlike the fake KV earlier in
> // this method, we want this to be propagated to ScanQueryMatcher.
> {noformat}
> 
> For store B we set it to the fake 10:0/C3/createFirstOnRowColTS()/Maximum 
> to skip C3 entirely. 
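
The effect of the fake key for store A can be illustrated with a small, self-contained model. This is only a sketch under simplifying assumptions: the `Key` class is a hypothetical stand-in for an HBase cell key, the heap is a plain `PriorityQueue` rather than the real KeyValueHeap, and the ordering is reduced to row ascending, qualifier ascending, timestamp descending. It shows why a key carrying the oldest possible timestamp only reaches the top of the heap after every real cell of that row/column, and before the next column.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class FakeKeyOrderingSketch {
    /** Hypothetical, simplified stand-in for a cell key (not the HBase Cell type). */
    static final class Key {
        final String row;
        final String qualifier;
        final long timestamp;
        final String label;

        Key(String row, String qualifier, long timestamp, String label) {
            this.row = row;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.label = label;
        }
    }

    /** Row ascending, qualifier ascending, timestamp descending (newest first). */
    static final Comparator<Key> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) {
            return c;
        }
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.timestamp, a.timestamp);
    };

    /** Returns the labels in the order a heap sorted by ORDER would hand them out. */
    static List<String> pollOrder() {
        PriorityQueue<Key> heap = new PriorityQueue<>(ORDER);
        // Store A: fake key with the oldest possible timestamp for row 10, column C3.
        heap.add(new Key("10", "C3", Long.MIN_VALUE, "A:C3(fake)"));
        // Store B: a real cell for the same row/column, and one for the next column.
        heap.add(new Key("10", "C3", 1L, "B:C3"));
        heap.add(new Key("10", "C4", 1L, "B:C4"));

        List<String> order = new ArrayList<>();
        while (!heap.isEmpty()) {
            order.add(heap.poll().label);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(pollOrder());
    }
}
```

In this model the fake key for store A sorts behind the real C3 cell from store B but ahead of C4, which matches the intent stated in the quoted code comment.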
> After that we start searching for qualifier C5 using seekOrSkipToNextColumn, 
> which first runs trySkipToNextColumn:
> {noformat}
>   protected boolean trySkipToNextColumn(Cell cell) throws IOException {
> Cell nextCell = null;
> do {
>   Cell nextIndexedKey = getNextIndexedKey();
>   if (nextIndexedKey != null && nextIndexedKey != 
> KeyValueScanner.NO_NEXT_INDEXED_KEY
>

[jira] [Updated] (HBASE-19998) Flakey TestVisibilityLabelsWithDefaultVisLabelService

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19998:
--
Attachment: hbase-19988.master.001.patch

> Flakey TestVisibilityLabelsWithDefaultVisLabelService
> -
>
> Key: HBASE-19998
> URL: https://issues.apache.org/jira/browse/HBASE-19998
> Project: HBase
>  Issue Type: Bug
>  Components: flakey, test
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch
>
>
> This is a good one. It's a timeout, and though it has lots of test methods, the 
> problem is that one of them gets stuck. The test method kills a RegionServer 
> then starts a new one. Usually all works out fine, but the odd time there is 
> this unexplained MOVE that gets interjected just as ServerCrashProcedure starts 
> up. hbase:meta gets stuck (perhaps this is what is being referred to here: 
> https://issues.apache.org/jira/browse/HBASE-19929?focusedCommentId=16356906&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16356906).
>  It is trying to run the MOVE by first unassigning from the server that has 
> just crashed. It never succeeds. Need to fix this. Need to figure out where 
> these Move operations are coming from too. Let me add some debug. 
> See here how we are well into ServerCrashProcedure... and then two MOVEs 
> cut-in... for hbase:meta and for namespace:
> {code}
> 
> 2018-02-14 02:35:19,806 DEBUG [PEWorker-6] 
> procedure.ServerCrashProcedure(192): pid=10, 
> state=RUNNABLE:SERVER_CRASH_PROCESS_META; ServerCrashProcedure 
> server=asf903.gq1.ygridcore.net,59608,1518575711969, splitWal=true, 
> meta=true; Processing hbase:meta that was on 
> asf903.gq1.ygridcore.net,59608,1518575711969
> 2018-02-14 02:35:19,807 INFO  [PEWorker-6] 
> procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=12, 
> ppid=10, state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure 
> failedMetaServer=asf903.gq1.ygridcore.net,59608,1518575711969, splitWal=true}]
> 2018-02-14 02:35:19,811 DEBUG [Thread-214] procedure2.ProcedureExecutor(868): 
> Stored pid=11, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=hbase:meta,,1.1588230740, 
> source=asf903.gq1.ygridcore.net,59608,1518575711969, destination=
> 2018-02-14 02:35:19,813 INFO  [PEWorker-8] 
> procedure.MasterProcedureScheduler(813): pid=11, 
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=hbase:meta,,1.1588230740, 
> source=asf903.gq1.ygridcore.net,59608,1518575711969, destination= hbase:meta 
> hbase:meta,,1.1588230740
> 2018-02-14 02:35:19,814 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=14, 
> ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
> table=hbase:meta, region=1588230740, 
> server=asf903.gq1.ygridcore.net,59608,1518575711969}]
> 2018-02-14 02:35:19,831 DEBUG [Thread-214] procedure2.ProcedureExecutor(868): 
> Stored pid=13, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15., 
> source=asf903.gq1.ygridcore.net,59608,1518575711969, destination=
> 2018-02-14 02:35:19,833 INFO  [PEWorker-10] 
> procedure.MasterProcedureScheduler(813): pid=13, 
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15., 
> source=asf903.gq1.ygridcore.net,59608,1518575711969, destination= 
> hbase:namespace 
> hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15.
> 2018-02-14 02:35:19,837 INFO  [PEWorker-10] 
> procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=15, 
> ppid=13, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
> table=hbase:namespace, region=e52a160b3f3a57ab50d710eba62d9b15, 
> server=asf903.gq1.ygridcore.net,59608,1518575711969}]
> 
> {code}
> Here is the failure of the unassign:
> {code}
> 2018-02-14 02:35:19,944 WARN  [PEWorker-8] 
> assignment.RegionTransitionProcedure(187): Remote call failed pid=14, 
> ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
> table=hbase:meta, region=1588230740, 
> server=asf903.gq1.ygridcore.net,59608,1518575711969; rit=CLOSING, 
> location=asf903.gq1.ygridcore.net,59608,1518575711969; exception=pid=14, 
> ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
> table=hbase:meta, region=1588230740, 
> server=asf903.gq1.ygridcore.net,59608,1518575711969 to 
> asf903.gq1.ygridcore.net,59608,1518575711969
> 2018-02-14 02:35:19,945 WARN  [PEWorker-8] assignment.UnassignProcedure(245): 
> Expiring server pid=14, ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
> UnassignProcedure table=hbase:meta, region=1588230740, 
>

[jira] [Commented] (HBASE-19998) Flakey TestVisibilityLabelsWithDefaultVisLabelService

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363550#comment-16363550
 ] 

stack commented on HBASE-19998:
---

.001 is some debug I pushed to master and branch-2.


[jira] [Created] (HBASE-19998) Flakey TestVisibilityLabelsWithDefaultVisLabelService

2018-02-13 Thread stack (JIRA)

stack created HBASE-19998:
-

 Summary: Flakey TestVisibilityLabelsWithDefaultVisLabelService
 Key: HBASE-19998
 URL: https://issues.apache.org/jira/browse/HBASE-19998
 Project: HBase
  Issue Type: Bug
  Components: flakey, test
Reporter: stack
Assignee: stack
 Fix For: 2.0.0-beta-2


This is a good one. It's a timeout, and though it has lots of test methods, the 
problem is that one of them gets stuck. The test method kills a RegionServer 
then starts a new one. Usually all works out fine, but the odd time there is 
this unexplained MOVE that gets interjected just as ServerCrashProcedure starts 
up. hbase:meta gets stuck (perhaps this is what is being referred to here: 
https://issues.apache.org/jira/browse/HBASE-19929?focusedCommentId=16356906&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16356906).
 It is trying to run the MOVE by first unassigning from the server that has 
just crashed. It never succeeds. Need to fix this. Need to figure out where 
these Move operations are coming from too. Let me add some debug. 

See here how we are well into ServerCrashProcedure... and then two MOVEs 
cut-in... for hbase:meta and for namespace:

{code}

2018-02-14 02:35:19,806 DEBUG [PEWorker-6] procedure.ServerCrashProcedure(192): 
pid=10, state=RUNNABLE:SERVER_CRASH_PROCESS_META; ServerCrashProcedure 
server=asf903.gq1.ygridcore.net,59608,1518575711969, splitWal=true, meta=true; 
Processing hbase:meta that was on asf903.gq1.ygridcore.net,59608,1518575711969
2018-02-14 02:35:19,807 INFO  [PEWorker-6] procedure2.ProcedureExecutor(1498): 
Initialized subprocedures=[{pid=12, ppid=10, 
state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure 
failedMetaServer=asf903.gq1.ygridcore.net,59608,1518575711969, splitWal=true}]
2018-02-14 02:35:19,811 DEBUG [Thread-214] procedure2.ProcedureExecutor(868): 
Stored pid=11, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
hri=hbase:meta,,1.1588230740, 
source=asf903.gq1.ygridcore.net,59608,1518575711969, destination=
2018-02-14 02:35:19,813 INFO  [PEWorker-8] 
procedure.MasterProcedureScheduler(813): pid=11, 
state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
hri=hbase:meta,,1.1588230740, 
source=asf903.gq1.ygridcore.net,59608,1518575711969, destination= hbase:meta 
hbase:meta,,1.1588230740
2018-02-14 02:35:19,814 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1498): 
Initialized subprocedures=[{pid=14, ppid=11, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=hbase:meta, 
region=1588230740, server=asf903.gq1.ygridcore.net,59608,1518575711969}]
2018-02-14 02:35:19,831 DEBUG [Thread-214] procedure2.ProcedureExecutor(868): 
Stored pid=13, state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
hri=hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15., 
source=asf903.gq1.ygridcore.net,59608,1518575711969, destination=
2018-02-14 02:35:19,833 INFO  [PEWorker-10] 
procedure.MasterProcedureScheduler(813): pid=13, 
state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
hri=hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15., 
source=asf903.gq1.ygridcore.net,59608,1518575711969, destination= 
hbase:namespace hbase:namespace,,1518575716296.e52a160b3f3a57ab50d710eba62d9b15.
2018-02-14 02:35:19,837 INFO  [PEWorker-10] procedure2.ProcedureExecutor(1498): 
Initialized subprocedures=[{pid=15, ppid=13, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
table=hbase:namespace, region=e52a160b3f3a57ab50d710eba62d9b15, 
server=asf903.gq1.ygridcore.net,59608,1518575711969}]

{code}

Here is the failure of the unassign:

{code}
2018-02-14 02:35:19,944 WARN  [PEWorker-8] 
assignment.RegionTransitionProcedure(187): Remote call failed pid=14, ppid=11, 
state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure table=hbase:meta, 
region=1588230740, server=asf903.gq1.ygridcore.net,59608,1518575711969; 
rit=CLOSING, location=asf903.gq1.ygridcore.net,59608,1518575711969; 
exception=pid=14, ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
UnassignProcedure table=hbase:meta, region=1588230740, 
server=asf903.gq1.ygridcore.net,59608,1518575711969 to 
asf903.gq1.ygridcore.net,59608,1518575711969
2018-02-14 02:35:19,945 WARN  [PEWorker-8] assignment.UnassignProcedure(245): 
Expiring server pid=14, ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; 
UnassignProcedure table=hbase:meta, region=1588230740, 
server=asf903.gq1.ygridcore.net,59608,1518575711969; rit=CLOSING, 
location=asf903.gq1.ygridcore.net,59608,1518575711969, 
exception=org.apache.hadoop.hbase.master.assignment.FailedRemoteDispatchException:
 pid=14, ppid=11, state=RUNNABLE:REGION_TRANSITION_DISPATCH; UnassignProcedure 
table=hbase:meta, region=1588230740, 
server=asf903.gq1.ygridcore.net,59608,1518575711969 to 
asf903.gq1.ygridcore.net,59608,1518575711969
2018-02-14 02:35:19,945 WARN  [PEWorker-8]

[jira] [Commented] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363542#comment-16363542
 ] 

Pankaj Kumar commented on HBASE-19979:
--

Thanks Stack..!! 

> ReplicationSyncUp tool may leak Zookeeper connection
> 
>
> Key: HBASE-19979
> URL: https://issues.apache.org/jira/browse/HBASE-19979
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 1.3.2, 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19979-branch-1.001.patch, HBASE-19979.patch
>
>
> ReplicationSyncUp tool may leak Zookeeper connection in the following code 
> snippet,
> {code}
> try {
>   int numberOfOldSource = 1; // default wait once
>   while (numberOfOldSource > 0) {
>     Thread.sleep(SLEEP_TIME);
>     numberOfOldSource = manager.getOldSources().size();
>   }
> } catch (InterruptedException e) {
>   System.err.println("didn't wait long enough:" + e);
>   return (-1);
> }
> manager.join();
> zkw.close();
> {code}
> ZooKeeperWatcher will not be closed in case of InterruptedException.
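
One common way to avoid this kind of leak is to close the watcher in a `finally` block, so the early return on interruption still releases it. The sketch below is a minimal illustration, not the committed patch: `FakeWatcher` and `SourceManager` are hypothetical stand-ins for the real ZooKeeperWatcher and replication source manager, and `waitForOldSources` is an invented method name.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SyncUpCloseSketch {
    /** Hypothetical stand-in for the ZooKeeperWatcher held by the tool. */
    static final class FakeWatcher {
        final AtomicBoolean closed = new AtomicBoolean(false);

        void close() {
            closed.set(true);
        }
    }

    /** Hypothetical stand-in for the replication source manager. */
    interface SourceManager {
        int oldSourceCount();

        void join();
    }

    /** Waits for old sources to drain; the watcher is closed on every exit path. */
    static int waitForOldSources(SourceManager manager, FakeWatcher zkw, long sleepMillis) {
        try {
            int numberOfOldSource = 1; // default wait once
            while (numberOfOldSource > 0) {
                Thread.sleep(sleepMillis);
                numberOfOldSource = manager.oldSourceCount();
            }
            manager.join();
            return 0;
        } catch (InterruptedException e) {
            System.err.println("didn't wait long enough:" + e);
            return -1;
        } finally {
            // Runs on normal completion and on the interrupted early return alike.
            zkw.close();
        }
    }
}
```

With this shape, the interrupted path still returns -1 as in the original snippet, but the watcher is no longer left open.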




[jira] [Commented] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363525#comment-16363525
 ] 

Hudson commented on HBASE-19979:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #351 (See 
[https://builds.apache.org/job/HBase-1.3-IT/351/])
HBASE-19979 ReplicationSyncUp tool may leak Zookeeper connection (stack: rev 
0507413fe61d5a17229817e2d56d7603d037bde8)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java






[jira] [Commented] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363514#comment-16363514
 ] 

stack commented on HBASE-19979:
---

Pushed to branch-1.3 too.





[jira] [Updated] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19979:
--
Fix Version/s: 1.3.2





[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363502#comment-16363502
 ] 

stack commented on HBASE-19116:
---

.004 forgot to update test.

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch, HBASE-19116.branch-2.003.patch, 
> HBASE-19116.branch-2.004.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail 
> unless we really have to write one (and it sounds like we could do without 
> writing an incompatible tail), to see if we can avoid requiring operators to 
> go to the latest branch-1 (we may end up needing this, but let's have a really 
> good reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...
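
The idea in the description, writing a tail that hbase1 can still resolve, can be sketched as a pair of name mappings: the writer stores the hbase1 spelling of the comparator, and an hbase2 reader maps it back. This is only an illustration and not the attached patch; the class-name strings and both helper methods are assumptions made for the example.

```java
final class TrailerComparatorNames {
  // Assumed class names, for illustration only; check them against the branches involved.
  static final String HBASE2_DEFAULT = "org.apache.hadoop.hbase.CellComparatorImpl";
  static final String HBASE2_META =
      "org.apache.hadoop.hbase.CellComparatorImpl$MetaCellComparator";
  static final String HBASE1_DEFAULT = "org.apache.hadoop.hbase.KeyValue$KVComparator";
  static final String HBASE1_META = "org.apache.hadoop.hbase.KeyValue$MetaComparator";

  // Name to store in the file tail: the hbase1 spelling when one exists.
  static String nameForTrailer(String comparatorClassName) {
    if (HBASE2_DEFAULT.equals(comparatorClassName)) {
      return HBASE1_DEFAULT;
    }
    if (HBASE2_META.equals(comparatorClassName)) {
      return HBASE1_META;
    }
    return comparatorClassName; // unknown comparators are written unchanged
  }

  // Name an hbase2 reader would resolve a stored tail entry to.
  static String nameFromTrailer(String trailerName) {
    if (HBASE1_DEFAULT.equals(trailerName)) {
      return HBASE2_DEFAULT;
    }
    if (HBASE1_META.equals(trailerName)) {
      return HBASE2_META;
    }
    return trailerName;
  }

  public static void main(String[] args) {
    String written = nameForTrailer(HBASE2_DEFAULT);
    System.out.println("tail holds:   " + written);
    System.out.println("hbase2 loads: " + nameFromTrailer(written));
  }
}
```

With a mapping like this the translation burden sits entirely on the newer side, which is what lets operators skip an intermediate branch-1 upgrade.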




[jira] [Commented] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363497#comment-16363497
 ] 

Pankaj Kumar commented on HBASE-19979:
--

Thanks everyone for reviewing and committing this fix. 

Can we have this fix in branch-1.3.x as well?

> ReplicationSyncUp tool may leak Zookeeper connection
> 
>
> Key: HBASE-19979
> URL: https://issues.apache.org/jira/browse/HBASE-19979
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19979-branch-1.001.patch, HBASE-19979.patch
>
>
> ReplicationSyncUp tool may leak Zookeeper connection in the following code 
> snippet,
> {code}
> try {
>   int numberOfOldSource = 1; // default wait once
>   while (numberOfOldSource > 0) {
>     Thread.sleep(SLEEP_TIME);
>     numberOfOldSource = manager.getOldSources().size();
>   }
> } catch (InterruptedException e) {
>   System.err.println("didn't wait long enough:" + e);
>   return (-1);
> }
> manager.join();
> zkw.close();
> {code}
> ZooKeeperWatcher will not be closed in case of InterruptedException.




[jira] [Updated] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19116:
--
Attachment: HBASE-19116.branch-2.004.patch

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch, HBASE-19116.branch-2.003.patch, 
> HBASE-19116.branch-2.004.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail 
> unless we really have to write one (and it sounds like we could do without 
> writing an incompatible tail), to see if we can avoid requiring operators to 
> go to the latest branch-1 (we may end up needing this, but let's have a really 
> good reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363494#comment-16363494
 ] 

stack commented on HBASE-19965:
---

Pushed second addendum that breaks a TestAsyncTableAdminAPI3 out of 
TestAsyncTableAdminAPI.

> Fix flaky TestAsyncRegionAdminApi
> -
>
> Key: HBASE-19965
> URL: https://issues.apache.org/jira/browse/HBASE-19965
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19965.branch-2.001.patch
>
>
> See 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/]
>  
> java.lang.AssertionError: expected:<2> but was:<3> at 
> org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359)
>  
> Merging regions does not work: the table still has 3 regions after the 
> MergeRegionsProcedure finished.
> The master starts balancing region 9e2773ba1efba79a2defa276e9a26ed4, but 
> because the MergeRegionsProcedure pid=138 starts work first, the balance has 
> to wait for the lock. After the merge finishes, the MoveRegionProcedure 
> pid=139 starts work and assigns 9e2773ba1efba79a2defa276e9a26ed4 to a new 
> region server. This is not right. The MoveRegionProcedure should skip 
> assigning a region that was marked as offline, or we should clear the merged 
> regions' procedures when the MergeRegionsProcedure finishes.
>  
> Logs:
> 2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1] 
> master.HMaster(1454): balance 
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., 
> source=cd4730e3eae2,39077,1518106776411, 
> destination=cd4730e3eae2,40578,1518106776318
> 2018-02-08 16:24:44,608 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885] 
> procedure2.ProcedureExecutor(868): Stored pid=138, 
> state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure 
> table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 
> 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false
> ..
> 2018-02-08 16:24:50,111 INFO [PEWorker-13] 
> procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS; 
> MergeTableRegionsProcedure table=testMergeRegions, 
> regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], 
> forcibly=false in 5.5710sec
> 2018-02-08 16:24:50,113 INFO [PEWorker-13] 
> procedure.MasterProcedureScheduler(813): pid=139, 
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., 
> source=cd4730e3eae2,39077,1518106776411, 
> destination=cd4730e3eae2,40578,1518106776318 testMergeRegions 
> testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.
>  




[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363491#comment-16363491
 ] 

Hadoop QA commented on HBASE-19116:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 58s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 38s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 41s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  3s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  0s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  3s{color} | {color:red} hbase-server: The patch generated 1 new + 17 unchanged - 3 fixed = 18 total (was 20) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 55s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 36s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 43s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 56s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.io.hfile.TestFixedFileTrailer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19116 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910500/HBASE-19116.branch-2.003.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux ac4a963f1653 3.13.0-133-generic #182-Ubuntu SMP Tue Sep 19 15:49:21 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 4594f7156d |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11516/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/11516/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11516/testReport/ |
| Max. process+thread count | 664 (vs. ulimit of

[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363489#comment-16363489
 ] 

stack commented on HBASE-18294:
---

bq. Globally the decision should be with ||. We have barrier for off heap and 
on heap memory and when any of the barrier is about to be crossed, it will 
result in forced flushes.

That sounds good. I did not get that from reading the release note (Yeah, add 
names of configs to toggle to Release Note).

I'm good w/ this. You [~anoop.hbase] ?




> Reduce global heap pressure: flush based on heap occupancy
> --
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch, HBASE-18294.15.patch, HBASE-18294.16.patch, 
> HBASE-18294.master.01.patch, HBASE-18294.master.01.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold, whereas it should compare the heap size 
> (which includes index size and metadata).
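
The difference between the two comparisons, and the "||" rule for the global barriers mentioned in the comment above, can be shown with a small sketch. It is an illustration under assumptions and not HBase code: every method name is hypothetical, heap size is modeled simply as data size plus an overhead figure, and only the 128MB default comes from the description.

```java
final class FlushDecisionSketch {
  static final long MB = 1024L * 1024L;
  // Default region flush threshold quoted in the description.
  static final long REGION_FLUSH_THRESHOLD = 128 * MB;

  // Current behaviour per the description: only key-value bytes are counted.
  static boolean shouldFlushByDataSize(long dataSize) {
    return dataSize > REGION_FLUSH_THRESHOLD;
  }

  // Proposed behaviour: count index and metadata overhead as well.
  static boolean shouldFlushByHeapSize(long dataSize, long indexAndMetadataSize) {
    return dataSize + indexAndMetadataSize > REGION_FLUSH_THRESHOLD;
  }

  // Global decision: reaching either barrier is enough to force flushes.
  static boolean globalPressure(long onHeapUsed, long onHeapLimit,
                                long offHeapUsed, long offHeapLimit) {
    return onHeapUsed >= onHeapLimit || offHeapUsed >= offHeapLimit;
  }

  public static void main(String[] args) {
    long data = 120 * MB;
    long overhead = 16 * MB; // made-up figure for the example
    System.out.println("by data size: " + shouldFlushByDataSize(data));
    System.out.println("by heap size: " + shouldFlushByHeapSize(data, overhead));
    System.out.println("global:       " + globalPressure(10 * MB, 100 * MB, 100 * MB, 100 * MB));
  }
}
```

In this example the store stays under the threshold when only data bytes are counted, yet it already occupies more than 128MB of heap, which is the gap the issue sets out to close.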




[jira] [Commented] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363488#comment-16363488
 ] 

Hadoop QA commented on HBASE-19996:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-1.4 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 32s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 49s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_171 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 30s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  5s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 41s{color} | {color:green} branch-1.4 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} branch-1.4 passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} branch-1.4 passed with JDK v1.7.0_171 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} the patch passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} the patch passed with JDK v1.7.0_171 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 36s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 40s{color} | {color:green} Patch does not cause any errors with Hadoop 2.4.1 2.5.2 2.6.5 2.7.4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} the patch passed with JDK v1.8.0_162 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 47s{color} | {color:green} the patch passed with JDK v1.7.0_171 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 51s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 98m 18s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m

[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363477#comment-16363477
 ] 

stack commented on HBASE-19116:
---

.003 addresses comments by Anoop up on rb.

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch, HBASE-19116.branch-2.003.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail 
> unless we really have to write one (and it sounds like we could do without 
> writing an incompatible tail), to see if we can avoid requiring operators to 
> go to the latest branch-1 (we may end up needing this, but let's have a really 
> good reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Commented] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363464#comment-16363464
 ] 

stack commented on HBASE-19965:
---

Here is the output for the test that timed out, build 314:

---
Test set: org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
---
Tests run: 30, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 574.271 s <<< 
FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncTableAdminApi
org.apache.hadoop.hbase.client.TestAsyncTableAdminApi  Time elapsed: 8.443 s  
<<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds

org.apache.hadoop.hbase.client.TestAsyncTableAdminApi  Time elapsed: 8.473 s  
<<< ERROR!
java.lang.Exception: Appears to be stuck in thread DataXceiver for client 
DFSClient_NONMAPREDUCE_1381247601_23 at /127.0.0.1:40966 [Receiving block 
BP-1735548202-172.17.0.2-1518565636532:blk_1073741829_1005]

Counting parameterizations, there are about 29 tests. None takes a particularly long time:

{code}
 1 2018-02-13 23:47:40,557 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testCreateTableWithEmptyRowInTheSplitKeys[0] 
Thread=302, OpenFileDescriptor=1612, MaxFileDescriptor=1048576, 
SystemLoadAverage=2007, ProcessCount=17, AvailableMemoryMB=21888
  2 2018-02-13 23:47:40,659 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testDeleteTable[0] Thread=305, 
OpenFileDescriptor=1614, MaxFileDescriptor=1048576, SystemLoadAverage=1982, 
ProcessCount=17, AvailableMemoryMB=21890
  3 2018-02-13 23:47:51,503 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testDisableAndEnableTables[0] Thread=320, 
OpenFileDescriptor=1616, MaxFileDescriptor=1048576, SystemLoadAverage=1827, 
ProcessCount=17, AvailableMemoryMB=21940
  4 2018-02-13 23:48:32,308 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testCreateTable[0] Thread=344, 
OpenFileDescriptor=1589, MaxFileDescriptor=1048576, SystemLoadAverage=1798, 
ProcessCount=17, AvailableMemoryMB=22132
  5 2018-02-13 23:48:41,898 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testCreateTableWithRegions[0] Thread=344, 
OpenFileDescriptor=1586, MaxFileDescriptor=1048576, SystemLoadAverage=1867, 
ProcessCount=17, AvailableMemoryMB=21816
  6 2018-02-13 23:49:23,348 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testIsTableEnabledAndDisabled[0] Thread=435, 
OpenFileDescriptor=1571, MaxFileDescriptor=1048576, SystemLoadAverage=1732, 
ProcessCount=17, AvailableMemoryMB=21124
  7 2018-02-13 23:49:36,012 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testListTables[0] Thread=441, 
OpenFileDescriptor=1577, MaxFileDescriptor=1048576, SystemLoadAverage=1809, 
ProcessCount=17, AvailableMemoryMB=20713
  8 2018-02-13 23:50:08,285 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testTruncateTablePreservingSplits[0] Thread=375, 
OpenFileDescriptor=1567, MaxFileDescriptor=1048576, SystemLoadAverage=1902, 
ProcessCount=17, AvailableMemoryMB=20604
  9 2018-02-13 23:50:26,969 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testCreateTableNumberOfRegions[0] Thread=373, 
OpenFileDescriptor=1590, MaxFileDescriptor=1048576, SystemLoadAverage=1889, 
ProcessCount=17, AvailableMemoryMB=20979
 10 2018-02-13 23:51:18,514 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testDisableAndEnableTable[0] Thread=439, 
OpenFileDescriptor=1569, MaxFileDescriptor=1048576, SystemLoadAverage=1876, 
ProcessCount=17, AvailableMemoryMB=20712
 11 2018-02-13 23:51:40,787 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testEnableTableRetainAssignment[0] Thread=430, 
OpenFileDescriptor=1575, MaxFileDescriptor=1048576, SystemLoadAverage=1754, 
ProcessCount=17, AvailableMemoryMB=20766
 12 2018-02-13 23:51:59,198 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testTruncateTable[0] Thread=473, 
OpenFileDescriptor=1578, MaxFileDescriptor=1048576, SystemLoadAverage=1826, 
ProcessCount=17, AvailableMemoryMB=21183
 13 2018-02-13 23:52:17,821 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before: 
client.TestAsyncTableAdminApi#testGetTableDescriptor[0] Thread=446, 
OpenFileDescriptor=1576, MaxFileDescriptor=1048576, SystemLoadAverage=1827, 
ProcessCount=17, AvailableMemoryMB=21732
 14 2018-02-13 23:52:27,280 INFO  [Time-limited test] 
hbase.ResourceChecker(148): before:

[jira] [Updated] (HBASE-19997) [rolling upgrade] 1.x => 2.x

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19997:
--
Fix Version/s: (was: 2.0.0)
   2.0.0-beta-2

> [rolling upgrade] 1.x => 2.x
> 
>
> Key: HBASE-19997
> URL: https://issues.apache.org/jira/browse/HBASE-19997
> Project: HBase
>  Issue Type: Umbrella
>Reporter: stack
>Priority: Major
> Fix For: 2.0.0-beta-2
>
>
> An umbrella issue of issues needed so folks can do a rolling upgrade from 
> hbase-1.x to hbase-2.x.
> (Recent) Notables:
>  * hbase-1.x can't read hbase-2.x WALs -- hbase-1.x doesn't know the 
> AsyncProtobufLogWriter class used writing the WAL -- see 
> https://issues.apache.org/jira/browse/HBASE-19166?focusedCommentId=16362897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16362897
>  for exception.
>  ** Might be ok... means WAL split fails on an hbase1 RS... must wait till an 
> hbase-2.x RS picks up the WAL for it to be split.
>  * hbase-1 can't open regions from tables created by hbase-2; it can't find 
> the Table descriptor. See 
> https://issues.apache.org/jira/browse/HBASE-19116?focusedCommentId=16363276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16363276
>  ** This might be ok if the tables we are doing rolling upgrade over were 
> written with hbase-1.




[jira] [Created] (HBASE-19997) [rolling upgrade] 1.x => 2.x

2018-02-13 Thread stack (JIRA)

stack created HBASE-19997:
-

 Summary: [rolling upgrade] 1.x => 2.x
 Key: HBASE-19997
 URL: https://issues.apache.org/jira/browse/HBASE-19997
 Project: HBase
  Issue Type: Umbrella
Reporter: stack
 Fix For: 2.0.0


An umbrella issue of issues needed so folks can do a rolling upgrade from 
hbase-1.x to hbase-2.x.

(Recent) Notables:
 * hbase-1.x can't read hbase-2.x WALs -- hbase-1.x doesn't know the 
AsyncProtobufLogWriter class used writing the WAL -- see 
https://issues.apache.org/jira/browse/HBASE-19166?focusedCommentId=16362897=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16362897
 for exception.
 ** Might be ok... means WAL split fails on an hbase1 RS... must wait till an 
hbase-2.x RS picks up the WAL for it to be split.
 * hbase-1 can't open regions from tables created by hbase-2; it can't find the 
Table descriptor. See 
https://issues.apache.org/jira/browse/HBASE-19116?focusedCommentId=16363276=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16363276
 ** This might be ok if the tables we are doing rolling upgrade over were 
written with hbase-1.






[jira] [Commented] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363458#comment-16363458
 ] 

Anoop Sam John commented on HBASE-19166:


Got it. It would be sweet if the failure and reassignment did not happen. Maybe 
there will be other changes that won't allow the split to be done by a 1.x RS 
even if we solve this writer/reader name issue?

> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2
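
One way to picture the "translation" idea, as a sketch only and not the actual patch: keep a small map from relocated class names to their current names and consult it before resolving a class name read from a 1.x WAL. The two fully qualified names come from the stack trace above; the class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of class-name translation; not the HBase implementation.
class WalClassNameTranslator {
    private static final Map<String, String> RELOCATED = new HashMap<>();

    static {
        // Old (hbase-1.x) location -> new (hbase-2) location, per the exception above.
        RELOCATED.put("org.apache.hadoop.hbase.regionserver.wal.WALEdit",
                      "org.apache.hadoop.hbase.wal.WALEdit");
    }

    // Returns the current name for a relocated class; other names pass through unchanged.
    static String translate(String className) {
        return RELOCATED.getOrDefault(className, className);
    }

    public static void main(String[] args) {
        System.out.println(translate("org.apache.hadoop.hbase.regionserver.wal.WALEdit"));
    }
}
```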




[jira] [Updated] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19116:
--
Attachment: HBASE-19116.branch-2.003.patch

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch, HBASE-19116.branch-2.003.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail 
> unless we really have to write one (and it sounds like we could do without 
> writing an incompatible tail), to see if we can avoid requiring operators to 
> go to the latest branch-1 (we may end up needing this, but let's have a really 
> good reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...
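
A minimal sketch of the compatibility idea in the description, under the assumption that the tail stores only a comparator name: write the name older readers already know, and let newer readers map it back. The short names `KeyValueComparator` and `CellComparator` are taken from the description; the class and method names are hypothetical and this is not the patch itself.

```java
// Hypothetical sketch: keep the on-disk comparator name readable by older versions.
class ComparatorNameCompat {
    static final String LEGACY_NAME = "KeyValueComparator";
    static final String CURRENT_NAME = "CellComparator";

    // Name written to the tail: the legacy spelling, so an older reader can resolve it.
    static String nameForTail(String comparatorName) {
        return CURRENT_NAME.equals(comparatorName) ? LEGACY_NAME : comparatorName;
    }

    // Name resolved on read by a newer reader: the legacy spelling maps to the current one.
    static String resolveOnRead(String nameInTail) {
        return LEGACY_NAME.equals(nameInTail) ? CURRENT_NAME : nameInTail;
    }

    public static void main(String[] args) {
        String onDisk = nameForTail(CURRENT_NAME);
        System.out.println(onDisk + " -> " + resolveOnRead(onDisk));
    }
}
```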




[jira] [Commented] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363456#comment-16363456
 ] 

stack commented on HBASE-19166:
---

bq.  When the cluster is a mix of HBase 1 and 2 RSs (upgrade in progress) and 
one 2.0 RS crashed and the WAL split is being done by a 1.x server? Am I missing 
anything?

You are not missing anything. My thought is Master will put up the WAL for 
splitting, the hbase1 RS will grab it and try to split, fail because it is 
hbase2... and this will go on until a hbase2 RS grabs the WAL. Meantime, we'll 
be adding more RS. I think that will work. We need to spend time on it.

> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2




[jira] [Commented] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363452#comment-16363452
 ] 

Anoop Sam John commented on HBASE-19166:


bq.On an hbase1 splitting hbase2 logs and failing as per the above, that might 
be ok;
That should be an issue, no?  When the cluster is a mix of HBase 1 and 2 RSs 
(upgrade in progress) and one 2.0 RS crashed and the WAL split is being done 
by a 1.x server?

> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2




[jira] [Comment Edited] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363452#comment-16363452
 ] 

Anoop Sam John edited comment on HBASE-19166 at 2/14/18 3:56 AM:
-

bq.On an hbase1 splitting hbase2 logs and failing as per the above, that might 
be ok;
That should be an issue, no?  When the cluster is a mix of HBase 1 and 2 RSs 
(upgrade in progress) and one 2.0 RS crashed and the WAL split is being done 
by a 1.x server?  Am I missing anything?


was (Author: anoop.hbase):
bq.On an hbase1 splitting hbase2 logs and failing as per the above, that might 
be ok;
That should be an issue, no?  When the cluster is a mix of HBase 1 and 2 RSs 
(upgrade in progress) and one 2.0 RS crashed and the WAL split is being done 
by a 1.x server?

> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2




[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-13 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363444#comment-16363444
 ] 

Anoop Sam John commented on HBASE-18294:


bq.If the offheap is 100x the onheap in size, and the threshold is set to 
offheap (100x) + onheap (1x) – i.e. 101x – then what happens when the onheap 
occupancy exceeds 1x?
This is about the per-region flush decision, boss.  Correct me if I'm wrong, 
[~eshcar].  Globally the decision should be made with || (logical OR).  We have 
barriers for off-heap and on-heap memory, and when either barrier is about to 
be crossed, it will result in forced flushes.
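
To make the "||" point concrete, a small sketch under stated assumptions: the global check looks at on-heap and off-heap occupancy separately, each against its own barrier, and crossing either one forces flushes. The class name, method name, and the numbers are illustrative, not the HBase code.

```java
// Illustrative sketch of a global flush decision made with logical OR.
class GlobalFlushBarrier {
    // Returns true when either memory area has reached its own barrier.
    static boolean shouldForceFlush(long onHeapUsed, long onHeapBarrier,
                                    long offHeapUsed, long offHeapBarrier) {
        return onHeapUsed >= onHeapBarrier || offHeapUsed >= offHeapBarrier;
    }

    public static void main(String[] args) {
        // Off-heap is far below its (100x larger) barrier, but on-heap has crossed its own.
        System.out.println(shouldForceFlush(120, 100, 1_000, 10_000));
    }
}
```

With a single combined threshold of 101x, the on-heap overshoot in this example would go unnoticed; with separate barriers combined by OR it does not.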

> Reduce global heap pressure: flush based on heap occupancy
> --
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch, HBASE-18294.15.patch, HBASE-18294.16.patch, 
> HBASE-18294.master.01.patch, HBASE-18294.master.01.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).




[jira] [Commented] (HBASE-19852) HBase Thrift 1 server SPNEGO Improvements

2018-02-13 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363441#comment-16363441
 ] 

Kevin Risden commented on HBASE-19852:
--

Thanks for the pointers [~carp84]. I've made some good progress on tests for 
this. I should have a patch up soon.

> HBase Thrift 1 server SPNEGO Improvements
> -
>
> Key: HBASE-19852
> URL: https://issues.apache.org/jira/browse/HBASE-19852
> Project: HBase
>  Issue Type: Improvement
>  Components: Thrift
>Reporter: Kevin Risden
>Assignee: Kevin Risden
>Priority: Major
> Attachments: HBASE-19852.master.001.patch
>
>
> HBase Thrift1 server has some issues when trying to use SPNEGO.
> From mailing list:
> http://mail-archives.apache.org/mod_mbox/hbase-user/201801.mbox/%3CCAJU9nmh5YtZ%2BmAQSLo91yKm8pRVzAPNLBU9vdVMCcxHRtRqgoA%40mail.gmail.com%3E
> {quote}While setting up the HBase Thrift server with HTTP, there were a
> significant amount of 401 errors where the HBase Thrift wasn't able to
> handle the incoming Kerberos request. Documentation online is sparse when
> it comes to setting up the principal/keytab for HTTP Kerberos.
> I noticed that the HBase Thrift HTTP implementation was missing SPNEGO
> principal/keytab like other Thrift based servers (HiveServer2). It looks
> like HiveServer2 Thrift implementation and HBase Thrift v1 implementation
> were very close to the same at one point. I made the following changes to
> HBase Thrift v1 server implementation to make it work:
> * add SPNEGO principal/keytab if in HTTP mode
> * return 401 immediately if no authorization header instead of waiting for
> try/catch down in program flow{quote}




[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363432#comment-16363432
 ] 

Hadoop QA commented on HBASE-19116:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 43s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 44s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  6s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  6s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  6s{color} | {color:red} hbase-server: The patch generated 1 new + 17 unchanged - 3 fixed = 18 total (was 20) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 11s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 17m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}126m 34s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}160m 32s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19116 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910477/HBASE-19116.branch-2.002.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2be1222e5016 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 4594f7156d |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11513/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11513/testReport/ |
| Max. process+thread count | 5026 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11513/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.

[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19996:
---
Attachment: HBASE-19996.branch-1.4.001.patch

> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
> Attachments: HBASE-19996.branch-1.4.001.patch, 
> HBASE-19996.branch-1.4.001.patch
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might not remove some valid procs too. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2. Thanks to [~toffer] for flagging this 
> internally.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Appy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363378#comment-16363378
 ] 

Appy commented on HBASE-19988:
--

Sorry, I don't have time to dig in and come up with a better understanding of 
handling InterruptedException when processing requests.
In this case, since IE was already being converted to IIOE, any other 
operation would have been handling it like an IOException, which means 
cancelling the operation. Going by that logic, and status quo bias (it's 
already IIOE), I think it might be fine to do this.

However, I think it'll be better to handle it as part of IOException by doing 
{code}
if (isAtomic() || ioe instanceof InterruptedIOException) { throw ioe; }
{code}
because it'll log a good warning.
Maybe move TimeoutIOException there too.

Currently the comment says "// We will retry when other exceptions, but we 
should stop if we timeout ."
It should be updated with the reasons why we break out for each type. Let's not 
leave things in more disarray for future onlookers (why these two? why not 
others? etc.). They shouldn't have to spend the time we already did, else our 
effort is wasted.
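
For illustration, a sketch of what that suggestion could look like with the reasons written down per exception type. It is not the HRegion code: `shouldRethrow` and the nested `TimeoutIOException` stand-in are hypothetical, defined here only so the example compiles on its own.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical sketch of the "stop or keep retrying" decision discussed above.
class BatchRetryPolicy {
    // Stand-in for HBase's TimeoutIOException so this sketch is self-contained.
    static class TimeoutIOException extends IOException {
        TimeoutIOException(String msg) {
            super(msg);
        }
    }

    // Decides whether an IOException seen while acquiring a row lock should be rethrown.
    static boolean shouldRethrow(boolean atomic, IOException ioe) {
        // Atomic batch: one failed row fails the whole batch.
        if (atomic) {
            return true;
        }
        // Interrupted: the operation is being cancelled, so retrying makes no sense.
        if (ioe instanceof InterruptedIOException) {
            return true;
        }
        // Timed out: further attempts would only run past the deadline.
        return ioe instanceof TimeoutIOException;
    }

    public static void main(String[] args) {
        System.out.println(shouldRethrow(false, new InterruptedIOException("interrupted")));
        System.out.println(shouldRethrow(false, new IOException("other failure")));
    }
}
```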

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-17472) Correct the semantic of permission grant

2018-02-13 Thread Appy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363368#comment-16363368
 ] 

Appy commented on HBASE-17472:
--

Sorry for the ultra-late review.
Looking at the final patch that was committed to branch-1.4, the value of the 
flag is always false, and in the one that was committed to master, the flag is 
always true in production code (there are a few false values in test code 
only, but that shouldn't count).
Going by that high-level picture, it feels like we didn't need to make any 
change in branch-1.4, since adding a param and always setting it to false is a 
no-op.
And for master, the change to AccessControlLists#addUserPermission alone would 
have been sufficient.
We didn't need any new param or to update anything else.
What am I missing?

> Correct the semantic of  permission grant
> -
>
> Key: HBASE-17472
> URL: https://issues.apache.org/jira/browse/HBASE-17472
> Project: HBase
>  Issue Type: Improvement
>  Components: Admin
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17472.branch-1.3.v6.patch, 
> HBASE-17472.branch-1.v6.patch, HBASE-17472.branch-1.v7.patch, 
> HBASE-17472.master.v6.patch, HBASE-17472.master.v6.patch, 
> HBASE-17472.master.v7.patch, HBASE-17472.v1.patch, HBASE-17472.v2.patch, 
> HBASE-17472.v3.patch, HBASE-17472.v4.patch, HBASE-17472.v5.patch
>
>
> Currently, the HBase grant operation has the following semantics:
> {code}
> hbase(main):019:0> grant 'hbase_tst', 'RW', 'ycsb'
> 0 row(s) in 0.0960 seconds
> hbase(main):020:0> user_permission 'ycsb'
> User         Namespace,Table,Family,Qualifier:Permission
>  hbase_tst   default,ycsb,,: [Permission:actions=READ,WRITE]
> 1 row(s) in 0.0550 seconds
> hbase(main):021:0> grant 'hbase_tst', 'CA', 'ycsb'
> 0 row(s) in 0.0820 seconds
> hbase(main):022:0> user_permission 'ycsb'
> User         Namespace,Table,Family,Qualifier:Permission
>  hbase_tst   default,ycsb,,: [Permission: actions=CREATE,ADMIN]
> 1 row(s) in 0.0490 seconds
> {code}
> A later grant replaces the previously granted permissions, which confuses 
> most HBase administrators.
> It seems more reasonable for HBase to merge multiple granted permissions.
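
The difference between the two semantics can be sketched in a few lines. This is an illustration only: the local `Action` enum mirrors the action names in the shell output above, and `replaceGrant` / `mergeGrant` are hypothetical helpers, not AccessController methods.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative comparison of "replace" versus "merge" grant semantics.
class GrantSemantics {
    // Local enum for the sketch; names follow the actions shown in the shell session.
    enum Action { READ, WRITE, EXEC, CREATE, ADMIN }

    // Replace semantics: the new grant overwrites whatever was there.
    static Set<Action> replaceGrant(Set<Action> existing, Set<Action> granted) {
        EnumSet<Action> result = EnumSet.noneOf(Action.class);
        result.addAll(granted);
        return result;
    }

    // Merge semantics: the result is the union of old and new actions.
    static Set<Action> mergeGrant(Set<Action> existing, Set<Action> granted) {
        EnumSet<Action> result = EnumSet.noneOf(Action.class);
        result.addAll(existing);
        result.addAll(granted);
        return result;
    }

    public static void main(String[] args) {
        Set<Action> rw = EnumSet.of(Action.READ, Action.WRITE);
        Set<Action> ca = EnumSet.of(Action.CREATE, Action.ADMIN);
        System.out.println("replace: " + replaceGrant(rw, ca));
        System.out.println("merge:   " + mergeGrant(rw, ca));
    }
}
```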




[jira] [Resolved] (HBASE-19965) Fix flaky TestAsyncRegionAdminApi

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-19965.
---
Resolution: Fixed

This fell off the flakies list. The change in TestAsyncTableAdminApi is not 
enough... 
https://builds.apache.org/job/HBase%20Nightly/job/branch-2/314/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncTableAdminApi/org_apache_hadoop_hbase_client_TestAsyncTableAdminApi/
 Let me move some more over to TestAsyncTableAdminApi2 or make a 
TestAsyncTableAdminApi3.

> Fix flaky TestAsyncRegionAdminApi
> -
>
> Key: HBASE-19965
> URL: https://issues.apache.org/jira/browse/HBASE-19965
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19965.branch-2.001.patch
>
>
> See 
> [https://builds.apache.org/job/HBase%20Nightly/job/branch-2/284/testReport/junit/org.apache.hadoop.hbase.client/TestAsyncRegionAdminApi/testMergeRegions_0_/]
>  
> java.lang.AssertionError: expected:<2> but was:<3> at 
> org.apache.hadoop.hbase.client.TestAsyncRegionAdminApi.testMergeRegions(TestAsyncRegionAdminApi.java:359)
>  
> Merging regions does not work. The table still has 3 regions after the 
> MergeRegionsProcedure finished.
> The master starts balancing region 9e2773ba1efba79a2defa276e9a26ed4, but 
> because the MergeRegionsProcedure pid=138 started work first, the balance has 
> to wait for the lock. After the merge finishes, the MoveRegionProcedure 
> pid=139 starts work and assigns 9e2773ba1efba79a2defa276e9a26ed4 to a new 
> region server. This is not right. The MoveRegionProcedure should skip 
> assigning a region that was marked as offline. Or we should clear the merged 
> regions' procedures when the MergeRegionsProcedure finishes.
>  
> Logs:
> 2018-02-08 16:24:44,608 INFO [master/cd4730e3eae2:0.Chore.1] 
> master.HMaster(1454): balance 
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., 
> source=cd4730e3eae2,39077,1518106776411, 
> destination=cd4730e3eae2,40578,1518106776318
> 2018-02-08 16:24:44,608 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=37885] 
> procedure2.ProcedureExecutor(868): Stored pid=138, 
> state=RUNNABLE:MERGE_TABLE_REGIONS_PREPARE; MergeTableRegionsProcedure 
> table=testMergeRegions, regions=[9e2773ba1efba79a2defa276e9a26ed4, 
> 8f8fd5cd032313e1aadb83e31e1b7479], forcibly=false
> ..
> 2018-02-08 16:24:50,111 INFO [PEWorker-13] 
> procedure2.ProcedureExecutor(1249): Finished pid=138, state=SUCCESS; 
> MergeTableRegionsProcedure table=testMergeRegions, 
> regions=[9e2773ba1efba79a2defa276e9a26ed4, 8f8fd5cd032313e1aadb83e31e1b7479], 
> forcibly=false in 5.5710sec
> 2018-02-08 16:24:50,113 INFO [PEWorker-13] 
> procedure.MasterProcedureScheduler(813): pid=139, 
> state=RUNNABLE:MOVE_REGION_UNASSIGN; MoveRegionProcedure 
> hri=testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4., 
> source=cd4730e3eae2,39077,1518106776411, 
> destination=cd4730e3eae2,40578,1518106776318 testMergeRegions 
> testMergeRegions,,1518107079782.9e2773ba1efba79a2defa276e9a26ed4.
>  
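
The first remedy proposed in the description (skip assigning a region that was marked offline) can be pictured with a small sketch. Everything here is hypothetical: the class, its methods, and the in-memory set are stand-ins for whatever state the master actually keeps, and the second region name is made up.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical guard: a queued move checks whether its region was merged away
// (marked offline) before assigning it anywhere.
class MoveAfterMergeGuard {
    private final Set<String> offlineRegions = new HashSet<>();

    // Called when a merge finishes: the parent regions no longer exist as such.
    void markMergedAway(String encodedRegionName) {
        offlineRegions.add(encodedRegionName);
    }

    // A move that was waiting on the lock should re-check this before assigning.
    boolean shouldAssign(String encodedRegionName) {
        return !offlineRegions.contains(encodedRegionName);
    }

    public static void main(String[] args) {
        MoveAfterMergeGuard guard = new MoveAfterMergeGuard();
        guard.markMergedAway("9e2773ba1efba79a2defa276e9a26ed4");
        System.out.println(guard.shouldAssign("9e2773ba1efba79a2defa276e9a26ed4"));
    }
}
```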




[jira] [Commented] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363356#comment-16363356
 ] 

Hadoop QA commented on HBASE-19996:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  4m 44s{color} | {color:red} Docker failed to build yetus/hbase:74e3133. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-19996 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910485/HBASE-19996.branch-1.4.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11514/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
> Attachments: HBASE-19996.branch-1.4.001.patch
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might not remove some valid procs too. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2. Thanks to [~toffer] for flagging this 
> internally.




[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-19996:
-
Description: Follow up to HBASE-19756 which dealt with NPEs during proc 
cleanup. Unfortunately, the patch for branch-1 might not remove some valid 
procs too. The branch-2 patch doesn't have this problem. This fixes the 
branch-1 bug and also adds another test to branch-2. Thanks to [~toffer] for 
flagging this internally.  (was: Follow up to HBASE-19756 which dealt with NPEs 
during proc cleanup. Unfortunately, the patch for branch-1 might not remove 
some valid procs too. The branch-2 patch doesn't have this problem. This fixes 
the branch-1 bug and also adds another test to branch-2.)

> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
> Attachments: HBASE-19996.branch-1.4.001.patch
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might not remove some valid procs too. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2. Thanks to [~toffer] for flagging this 
> internally.




[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-19996:
-
Status: Patch Available  (was: Open)

> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
> Attachments: HBASE-19996.branch-1.4.001.patch
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might not remove some valid procs too. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2.




[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-19996:
-
Attachment: HBASE-19996.branch-1.4.001.patch

> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
> Attachments: HBASE-19996.branch-1.4.001.patch
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might also fail to remove some valid procs. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2.




[jira] [Commented] (HBASE-19756) Master NPE during completed failed proc eviction

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363345#comment-16363345
 ] 

Thiruvel Thirumoolan commented on HBASE-19756:
--

[~apurtell]/[~yuzhih...@gmail.com] - The master patch here is fine. I wanted to 
rework the branch-1 patch, but fell sick and the patch got committed in the meantime. 
Raised HBASE-19996 as a follow-up to fix the problem with the branch-1 patch.

> Master NPE during completed failed proc eviction
> 
>
> Key: HBASE-19756
> URL: https://issues.apache.org/jira/browse/HBASE-19756
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.3.1
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 3.0.0, 1.3.2, 1.4.1, 1.5.0
>
> Attachments: HBASE-19756.branch-1.4.001.patch, 
> HBASE-19756.branch-1.4.002.patch, HBASE-19756.branch-1.4.003.patch, 
> HBASE-19756.master.001.patch
>
>
> When a procedure such as create table fails due to, say, an AccessDeniedException, 
> a rollback procedure is created. When the rollback is being cleaned up, 
> it results in an NPE because those nonce procs aren't persisted.
> Stack trace when this happens:
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504)
> at 
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78)
> at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507)
> {noformat}




[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thiruvel Thirumoolan updated HBASE-19996:
-
Fix Version/s: 1.4.2
   1.5.0
   1.3.2
   2.0.0

> Some nonce procs might not be cleaned up (follow up HBASE-19756)
> 
>
> Key: HBASE-19996
> URL: https://issues.apache.org/jira/browse/HBASE-19996
> Project: HBase
>  Issue Type: Bug
>Reporter: Thiruvel Thirumoolan
>Assignee: Thiruvel Thirumoolan
>Priority: Major
> Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2
>
>
> Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
> Unfortunately, the patch for branch-1 might also fail to remove some valid procs. 
> The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
> also adds another test to branch-2.




[jira] [Created] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)

2018-02-13 Thread Thiruvel Thirumoolan (JIRA)

Thiruvel Thirumoolan created HBASE-19996:


 Summary: Some nonce procs might not be cleaned up (follow up 
HBASE-19756)
 Key: HBASE-19996
 URL: https://issues.apache.org/jira/browse/HBASE-19996
 Project: HBase
  Issue Type: Bug
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan


Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. 
Unfortunately, the patch for branch-1 might also fail to remove some valid procs. 
The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and 
also adds another test to branch-2.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363338#comment-16363338
 ] 

stack commented on HBASE-19988:
---

bq. Been reading around code for last 30 min, I honestly have no idea how we are 
supposed to interpret InterruptedException.

IE handling is erratic. Some code paths are non-interruptible (HDFS, client 
retries...). Generally, if you have an IE and don't know what to do w/ it, do 
clean up, set interrupt on thread and rethrow. A good project would be going 
through the codebase throwing IEs to see what happens.
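
The "clean up, set interrupt on thread and rethrow" advice can be sketched 
roughly as below. This is only an illustration, not code from HBase: the class 
name, the awaitLock helper, and the latch standing in for a row lock are all 
assumptions made for the example.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class InterruptHandlingSketch {

  /**
   * Hypothetical helper: waits on a latch that stands in for a row lock.
   * On interrupt it cleans up, restores the interrupt flag, and rethrows.
   */
  public static void awaitLock(CountDownLatch lock, Runnable cleanup)
      throws InterruptedIOException {
    try {
      lock.await(10, TimeUnit.SECONDS);
    } catch (InterruptedException ie) {
      // 1. Clean up whatever was acquired so far.
      cleanup.run();
      // 2. Restore the interrupt flag, which await() cleared when it threw.
      Thread.currentThread().interrupt();
      // 3. Rethrow so callers see the interruption instead of a silent return.
      InterruptedIOException iioe =
          new InterruptedIOException("Interrupted while waiting for lock");
      iioe.initCause(ie);
      throw iioe;
    }
  }

  public static void main(String[] args) {
    boolean[] cleaned = {false};
    // Simulate an interrupt arriving before the wait starts.
    Thread.currentThread().interrupt();
    try {
      awaitLock(new CountDownLatch(1), () -> cleaned[0] = true);
    } catch (InterruptedIOException e) {
      System.out.println("cleaned=" + cleaned[0]
          + " interruptFlag=" + Thread.currentThread().isInterrupted());
    }
  }
}
```

Wrapping in InterruptedIOException is one common choice where the caller's 
signature only allows IOException; whether a given call site should do that 
instead of propagating the InterruptedException itself is a judgment call.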

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363330#comment-16363330
 ] 

Umesh Agashe commented on HBASE-19988:
--

That surefire is able to interrupt tests suggests that InterruptedException is 
not always ignored everywhere.

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363328#comment-16363328
 ] 

stack commented on HBASE-18294:
---

On the release note, name the configs to set?

On the patch, couldn't the memstoreSize change... between leaving the 
synchronize block and going in here to do the check?

checkNegativeMemStoreDataSize(size, -memStoreSize.getDataSize());

Copy the datasize to a local variable inside the sync block?

Or nvm... I see that we are passing in the passed-in param, not the data member 
content. In that case, it's confusing to have a param with the same name as a 
data member.

We are doing this...

  public long getMemStoreDataSize() {
    return memStoreSize.getDataSize();
  }

.. w/o a synchronize. Should there be one? ... Hmm... No, it should be ok. It 
is a volatile read. Ignore.

Interesting, so looking for best region to flush, we'll do data size...

    (regionToFlush != null && regionToFlush.getMemStoreDataSize() > 0) ||
    (bestRegionReplica != null && bestRegionReplica.getMemStoreDataSize() > 0));


The data size accounting is just a nice-to-have in the scheme of things? (A 
vestige held over from the back and forth here).

This is right?

  long getMemStoreSize() {
    return region.getMemStoreSize();
    return region.getMemStoreDataSize();

... i.e. returning data size when we ask for memstoresize? (We also have a 
getMemStoreDataSize ...)

Did a pass. Looks good to me.





> Reduce global heap pressure: flush based on heap occupancy
> --
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch, HBASE-18294.15.patch, HBASE-18294.16.patch, 
> HBASE-18294.master.01.patch, HBASE-18294.master.01.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Appy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363326#comment-16363326
 ] 

Appy commented on HBASE-19988:
--

Been reading around code for last 30 min, I honestly have no idea how we are 
supposed to interpret InterruptedException.

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-19981) Boolean#getBoolean is used to parse value

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363291#comment-16363291
 ] 

Hudson commented on HBASE-19981:


SUCCESS: Integrated in Jenkins build HBase-1.3-IT #349 (See 
[https://builds.apache.org/job/HBase-1.3-IT/349/])
HBASE-19981 Boolean#getBoolean is used to parse value (tedyu: rev 
e6dda8ea6db4e50e3bc3e93a72dc06f433a75b58)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java


> Boolean#getBoolean is used to parse value
> -
>
> Key: HBASE-19981
> URL: https://issues.apache.org/jira/browse/HBASE-19981
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Janos Gub
>Priority: Major
> Fix For: 1.3.2, 1.2.7, 1.4.2
>
> Attachments: HBASE-19981.branch-1.001.patch
>
>
> In HColumnDescriptor of branch-1:
> {code}
>   value.set(Bytes.toBytes(
>   Boolean.getBoolean(Bytes.toString(value.get()))
> {code}
> According to 
> https://docs.oracle.com/javase/7/docs/api/java/lang/Boolean.html#getBoolean(java.lang.String):
> {code}
> Returns true if and only if the system property named by the argument exists 
> and is equal to the string "true"
> {code}
> This was not the intention of the quoted code.
> This was discovered by Fortify.
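
The difference between the two JDK calls can be shown with a small sketch. The 
class and method names are made up for illustration, and Boolean.parseBoolean 
is shown only as the likely intended call; it is not necessarily what the 
attached patch does.

```java
public class BooleanParsingSketch {

  /** Parses the text itself: true only if it equals "true", ignoring case. */
  public static boolean parseValue(String text) {
    return Boolean.parseBoolean(text);
  }

  /** Treats the argument as the NAME of a system property and reads that. */
  public static boolean lookUpProperty(String propertyName) {
    return Boolean.getBoolean(propertyName);
  }

  public static void main(String[] args) {
    // The stored value "true" parses as true.
    System.out.println(parseValue("true"));
    // Here "true" is looked up as a property name, so the result is false
    // unless a system property literally named "true" is set to "true".
    System.out.println(lookUpProperty("true"));
  }
}
```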




[jira] [Commented] (HBASE-19981) Boolean#getBoolean is used to parse value

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363285#comment-16363285
 ] 

Hudson commented on HBASE-19981:


SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1069 (See 
[https://builds.apache.org/job/HBase-1.2-IT/1069/])
HBASE-19981 Boolean#getBoolean is used to parse value (tedyu: rev 
0f3bf54899e4d8927f76f9e9515e774590ad56eb)
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java


> Boolean#getBoolean is used to parse value
> -
>
> Key: HBASE-19981
> URL: https://issues.apache.org/jira/browse/HBASE-19981
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Janos Gub
>Priority: Major
> Fix For: 1.3.2, 1.2.7, 1.4.2
>
> Attachments: HBASE-19981.branch-1.001.patch
>
>
> In HColumnDescriptor of branch-1:
> {code}
>   value.set(Bytes.toBytes(
>   Boolean.getBoolean(Bytes.toString(value.get()))
> {code}
> According to 
> https://docs.oracle.com/javase/7/docs/api/java/lang/Boolean.html#getBoolean(java.lang.String):
> {code}
> Returns true if and only if the system property named by the argument exists 
> and is equal to the string "true"
> {code}
> This was not the intention of the quoted code.
> This was discovered by Fortify.




[jira] [Updated] (HBASE-19981) Boolean#getBoolean is used to parse value

2018-02-13 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-19981:
---
Fix Version/s: 1.2.7
   1.3.2

> Boolean#getBoolean is used to parse value
> -
>
> Key: HBASE-19981
> URL: https://issues.apache.org/jira/browse/HBASE-19981
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Janos Gub
>Priority: Major
> Fix For: 1.3.2, 1.2.7, 1.4.2
>
> Attachments: HBASE-19981.branch-1.001.patch
>
>
> In HColumnDescriptor of branch-1:
> {code}
>   value.set(Bytes.toBytes(
>   Boolean.getBoolean(Bytes.toString(value.get()))
> {code}
> According to 
> https://docs.oracle.com/javase/7/docs/api/java/lang/Boolean.html#getBoolean(java.lang.String):
> {code}
> Returns true if and only if the system property named by the argument exists 
> and is equal to the string "true"
> {code}
> This was not the intention of the quoted code.
> This was discovered by Fortify.




[jira] [Updated] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19116:
--
Attachment: HBASE-19116.branch-2.002.patch

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail, 
> not unless we really have to (and it sounds like we could do without writing 
> an incompatible tail) to see if we can avoid requiring operators go to 
> latest branch-1 (we may end up needing this but let's have a really good 
> reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363280#comment-16363280
 ] 

stack commented on HBASE-19116:
---

.002 Checkstyle fixes.

Review please.

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch, 
> HBASE-19116.branch-2.002.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail, 
> not unless we really have to (and it sounds like we could do without writing 
> an incompatible tail) to see if we can avoid requiring operators go to 
> latest branch-1 (we may end up needing this but let's have a really good 
> reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363276#comment-16363276
 ] 

stack commented on HBASE-19116:
---

Here is reading a hbase2 hfile with an hbase1 reader:

stack@ve0524:~$ ./hbase/bin/hbase --config ~/conf_hbase/ 
org.apache.hadoop.hbase.io.hfile.HFile --printmeta -f 
/hbase/archive/data/default/IntegrationTestBigLinkedList/25eb09e8ddb00ea240407061e776a289/big/8e54a03ba0c14e458c57290f0b25373d
2018-02-13 16:02:54,864 WARN  [main] util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2018-02-13 16:02:55,284 INFO  [main] hfile.CacheConfig: Created cacheConfig: 
CacheConfig:disabled
Block index size as per heapsize: 53152
reader=/hbase/archive/data/default/IntegrationTestBigLinkedList/25eb09e8ddb00ea240407061e776a289/big/8e54a03ba0c14e458c57290f0b25373d,
compression=none,
cacheConf=CacheConfig:disabled,
firstKey=\xC7\x1Cr((?\x0E$\x1F\xAF\x966%1/big:big/1518565482300/Put,
lastKey=\xD5Vr_\x13\x9C\x10]\xAE\x19\xDE_9\x1A]

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail, 
> not unless we really have to (and it sounds like we could do without writing 
> an incompatible tail) to see if we can avoid requiring operators go to 
> latest branch-1 (we may end up needing this but let's have a really good 
> reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363277#comment-16363277
 ] 

stack commented on HBASE-19116:
---

Need a review here please.

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail, 
> not unless we really have to (and it sounds like we could do without writing 
> an incompatible tail) to see if we can avoid requiring operators go to 
> latest branch-1 (we may end up needing this but let's have a really good 
> reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Commented] (HBASE-18294) Reduce global heap pressure: flush based on heap occupancy

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363247#comment-16363247
 ] 

stack commented on HBASE-18294:
---

Nice release note [~eshcar].

bq. (2) A region is flushed when its on-heap+off-heap size exceeds the region 
flush threshold, 

If the offheap is 100x the onheap in size, and the threshold is set to offheap 
(100x) + onheap (1x) -- i.e. 101x -- then what happens when the onheap 
occupancy exceeds 1x?

Left feedback on RB. Thanks.

> Reduce global heap pressure: flush based on heap occupancy
> --
>
> Key: HBASE-18294
> URL: https://issues.apache.org/jira/browse/HBASE-18294
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.01.patch, 
> HBASE-18294.01.patch, HBASE-18294.01.patch, HBASE-18294.02.patch, 
> HBASE-18294.03.patch, HBASE-18294.04.patch, HBASE-18294.05.patch, 
> HBASE-18294.06.patch, HBASE-18294.07.patch, HBASE-18294.07.patch, 
> HBASE-18294.08.patch, HBASE-18294.09.patch, HBASE-18294.10.patch, 
> HBASE-18294.11.patch, HBASE-18294.11.patch, HBASE-18294.12.patch, 
> HBASE-18294.13.patch, HBASE-18294.15.patch, HBASE-18294.16.patch, 
> HBASE-18294.master.01.patch, HBASE-18294.master.01.patch
>
>
> A region is flushed if its memory component exceeds a threshold (default size 
> is 128MB).
> A flush policy decides whether to flush a store by comparing the size of the 
> store to another threshold (that can be configured with 
> hbase.hregion.percolumnfamilyflush.size.lower.bound).
> Currently the implementation (in both cases) compares the data size 
> (key-value only) to the threshold where it should compare the heap size 
> (which includes index size, and metadata).




[jira] [Commented] (HBASE-19972) Should rethrow the RetriesExhaustedWithDetailsException when failed to apply the batch in ReplicationSink

2018-02-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363245#comment-16363245
 ] 

Andrew Purtell commented on HBASE-19972:


[~Apache9] Sure, we can do a 1.4 release this month instead of waiting until 
next month. Will start on it today, expect a vote by/for next week.

> Should rethrow  the RetriesExhaustedWithDetailsException when failed to apply 
> the batch in ReplicationSink
> --
>
> Key: HBASE-19972
> URL: https://issues.apache.org/jira/browse/HBASE-19972
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19972-branch-1.4.patch, HBASE-19972.v1.patch, 
> HBASE-19972.v1.patch
>
>
> As [~Apache9] said in HBASE-12091: 
> in ReplicationSink#batch, we swallow every RetriesExhaustedWithDetailsException 
> unless one of its causes is a TableNotFoundException. We should actually 
> rethrow the exception. 
> {code:java}
> try {
>   Connection connection = getConnection();
>   table = connection.getTable(tableName);
>   for (List rows : allRows) {
>     table.batch(rows);
>   }
> } catch (RetriesExhaustedWithDetailsException rewde) {
>   for (Throwable ex : rewde.getCauses()) {
>     if (ex instanceof TableNotFoundException) {
>       throw new TableNotFoundException("'"+tableName+"'");
>     }
>   }
> } 
> {code}
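
A minimal, self-contained sketch of the rethrow described above follows. It 
does not use the HBase classes: TableNotFoundException, 
RetriesExhaustedException, Batch, and applyBatch here are hypothetical 
stand-ins defined only for this example, and the real patch may differ.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

public class RethrowSketch {

  /** Stand-in for HBase's TableNotFoundException (assumption). */
  public static class TableNotFoundException extends IOException {
    public TableNotFoundException(String msg) { super(msg); }
  }

  /** Stand-in for RetriesExhaustedWithDetailsException (assumption). */
  public static class RetriesExhaustedException extends IOException {
    private final List<Throwable> causes;
    public RetriesExhaustedException(List<Throwable> causes) {
      super("retries exhausted");
      this.causes = causes;
    }
    public List<Throwable> getCauses() { return causes; }
  }

  /** Hypothetical unit of work standing in for table.batch(rows). */
  public interface Batch {
    void run() throws IOException;
  }

  public static void applyBatch(String tableName, Batch batch) throws IOException {
    try {
      batch.run();
    } catch (RetriesExhaustedException ree) {
      for (Throwable cause : ree.getCauses()) {
        if (cause instanceof TableNotFoundException) {
          throw new TableNotFoundException("'" + tableName + "'");
        }
      }
      // Not a missing table: rethrow instead of silently dropping the failure.
      throw ree;
    }
  }

  public static void main(String[] args) {
    try {
      applyBatch("t1", () -> {
        throw new RetriesExhaustedException(
            Collections.<Throwable>singletonList(new IOException("region offline")));
      });
    } catch (IOException e) {
      System.out.println("propagated: " + e.getClass().getSimpleName());
    }
  }
}
```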



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-18282) ReplicationLogCleaner can delete WALs not yet replicated in case of a KeeperException

2018-02-13 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363234#comment-16363234
 ] 

Andrew Purtell commented on HBASE-18282:


Hi [~benlau], yes, please, and thank you in advance.

> ReplicationLogCleaner can delete WALs not yet replicated in case of a 
> KeeperException
> -
>
> Key: HBASE-18282
> URL: https://issues.apache.org/jira/browse/HBASE-18282
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1, 1.2.6, 1.1.11, 2.0.0-alpha-1
>Reporter: Ashu Pachauri
>Priority: Critical
>
> ReplicationStateZKBase#getListOfReplicators does not rethrow a 
> KeeperException and returns null in such a case. ReplicationLogCleaner just 
> assumes that there are no replicators and deletes everything.
> ReplicationStateZKBase:
> {code:java}
> public List getListOfReplicators() {
>   List result = null;
>   try {
>     result = ZKUtil.listChildrenNoWatch(this.zookeeper, this.queuesZNode);
>   } catch (KeeperException e) {
>     this.abortable.abort("Failed to get list of replicators", e);
>   }
>   return result;
> }
> {code}
> ReplicationLogCleaner:
> {code:java}
> private Set loadWALsFromQueues() throws KeeperException {
>   for (int retry = 0; ; retry++) {
>     int v0 = replicationQueues.getQueuesZNodeCversion();
>     List rss = replicationQueues.getListOfReplicators();
>     if (rss == null) {
>       LOG.debug("Didn't find any region server that replicates, won't prevent any deletions.");
>       return ImmutableSet.of();
>     }
>     ...
> {code}
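
One way to avoid the failure mode described above is to let the lookup error 
propagate and have the cleaner delete nothing when the queue state is unknown. 
The sketch below only illustrates that idea under stated assumptions: 
StoreException stands in for KeeperException, and QueueStore, walsInQueues, and 
deletableWals are hypothetical names, not the HBase API or the eventual fix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WalCleanerSketch {

  /** Stand-in for ZooKeeper's KeeperException (assumption). */
  public static class StoreException extends Exception {
    public StoreException(String msg) { super(msg); }
  }

  /** Hypothetical view of the replication queues. */
  public interface QueueStore {
    /** Throws on failure rather than returning null. */
    List<String> walsInQueues() throws StoreException;
  }

  /** Returns the candidates that are safe to delete. */
  public static List<String> deletableWals(List<String> candidates, QueueStore store) {
    Set<String> inUse;
    try {
      inUse = new HashSet<>(store.walsInQueues());
    } catch (StoreException e) {
      // Queue state is unknown, so be conservative: delete nothing this round.
      return Collections.emptyList();
    }
    List<String> deletable = new ArrayList<>();
    for (String wal : candidates) {
      if (!inUse.contains(wal)) {
        deletable.add(wal);
      }
    }
    return deletable;
  }

  public static void main(String[] args) {
    List<String> candidates = Arrays.asList("wal-1", "wal-2");
    QueueStore healthy = () -> Arrays.asList("wal-1");
    QueueStore broken = () -> { throw new StoreException("connection loss"); };
    System.out.println(deletableWals(candidates, healthy)); // prints [wal-2]
    System.out.println(deletableWals(candidates, broken));  // prints []
  }
}
```

The key point is that an error and an empty result are no longer represented 
by the same value, so the caller cannot mistake "lookup failed" for "no 
replicators".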



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HBASE-18282) ReplicationLogCleaner can delete WALs not yet replicated in case of a KeeperException

2018-02-13 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-18282:
--

Assignee: (was: Ashu Pachauri)

> ReplicationLogCleaner can delete WALs not yet replicated in case of a 
> KeeperException
> -
>
> Key: HBASE-18282
> URL: https://issues.apache.org/jira/browse/HBASE-18282
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1, 1.2.6, 1.1.11, 2.0.0-alpha-1
>Reporter: Ashu Pachauri
>Priority: Critical
>
> ReplicationStateZKBase#getListOfReplicators does not rethrow a 
> KeeperException and returns null in such a case. ReplicationLogCleaner just 
> assumes that there are no replicators and deletes everything.
> ReplicationStateZKBase:
> {code:java}
> public List getListOfReplicators() {
>   List result = null;
>   try {
>     result = ZKUtil.listChildrenNoWatch(this.zookeeper, this.queuesZNode);
>   } catch (KeeperException e) {
>     this.abortable.abort("Failed to get list of replicators", e);
>   }
>   return result;
> }
> {code}
> ReplicationLogCleaner:
> {code:java}
> private Set loadWALsFromQueues() throws KeeperException {
>   for (int retry = 0; ; retry++) {
>     int v0 = replicationQueues.getQueuesZNodeCversion();
>     List rss = replicationQueues.getListOfReplicators();
>     if (rss == null) {
>       LOG.debug("Didn't find any region server that replicates, won't prevent any deletions.");
>       return ImmutableSet.of();
>     }
>     ...
> {code}




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363228#comment-16363228
 ] 

Hadoop QA commented on HBASE-19988:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
46s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
38s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
18m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m 31s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19988 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12910445/hbase-19988.master.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d6a29d121b63 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 39e191e559 |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11512/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11512/testReport/ |
| Max. process+thread count | 5346 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
|

[jira] [Commented] (HBASE-19995) Current Jetty 9 version in HBase master branch can memory leak under high traffic

2018-02-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363226#comment-16363226
 ] 

Ted Yu commented on HBASE-19995:


Updating to 9.3.22.v20171030 is good.

> Current Jetty 9 version in HBase master branch can memory leak under high 
> traffic
> -
>
> Key: HBASE-19995
> URL: https://issues.apache.org/jira/browse/HBASE-19995
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 2.0
>Reporter: Ben Lau
>Priority: Major
>
> There is a memory leak in Jetty 9 that manifests whenever you hit the call 
> queue limit in HBase REST.  It permanently leaks both on-heap and off-heap 
> objects.  Whenever the call queue for the Jetty server overflows, the 
> rejected task, if it is a Rejectable, runs a 'reject' method to do any 
> cleanup. This cleanup is necessary to, for example, close the connection and 
> deallocate any buffers. Unfortunately, in Jetty 9, the 'reject' / cleanup 
> method of the SelectChannelEndpoint was implemented as a non-blocking call 
> that is not guaranteed to run.  This was fixed in Jetty 9.4 and later 
> backported; however, the version of Jetty 9 pulled into HBase for REST 
> predates this fix.  See 
> [https://github.com/eclipse/jetty.project/issues/1804] and 
> [https://github.com/apache/hbase/blob/master/pom.xml#L1416].
> If we want to stay on 9.3.X we could update to 
> [9.3.22.v20171030|https://mvnrepository.com/artifact/org.eclipse.jetty/jetty-server/9.3.22.v20171030]
>  which is the latest version of 9.3.  Thoughts?




[jira] [Created] (HBASE-19995) Current Jetty 9 version in HBase master branch can memory leak under high traffic

2018-02-13 Thread Ben Lau (JIRA)

Ben Lau created HBASE-19995:
---

 Summary: Current Jetty 9 version in HBase master branch can memory 
leak under high traffic
 Key: HBASE-19995
 URL: https://issues.apache.org/jira/browse/HBASE-19995
 Project: HBase
  Issue Type: Bug
  Components: REST
Affects Versions: 2.0
Reporter: Ben Lau


There is a memory leak in Jetty 9 that manifests whenever you hit the call 
queue limit in HBase REST.  It permanently leaks both on-heap and off-heap 
objects.  Whenever the call queue for the Jetty server overflows, the 
rejected task, if it is a Rejectable, runs a 'reject' method to do any 
cleanup. This cleanup is necessary to, for example, close the connection and 
deallocate any buffers. Unfortunately, in Jetty 9, the 'reject' / cleanup 
method of the SelectChannelEndpoint was implemented as a non-blocking call 
that is not guaranteed to run.  This was fixed in Jetty 9.4 and later 
backported; however, the version of Jetty 9 pulled into HBase for REST 
predates this fix.  See 
[https://github.com/eclipse/jetty.project/issues/1804] and 
[https://github.com/apache/hbase/blob/master/pom.xml#L1416].

If we want to stay on 9.3.X we could update to 
[9.3.22.v20171030|https://mvnrepository.com/artifact/org.eclipse.jetty/jetty-server/9.3.22.v20171030]
 which is the latest version of 9.3.  Thoughts?
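
To make the cleanup requirement concrete, here is a minimal sketch using only java.util.concurrent. It is not Jetty's code: the Rejectable interface and ConnectionTask class below are hypothetical stand-ins, and the counter stands in for connections and buffers. It shows a bounded queue overflowing and the rejection handler running each task's cleanup synchronously, so that nothing is left open:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-alone sketch; Rejectable and ConnectionTask are hypothetical, not Jetty types. */
final class RejectCleanupSketch {

  /** A task that knows how to release its resources if it is never run. */
  interface Rejectable extends Runnable {
    void reject();
  }

  /** Holds one "connection" from construction until it is run or rejected. */
  static final class ConnectionTask implements Rejectable {
    private final CountDownLatch gate;
    private final AtomicInteger openConnections;

    ConnectionTask(CountDownLatch gate, AtomicInteger openConnections) {
      this.gate = gate;
      this.openConnections = openConnections;
      openConnections.incrementAndGet(); // resource acquired up front
    }

    @Override public void run() {
      try {
        gate.await(); // keep the single worker busy until the gate opens
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        openConnections.decrementAndGet(); // normal release
      }
    }

    @Override public void reject() {
      openConnections.decrementAndGet(); // release on rejection
    }
  }

  /** Overflows a one-thread, one-slot pool and returns the connections left open. */
  static int runOverflowScenario() {
    AtomicInteger open = new AtomicInteger();
    CountDownLatch gate = new CountDownLatch(1);
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1),
        (task, executor) -> {
          // Run the cleanup synchronously on the submitting thread so it
          // cannot itself be dropped by the overloaded pool.
          if (task instanceof Rejectable) {
            ((Rejectable) task).reject();
          }
        });
    try {
      for (int i = 0; i < 5; i++) {
        pool.execute(new ConnectionTask(gate, open)); // tasks 3 to 5 are rejected
      }
      gate.countDown();
      pool.shutdown();
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return open.get();
  }

  public static void main(String[] args) {
    System.out.println("connections left open: " + runOverflowScenario());
  }
}
```

If the handler instead scheduled reject() as another asynchronous task with no guarantee of execution, the counter could stay above zero, which is the kind of leak described above.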




[jira] [Commented] (HBASE-19992) Hole in namespace table assign

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363212#comment-16363212
 ] 

stack commented on HBASE-19992:
---

This might be my fault from going between hbase1 and hbase2 with different 
codebases. I thought the cause was the migration of hbase1 table state from 
zk, which sets the table as enabled and so 'existing', but something else 
happened such that hbase:meta had no mention of hbase:namespace. Leaving open 
for now in case I see this again in testing.

> Hole in namespace table assign
> --
>
> Key: HBASE-19992
> URL: https://issues.apache.org/jira/browse/HBASE-19992
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
>
> If the assign fails before it comes up in a Master initialization, the table 
> will have been created and may even be marked ENABLED successfully, but on 
> restart, we don't assign the table.
> Manifest is:
> {code}
> 2018-02-13 11:45:24,504 ERROR [master/ve0524:16000] master.HMaster: Failed to 
> become active master
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
>   at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
>   at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
> to be assigned and enabled: ENABLED
>   at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
>   at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
>   at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
>   ... 4 more
> 2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: Master 
> server abort: loaded coprocessors are: 
> [org.apache.hadoop.hbase.security.access.AccessController]
> 2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: * 
> ABORTING master ve0524.halxg.cloudera.com,16000,1518550812400: Unhandled 
> exception. Starting shutdown. *
> java.lang.IllegalStateException: Expected the service 
> ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
>   at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
>   at 
> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
>   at 
> org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
> to be assigned and enabled: ENABLED
>   at 
> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
>   at 
> org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
>   at 
> org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
>   at 
> org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
>   ... 4 more
> {code}
> Last thing in log before Master crash was:
> 2018-02-13 11:34:17,084 INFO  [master/ve0524:16000] hbase.MetaTableAccessor: 
> Updated table hbase:namespace state to ENABLED in META
> There is no one doing an assign subsequent to initial create table.




[jira] [Commented] (HBASE-18282) ReplicationLogCleaner can delete WALs not yet replicated in case of a KeeperException

2018-02-13 Thread Ben Lau (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-18282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363168#comment-16363168
 ] 

Ben Lau commented on HBASE-18282:
-

Hi guys, this ticket has been open for a while.  Do you mind if we submit an 
internal patch + test we have for this?

> ReplicationLogCleaner can delete WALs not yet replicated in case of a 
> KeeperException
> -
>
> Key: HBASE-18282
> URL: https://issues.apache.org/jira/browse/HBASE-18282
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.3.1, 1.2.6, 1.1.11, 2.0.0-alpha-1
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
>Priority: Critical
>
> ReplicationStateZKBase#getListOfReplicators does not rethrow a 
> KeeperException and returns null in such a case. ReplicationLogCleaner just 
> assumes that there are no replicators and deletes everything.
> ReplicationStateZKBase:
> {code:java}
> public List getListOfReplicators() {
>   List result = null;
>   try {
>     result = ZKUtil.listChildrenNoWatch(this.zookeeper, this.queuesZNode);
>   } catch (KeeperException e) {
>     this.abortable.abort("Failed to get list of replicators", e);
>   }
>   return result;
> }
> {code}
> ReplicationLogCleaner:
> {code:java}
> private Set loadWALsFromQueues() throws KeeperException {
>   for (int retry = 0; ; retry++) {
>     int v0 = replicationQueues.getQueuesZNodeCversion();
>     List rss = replicationQueues.getListOfReplicators();
>     if (rss == null) {
>       LOG.debug("Didn't find any region server that replicates, won't prevent any deletions.");
>       return ImmutableSet.of();
>     }
>     ...
> {code}




[jira] [Updated] (HBASE-19991) lots of hbase-rest test failures against hadoop 3

2018-02-13 Thread Mike Drob (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19991:
--
Attachment: HBASE-19991.WIP.patch

> lots of hbase-rest test failures against hadoop 3
> -
>
> Key: HBASE-19991
> URL: https://issues.apache.org/jira/browse/HBASE-19991
> Project: HBase
>  Issue Type: Bug
>  Components: REST, test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-19991.WIP.patch
>
>
> mvn clean test -pl hbase-rest -Dhadoop.profile=3.0
> [ERROR] Tests run: 106, Failures: 95, Errors: 8, Skipped: 1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-19991) lots of hbase-rest test failures against hadoop 3

2018-02-13 Thread Mike Drob (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363164#comment-16363164
 ] 

Mike Drob commented on HBASE-19991:
---

This is failing due to loading jersey-1 classes via hadoop in the hadoop-3 
configuration.

This patch is my WIP, but I don't see anything jersey-1 left in dependency:tree 
report.

> lots of hbase-rest test failures against hadoop 3
> -
>
> Key: HBASE-19991
> URL: https://issues.apache.org/jira/browse/HBASE-19991
> Project: HBase
>  Issue Type: Bug
>  Components: REST, test
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 2.0.0
>
>
> mvn clean test -pl hbase-rest -Dhadoop.profile=3.0
> [ERROR] Tests run: 106, Failures: 95, Errors: 8, Skipped: 1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HBASE-15911) NPE in AssignmentManager.onRegionTransition after Master restart

2018-02-13 Thread Ben Lau (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363161#comment-16363161
 ] 

Ben Lau commented on HBASE-15911:
-

[~pankaj2461] [~mantonov] We recently ran into this and had to fix this as it 
was preventing our master from starting up.  We would like to submit a 
suggested fix and test case if you guys do not have a patch yet.

> NPE in AssignmentManager.onRegionTransition after Master restart
> 
>
> Key: HBASE-15911
> URL: https://issues.apache.org/jira/browse/HBASE-15911
> Project: HBase
>  Issue Type: Bug
>  Components: master, Region Assignment
>Affects Versions: 1.3.0
>Reporter: Mikhail Antonov
>Assignee: Mikhail Antonov
>Priority: Major
>
> 16/05/27 17:49:18 ERROR ipc.RpcServer: Unexpected throwable object 
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.master.AssignmentManager.onRegionTransition(AssignmentManager.java:4364)
>   at 
> org.apache.hadoop.hbase.master.MasterRpcServices.reportRegionStateTransition(MasterRpcServices.java:1421)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:8623)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2239)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:116)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:137)
>   at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:112)
>   at java.lang.Thread.run(Thread.java:745)
> I'm pretty sure I've seen it before and more than once, but never got to dig 
> in.




[jira] [Commented] (HBASE-19993) Publish tests jar for hbase-zookeeper in bin tarball

2018-02-13 Thread Appy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363158#comment-16363158
 ] 

Appy commented on HBASE-19993:
--

Ping [~Apache9], [~stack]

> Publish tests jar for hbase-zookeeper in bin tarball
> 
>
> Key: HBASE-19993
> URL: https://issues.apache.org/jira/browse/HBASE-19993
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>Priority: Major
>
> Since {{HBTU extends HBZKTU}} (such short forms! i know!), we need to publish 
> hbase-zookeeper's tests jar too. Many IT tests use HBTU.




[jira] [Assigned] (HBASE-19994) Create a new class for RPC throttling exception, make it retryable.

2018-02-13 Thread huaxiang sun (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun reassigned HBASE-19994:


Assignee: huaxiang sun

> Create a new class for RPC throttling exception, make it retryable. 
> 
>
> Key: HBASE-19994
> URL: https://issues.apache.org/jira/browse/HBASE-19994
> Project: HBase
>  Issue Type: Improvement
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Minor
>
> Based on a discussion at dev mailing list.
>  
> {code:java}
> Thanks Andrew.
> +1 for the second option, I will create a jira for this change.
> Huaxiang
> On Feb 9, 2018, at 1:09 PM, Andrew Purtell  wrote:
> We have
> public class ThrottlingException extends QuotaExceededException
> public class QuotaExceededException extends DoNotRetryIOException
> Let the storage quota limits throw QuotaExceededException directly (based
> on DNRIOE). That seems fine.
> However, ThrottlingException is thrown as a result of a temporal quota,
> so it is inappropriate for this to inherit from DNRIOE, it should inherit
> IOException instead so the client is allowed to retry until successful, or
> until the retry policy is exhausted.
> We are in a bit of a pickle because we've released with this inheritance
> hierarchy, so to change it we will need a new minor, or we will want to
> deprecate ThrottlingException and use a new exception class instead, one
> which does not inherit from DNRIOE.
> On Feb 7, 2018, at 9:25 AM, Huaxiang Sun  wrote:
> Hi Mike,
>   You are right. For rpc throttling, definitely it is retryable. For storage 
> quota, I think it will be fail faster (non-retryable).
>   We probably need to separate these two types of exceptions, I will do some 
> more research and follow up.
>   Thanks,
>   Huaxiang
> On Feb 7, 2018, at 9:16 AM, Mike Drob  wrote:
> I think, philosophically, there can be two kinds of QEE -
> For throttling, we can retry. The quota is a temporal quota - you have done
> too many operations this minute, please try again next minute and
> everything will work.
> For storage, we shouldn't retry. The quota is a fixed quote - you have
> exceeded your allotted disk space, please do not try again until you have
> remedied the situation.
> Our current usage conflates the two, sometimes it is correct, sometimes not.
> On Wed, Feb 7, 2018 at 11:00 AM, Huaxiang Sun  wrote:
> Hi Stack,
>  I run into a case that a mapreduce job in hive cannot finish because
> it runs into a QEE.
> I need to look into the hive mr task to see if QEE is not handled
> correctly in hbase code or in hive code.
> I am thinking that if  QEE is a retryable exception, then it should be
> taken care of by the hbase code.
> I will check more and report back.
> Thanks,
> Huaxiang
> On Feb 7, 2018, at 8:23 AM, Stack  wrote:
> QEE being a DNRIOE seems right on the face of it.
> But if throttling, a DNRIOE is inappropriate. Where you seeing a QEE in a
> throttling scenario Huaxiang?
> Thanks,
> S
> {code}




[jira] [Commented] (HBASE-19993) Publish tests jar for hbase-zookeeper in bin tarball

2018-02-13 Thread Appy (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363155#comment-16363155
 ] 

Appy commented on HBASE-19993:
--

eh? it's there in beta-1 bin tarball. How? Even though we are not copying it 
explicitly like other tests jar 
([https://github.com/apache/hbase/blob/branch-2/hbase-assembly/src/main/assembly/components.xml#L110])

> Publish tests jar for hbase-zookeeper in bin tarball
> 
>
> Key: HBASE-19993
> URL: https://issues.apache.org/jira/browse/HBASE-19993
> Project: HBase
>  Issue Type: Bug
>Reporter: Appy
>Assignee: Appy
>Priority: Major
>
> Since {{HBTU extends HBZKTU}} (such short forms! i know!), we need to publish 
> hbase-zookeeper's tests jar too. Many IT tests use HBTU.




[jira] [Created] (HBASE-19994) Create a new class for RPC throttling exception, make it retryable.

2018-02-13 Thread huaxiang sun (JIRA)

huaxiang sun created HBASE-19994:


 Summary: Create a new class for RPC throttling exception, make it 
retryable. 
 Key: HBASE-19994
 URL: https://issues.apache.org/jira/browse/HBASE-19994
 Project: HBase
  Issue Type: Improvement
Reporter: huaxiang sun


Based on a discussion at dev mailing list.

 
{code:java}
Thanks Andrew.

+1 for the second option, I will create a jira for this change.

Huaxiang

On Feb 9, 2018, at 1:09 PM, Andrew Purtell  wrote:

We have

public class ThrottlingException extends QuotaExceededException
public class QuotaExceededException extends DoNotRetryIOException

Let the storage quota limits throw QuotaExceededException directly (based
on DNRIOE). That seems fine.

However, ThrottlingException is thrown as a result of a temporal quota,
so it is inappropriate for this to inherit from DNRIOE, it should inherit
IOException instead so the client is allowed to retry until successful, or
until the retry policy is exhausted.

We are in a bit of a pickle because we've released with this inheritance
hierarchy, so to change it we will need a new minor, or we will want to
deprecate ThrottlingException and use a new exception class instead, one
which does not inherit from DNRIOE.

On Feb 7, 2018, at 9:25 AM, Huaxiang Sun  wrote:

Hi Mike,

  You are right. For rpc throttling, definitely it is retryable. For storage 
quota, I think it will be fail faster (non-retryable).
  We probably need to separate these two types of exceptions, I will do some 
more research and follow up.

  Thanks,
  Huaxiang

On Feb 7, 2018, at 9:16 AM, Mike Drob  wrote:

I think, philosophically, there can be two kinds of QEE -

For throttling, we can retry. The quota is a temporal quota - you have done
too many operations this minute, please try again next minute and
everything will work.
For storage, we shouldn't retry. The quota is a fixed quote - you have
exceeded your allotted disk space, please do not try again until you have
remedied the situation.

Our current usage conflates the two, sometimes it is correct, sometimes not.

On Wed, Feb 7, 2018 at 11:00 AM, Huaxiang Sun  wrote:

Hi Stack,

 I run into a case that a mapreduce job in hive cannot finish because
it runs into a QEE.
I need to look into the hive mr task to see if QEE is not handled
correctly in hbase code or in hive code.

I am thinking that if  QEE is a retryable exception, then it should be
taken care of by the hbase code.
I will check more and report back.

Thanks,
Huaxiang

On Feb 7, 2018, at 8:23 AM, Stack  wrote:

QEE being a DNRIOE seems right on the face of it.

But if throttling, a DNRIOE is inappropriate. Where you seeing a QEE in a
throttling scenario Huaxiang?

Thanks,
S
{code}
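
The "second option" discussed in the thread above can be sketched roughly as follows. This is only an illustration under stated assumptions: the nested exception classes are local stand-ins that mirror the names from the thread, RetryableThrottlingException is a hypothetical name for the proposed new class, and callWithRetries is a simplified stand-in for a client retry policy rather than HBase's actual retry code:

```java
import java.io.IOException;

/** Stand-alone sketch; all nested types are local stand-ins, not the HBase classes. */
final class ThrottlingRetrySketch {

  /** Stand-in for the existing non-retryable base class. */
  static class DoNotRetryIOException extends IOException {
    DoNotRetryIOException(String message) { super(message); }
  }

  /** Storage quota exceeded: retrying cannot help, so it stays non-retryable. */
  static class QuotaExceededException extends DoNotRetryIOException {
    QuotaExceededException(String message) { super(message); }
  }

  /** Hypothetical new class: a temporal quota, so it extends plain IOException. */
  static class RetryableThrottlingException extends IOException {
    RetryableThrottlingException(String message) { super(message); }
  }

  /** A remote call that may fail with an IOException. */
  interface Call<T> {
    T run() throws IOException;
  }

  /** Retries ordinary IOExceptions; rethrows DoNotRetryIOException at once. */
  static <T> T callWithRetries(Call<T> call, int maxAttempts) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return call.run();
      } catch (DoNotRetryIOException e) {
        throw e; // fail fast, e.g. storage quota exceeded
      } catch (IOException e) {
        last = e; // retryable, e.g. throttled; try again
      }
    }
    throw last;
  }

  /** Counts attempts for a call that throws the given failure the first N times. */
  static int attemptsMade(IOException failure, int failures, int maxAttempts) {
    int[] attempts = {0};
    try {
      callWithRetries(() -> {
        if (++attempts[0] <= failures) {
          throw failure;
        }
        return "ok";
      }, maxAttempts);
    } catch (IOException e) {
      // Expected when the failure is non-retryable or the retries run out.
    }
    return attempts[0];
  }

  public static void main(String[] args) {
    System.out.println(attemptsMade(new RetryableThrottlingException("throttled"), 2, 5));
    System.out.println(attemptsMade(new QuotaExceededException("disk quota"), 2, 5));
  }
}
```

A real client would normally back off between attempts; the sketch leaves that out to keep the inheritance point visible.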




[jira] [Commented] (HBASE-19924) hbase rpc throttling does not work for multi() with request count rater.

2018-02-13 Thread huaxiang sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363146#comment-16363146
 ] 

huaxiang sun commented on HBASE-19924:
--

I tested the fix and it worked as expected. The client code needs to be updated 
a bit to handle ThrottlingException so the client will retry. Expect a new 
patch, thanks.

> hbase rpc throttling does not work for multi() with request count rater.
> 
>
> Key: HBASE-19924
> URL: https://issues.apache.org/jira/browse/HBASE-19924
> Project: HBase
>  Issue Type: Bug
>  Components: rpc
>Affects Versions: 1.2.6, 2.0
>Reporter: huaxiang sun
>Assignee: huaxiang sun
>Priority: Major
> Attachments: HBASE-19924-master-v001.patch
>
>
> Basically, rpc throttling does not work with the request-count-based rater 
> for multi(). In the following code, when it calls the limiter's checkQuota(), 
> numWrites/numReads is lost: only the estimated sizes are passed on.
> {code:java}
> @Override
> public void checkQuota(int numWrites, int numReads, int numScans) throws 
> ThrottlingException {
>   writeConsumed = estimateConsume(OperationType.MUTATE, numWrites, 100);
>   readConsumed = estimateConsume(OperationType.GET, numReads, 100);
>   readConsumed += estimateConsume(OperationType.SCAN, numScans, 1000);
>   writeAvailable = Long.MAX_VALUE;
>   readAvailable = Long.MAX_VALUE;
>   for (final QuotaLimiter limiter : limiters) {
>     if (limiter.isBypass()) continue;
>     limiter.checkQuota(writeConsumed, readConsumed);
>     readAvailable = Math.min(readAvailable, limiter.getReadAvailable());
>     writeAvailable = Math.min(writeAvailable, limiter.getWriteAvailable());
>   }
>   for (final QuotaLimiter limiter : limiters) {
>     limiter.grabQuota(writeConsumed, readConsumed);
>   }
> }{code}
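
One way to picture the problem is a limiter that keeps separate budgets for request count and estimated size. The sketch below is hypothetical throughout (Limiter, ThrottledException and tryMulti are illustrative names, not the HBase quota API, and this is not the attached patch). It only shows that a count limit can be enforced when the number of operations is passed down alongside the size estimate, using the 100-byte per-mutation estimate from the code above:

```java
/** Stand-alone sketch; all types here are hypothetical, not the HBase quota classes. */
final class RequestCountQuotaSketch {

  /** Signals that a quota budget would be exceeded. */
  static final class ThrottledException extends Exception {
    ThrottledException(String message) { super(message); }
  }

  /** Limiter with separate budgets for write request count and write bytes. */
  static final class Limiter {
    private long writeRequestsLeft;
    private long writeBytesLeft;

    Limiter(long writeRequests, long writeBytes) {
      this.writeRequestsLeft = writeRequests;
      this.writeBytesLeft = writeBytes;
    }

    /** Checks both budgets; the request count arrives as its own argument. */
    void checkQuota(long writeRequests, long estimatedWriteBytes) throws ThrottledException {
      if (writeRequests > writeRequestsLeft) {
        throw new ThrottledException("write request count exceeded");
      }
      if (estimatedWriteBytes > writeBytesLeft) {
        throw new ThrottledException("write size exceeded");
      }
    }

    /** Consumes from both budgets after a successful check. */
    void grabQuota(long writeRequests, long estimatedWriteBytes) {
      writeRequestsLeft -= writeRequests;
      writeBytesLeft -= estimatedWriteBytes;
    }
  }

  /** Admits a multi() of numWrites mutations, or returns false if throttled. */
  static boolean tryMulti(Limiter limiter, int numWrites) {
    long estimatedBytes = numWrites * 100L; // per-mutation estimate, as in the quoted code
    try {
      limiter.checkQuota(numWrites, estimatedBytes);
    } catch (ThrottledException e) {
      return false;
    }
    limiter.grabQuota(numWrites, estimatedBytes);
    return true;
  }

  public static void main(String[] args) {
    Limiter limiter = new Limiter(5, 1_000_000L);
    System.out.println(tryMulti(limiter, 10));
    System.out.println(tryMulti(limiter, 3));
    System.out.println(tryMulti(limiter, 3));
  }
}
```

If only the estimated byte total reached the limiter, the first call in main would pass even though it carries ten requests against a budget of five.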




[jira] [Created] (HBASE-19993) Publish tests jar for hbase-zookeeper in bin tarball

2018-02-13 Thread Appy (JIRA)

Appy created HBASE-19993:


 Summary: Publish tests jar for hbase-zookeeper in bin tarball
 Key: HBASE-19993
 URL: https://issues.apache.org/jira/browse/HBASE-19993
 Project: HBase
  Issue Type: Bug
Reporter: Appy
Assignee: Appy


Since {{HBTU extends HBZKTU}} (such short forms! i know!), we need to publish 
hbase-zookeeper's tests jar too. Many IT tests use HBTU.




[jira] [Updated] (HBASE-19992) Hole in namespace table assign

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19992:
--
Description: 
If the assign fails before it comes up in a Master initialization, the table 
will have been created and may even be marked ENABLED successfully, but on 
restart, we don't assign the table.

Manifest is:

{code}
2018-02-13 11:45:24,504 ERROR [master/ve0524:16000] master.HMaster: Failed to 
become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl 
[FAILED] to be RUNNING, but the service has FAILED
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
  at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
  at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
to be assigned and enabled: ENABLED
  at 
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
  at 
org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
  ... 4 more
2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: Master 
server abort: loaded coprocessors are: 
[org.apache.hadoop.hbase.security.access.AccessController]
2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: * 
ABORTING master ve0524.halxg.cloudera.com,16000,1518550812400: Unhandled 
exception. Starting shutdown. *
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl 
[FAILED] to be RUNNING, but the service has FAILED
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
  at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
  at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
to be assigned and enabled: ENABLED
  at 
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
  at 
org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
  ... 4 more
{code}

Last thing in log before Master crash was:

2018-02-13 11:34:17,084 INFO  [master/ve0524:16000] hbase.MetaTableAccessor: 
Updated table hbase:namespace state to ENABLED in META

There is no one doing an assign subsequent to initial create table.


[jira] [Created] (HBASE-19992) Hole in namespace table assign

2018-02-13 Thread stack (JIRA)

stack created HBASE-19992:
-

 Summary: Hole in namespace table assign
 Key: HBASE-19992
 URL: https://issues.apache.org/jira/browse/HBASE-19992
 Project: HBase
  Issue Type: Bug
Reporter: stack
Assignee: stack


If the assign fails before it comes up in a Master initialization, the table 
will have been created and may even be marked ENABLED successfully, but on 
restart, we don't assign the table.

Manifest is:

{code}
2018-02-13 11:45:24,504 ERROR [master/ve0524:16000] master.HMaster: Failed to 
become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl 
[FAILED] to be RUNNING, but the service has FAILED
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
  at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
  at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
to be assigned and enabled: ENABLED
  at 
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
  at 
org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
  ... 4 more
2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: Master 
server abort: loaded coprocessors are: 
[org.apache.hadoop.hbase.security.access.AccessController]
2018-02-13 11:45:24,506 ERROR [master/ve0524:16000] master.HMaster: * 
ABORTING master ve0524.halxg.cloudera.com,16000,1518550812400: Unhandled 
exception. Starting shutdown. *
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl 
[FAILED] to be RUNNING, but the service has FAILED
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:345)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:291)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1052)
  at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:916)
  at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2026)
  at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:555)
  at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Timedout 30ms waiting for namespace table 
to be assigned and enabled: ENABLED
  at 
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
  at 
org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:62)
  at 
org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:226)
  at 
org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1050)
  ... 4 more
{code}

Last thing in log before Master crash was:

2018-02-13 11:34:17,084 INFO  [master/ve0524:16000] hbase.MetaTableAccessor: 
Updated table hbase:namespace state to ENABLED in META

There is no one doing an assign.




[jira] [Updated] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Ben Lau (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-19989:

Attachment: (was: HBASE-19989.patch)

> READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
> --
>
> Key: HBASE-19989
> URL: https://issues.apache.org/jira/browse/HBASE-19989
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.4.1
>Reporter: Ben Lau
>Assignee: Ben Lau
>Priority: Major
> Attachments: HBASE-19989.patch
>
>
> Region state transitions do not work correctly for READY_TO_MERGE/SPLIT.  
> [~thiruvel] and I noticed this is due to break statements being in the wrong 
> place in AssignmentManager.  This allows a race condition for example in 
> which one of the regions being merged could be moved concurrently, resulting 
> in the merge transaction failing and then double assignment and/or dataloss.  
> This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not 
> branch-2 as the relevant code in AM has since been rewritten.




[jira] [Commented] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Ben Lau (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363081#comment-16363081
 ] 

Ben Lau commented on HBASE-19989:
-

Hi Ted, thanks for the feedback, I'm not sure a comment will be helpful since 
it comes down to 'if the break is here the code below doesn't run, so the break 
is not here' but I have added a comment anyway and re-added the ZKLess 
split/merge tests that were removed in branch-1.  Let me know your thoughts, 
thanks.

> READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
> --
>
> Key: HBASE-19989
> URL: https://issues.apache.org/jira/browse/HBASE-19989
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.4.1
>Reporter: Ben Lau
>Assignee: Ben Lau
>Priority: Major
> Attachments: HBASE-19989.patch, HBASE-19989.patch
>
>
> Region state transitions do not work correctly for READY_TO_MERGE/SPLIT.  
> [~thiruvel] and I noticed this is due to break statements being in the wrong 
> place in AssignmentManager.  This allows a race condition for example in 
> which one of the regions being merged could be moved concurrently, resulting 
> in the merge transaction failing and then double assignment and/or dataloss.  
> This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not 
> branch-2 as the relevant code in AM has since been rewritten.




[jira] [Updated] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Ben Lau (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Lau updated HBASE-19989:

Attachment: HBASE-19989.patch

> READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
> --
>
> Key: HBASE-19989
> URL: https://issues.apache.org/jira/browse/HBASE-19989
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.4.1
>Reporter: Ben Lau
>Assignee: Ben Lau
>Priority: Major
> Attachments: HBASE-19989.patch, HBASE-19989.patch
>
>
> Region state transitions do not work correctly for READY_TO_MERGE/SPLIT.  
> [~thiruvel] and I noticed this is due to break statements being in the wrong 
> place in AssignmentManager.  This allows a race condition for example in 
> which one of the regions being merged could be moved concurrently, resulting 
> in the merge transaction failing and then double assignment and/or dataloss.  
> This bug appears to only affect branch-1 (for example 1.3 and 1.4) and not 
> branch-2 as the relevant code in AM has since been rewritten.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363052#comment-16363052
 ] 

stack commented on HBASE-19988:
---

Retry

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Updated] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19988:
--
Attachment: hbase-19988.master.001.patch

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363051#comment-16363051
 ] 

Umesh Agashe commented on HBASE-19988:
--

Thanks [~stack]! Let's wait for what [~appy] has to say on this.

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch, 
> hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363048#comment-16363048
 ] 

stack commented on HBASE-19988:
---

Thanks for the explanation. +1 on patch then.

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.




[jira] [Commented] (HBASE-19876) The exception happening in converting pb mutation to hbase.mutation messes up the CellScanner

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363039#comment-16363039
 ] 

Hudson commented on HBASE-19876:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4579 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4579/])
HBASE-19876 The exception happening in converting pb mutation to (chia7712: rev 
2f48fdbb26ff555485b4aa3393d835b7dd8797a0)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMalformedCellFromClient.java
* (edit) 
hbase-client/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/RequestConverter.java


> The exception happening in converting pb mutation to hbase.mutation messes up 
> the CellScanner
> -
>
> Key: HBASE-19876
> URL: https://issues.apache.org/jira/browse/HBASE-19876
> Project: HBase
>  Issue Type: Bug
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 1.3.2, 1.5.0, 1.2.7, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19876.branch-1.2.v0.patch, 
> HBASE-19876.master.001.patch, HBASE-19876.v0.patch, HBASE-19876.v1.patch, 
> HBASE-19876.v2.patch, HBASE-19876.v3.patch, HBASE-19876.v3.patch, 
> HBASE-19876.v3.patch, HBASE-19876.v3.patch, HBASE-19876.v4.patch, 
> HBASE-19876.v5.patch, HBASE-19876.v6.patch
>
>
> {code:java}
> 2018-01-27 22:51:43,794 INFO  [hconnection-0x3291b443-shared-pool11-t6] 
> client.AsyncRequestFutureImpl(778): id=5, table=testQuotaStatusFromMaster3, 
> attempt=6/16 failed=20ops, last 
> exception=org.apache.hadoop.hbase.client.WrongRowIOException: 
> org.apache.hadoop.hbase.client.WrongRowIOException: The row in xxx doesn't 
> match the original one aaa
>   at org.apache.hadoop.hbase.client.Mutation.add(Mutation.java:776)
>   at org.apache.hadoop.hbase.client.Put.add(Put.java:282)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toPut(ProtobufUtil.java:642)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:952)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:896)
>   at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2591)
>   at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41560)
>   at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:404)
>   at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
>   at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304){code}
> I noticed this bug when testing the table space quota.
> When the RS converts a pb mutation to an hbase.mutation, a quota exception or 
> cell exception may be thrown.
> {code}
> for (ClientProtos.Action action: mutations) {
>   MutationProto m = action.getMutation();
>   Mutation mutation;
>   if (m.getMutateType() == MutationType.PUT) {
>     mutation = ProtobufUtil.toPut(m, cells);
>     batchContainsPuts = true;
>   } else {
>     mutation = ProtobufUtil.toDelete(m, cells);
>     batchContainsDelete = true;
>   }
>   mutationActionMap.put(mutation, action);
>   mArray[i++] = mutation;
>   checkCellSizeLimit(region, mutation);
>   // Check if a space quota disallows this mutation
>   spaceQuotaEnforcement.getPolicyEnforcement(region).check(mutation);
>   quota.addMutation(mutation);
> }
> {code}
> The RS catches the exception, but it doesn't make the CellScanner skip the 
> failed cells.
> {code:java}
> } catch (IOException ie) {
>   if (atomic) {
>     throw ie;
>   }
>   for (Action mutation : mutations) {
>     builder.addResultOrException(getResultOrException(ie, mutation.getIndex()));
>   }
> }
> {code}
> The bug causes a WrongRowIOException for the remaining mutations, since they 
> then refer to invalid cells.
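
The failure mode described above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the HBASE-19876 patch: SkipCellsSketch, Header and convertAll are hypothetical names, cells are modelled as "row:value" strings, and a plain Iterator stands in for the shared CellScanner. It shows that when a conversion fails part-way through a mutation, the remaining cells of that mutation must be consumed, otherwise the next mutation reads cells that belong to the failed one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy model of mutations that share one cell stream; all names are hypothetical. */
final class SkipCellsSketch {

  /** Hypothetical mutation header: the target row and how many cells belong to it. */
  static final class Header {
    final String row;
    final int cellCount;

    Header(String row, int cellCount) {
      this.row = row;
      this.cellCount = cellCount;
    }
  }

  /** Returns "ok" or an error message for each mutation, in order. */
  static List<String> convertAll(List<Header> headers, Iterator<String> cells,
      boolean skipOnFailure) {
    List<String> results = new ArrayList<>();
    for (Header h : headers) {
      int consumed = 0;
      String error = null;
      while (consumed < h.cellCount) {
        String cell = cells.next();
        consumed++;
        if (!cell.startsWith(h.row + ":")) {
          // Conversion stops here, leaving this mutation's other cells unread.
          error = "wrong row: cell " + cell + " does not match " + h.row;
          break;
        }
      }
      if (error != null && skipOnFailure) {
        // Advance the shared stream past the rest of the failed mutation.
        while (consumed < h.cellCount) {
          cells.next();
          consumed++;
        }
      }
      results.add(error == null ? "ok" : error);
    }
    return results;
  }

  public static void main(String[] args) {
    List<Header> headers = Arrays.asList(new Header("a", 3), new Header("b", 1));
    List<String> cells = Arrays.asList("a:1", "zzz:2", "a:3", "b:1");
    System.out.println(convertAll(headers, cells.iterator(), false));
    System.out.println(convertAll(headers, cells.iterator(), true));
  }
}
```

Without the skip, the second mutation picks up the leftover cell of the first one and fails as well; with the skip, only the mutation that actually carried the bad cell fails.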




[jira] [Commented] (HBASE-19844) Shell should support flush by regionserver

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363041#comment-16363041
 ] 

Hudson commented on HBASE-19844:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4579 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4579/])
HBASE-19844 Shell should support to flush by regionserver (tedyu: rev 
8e8e1e5a1bbb240a6f4a71bc8b0271d31da633b3)
* (edit) hbase-shell/src/main/ruby/shell/commands/flush.rb
* (edit) hbase-shell/src/test/ruby/hbase/admin_test.rb
* (edit) hbase-shell/src/main/ruby/hbase/admin.rb


> Shell should support flush by regionserver
> --
>
> Key: HBASE-19844
> URL: https://issues.apache.org/jira/browse/HBASE-19844
> Project: HBase
>  Issue Type: New Feature
>  Components: shell
>Reporter: Chia-Ping Tsai
>Assignee: Reid Chan
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19844.master.001.patch, 
> HBASE-19844.master.002.patch, HBASE-19844.master.003.patch, 
> HBASE-19844.master.004.patch
>
>
> HBASE-4224 added a method to Admin that can flush by regionserver. As 
> with other Admin methods, we should enable the shell to use the flush method.




[jira] [Commented] (HBASE-19970) Remove unused functions from TableAuthManager

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363040#comment-16363040
 ] 

Hudson commented on HBASE-19970:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4579 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4579/])
Revert "HBASE-19970 Remove unused functions from TableAuthManager." (stack: rev 
ba402b1e7b446144d4d20f90cb71e6aa19ecce3c)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/TableAuthManager.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestZKPermissionWatcher.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestTablePermissions.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessControlLists.java


> Remove unused functions from TableAuthManager
> -
>
> Key: HBASE-19970
> URL: https://issues.apache.org/jira/browse/HBASE-19970
> Project: HBase
>  Issue Type: Task
>Reporter: Appy
>Assignee: Appy
>Priority: Minor
> Fix For: 1.5.0, 2.0.0-beta-2
>
> Attachments: HBASE-19970.master.001.patch
>
>
> Functions deleted in TableAuthManager:
> - setTableUserPermissions
> - setTableGroupPermissions
> - setNamespaceUserPermissions
> - setNamespaceGroupPermissions
> - writeTableToZooKeeper
> - writeNamespaceToZooKeeper
> To make sure it was not a bug, and that relevant functionality moved to some 
> alternate code path, I tried to find out why and when these functions went out 
> of use. But I just couldn't figure it out... until I reached the patch which added 
> them. Looks like they were dead functions to start with :)
> Jira which added them: HBASE-8409. Commit id: 
> ac10b3c13d6b66e12d0c9601204b01dfa525ed19




[jira] [Commented] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363042#comment-16363042
 ] 

Hudson commented on HBASE-19979:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #4579 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/4579/])
HBASE-19979 ReplicationSyncUp tool may leak Zookeeper connection (stack: rev 
39e191e5598529c68007c96e69acdd923a294d33)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSyncUp.java


> ReplicationSyncUp tool may leak Zookeeper connection
> 
>
> Key: HBASE-19979
> URL: https://issues.apache.org/jira/browse/HBASE-19979
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19979-branch-1.001.patch, HBASE-19979.patch
>
>
> ReplicationSyncUp tool may leak Zookeeper connection in the following code 
> snippet,
> {code}
> try {
>   int numberOfOldSource = 1; // default wait once
>   while (numberOfOldSource > 0) {
>     Thread.sleep(SLEEP_TIME);
>     numberOfOldSource = manager.getOldSources().size();
>   }
> } catch (InterruptedException e) {
>   System.err.println("didn't wait long enough:" + e);
>   return (-1);
> }
> manager.join();
> zkw.close();
> {code}
> ZooKeeperWatcher will not be closed in case of InterruptedException.
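
One common way to avoid this kind of leak is to move the close into a finally block, so it runs on the early-return path too. The sketch below is an illustration only and is not taken from the attached patch: FakeWatcher and Waiter are hypothetical stand-ins for ZooKeeperWatcher and the wait loop, so that the example compiles without HBase on the classpath.

```java
/** Toy model of the early-return leak; all names here are hypothetical. */
final class SyncUpCloseSketch {

  /** Stand-in for ZooKeeperWatcher that only records whether close() ran. */
  static final class FakeWatcher implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Stand-in for the "sleep until old sources drain" loop. */
  interface Waiter {
    void await() throws InterruptedException;
  }

  /** Returns 0 on success and -1 when interrupted; closes the watcher either way. */
  static int run(FakeWatcher zkw, Waiter waiter) {
    try {
      waiter.await();
      return 0;
    } catch (InterruptedException e) {
      System.err.println("didn't wait long enough:" + e);
      return -1;
    } finally {
      // Runs on both the normal path and the interrupted path.
      zkw.close();
    }
  }

  public static void main(String[] args) {
    FakeWatcher zkw = new FakeWatcher();
    int rc = run(zkw, () -> { throw new InterruptedException("simulated"); });
    System.out.println("rc=" + rc + " closed=" + zkw.closed);
  }
}
```

Since the stand-in implements AutoCloseable, a try-with-resources statement would give the same guarantee; the explicit finally is used here because it stays closest to the shape of the quoted snippet.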




[jira] [Commented] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363010#comment-16363010
 ] 

Hadoop QA commented on HBASE-19116:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
37s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 8s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
7s{color} | {color:red} hbase-server: The patch generated 4 new + 18 unchanged 
- 2 fixed = 22 total (was 20) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
10s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
14m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}103m 
11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:9f2f2db |
| JIRA Issue | HBASE-19116 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12910426/HBASE-19116.branch-2.001.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux be4b69301fc8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 4594f7156d |
| maven | version: Apache Maven 3.5.2 
(138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11511/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11511/testReport/ |
| Max. process+thread count | 4974 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/11511/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |



[jira] [Commented] (HBASE-19988) HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while waiting for a row lock

2018-02-13 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362989#comment-16362989
 ] 

Umesh Agashe commented on HBASE-19988:
--

It was logging the following exception... several times!
{code:java}
2018-02-10 04:24:25,503 WARN [PutThread] regionserver.HRegion(5636): Thread 
interrupted waiting for lock on row: row0
2018-02-10 04:24:25,503 WARN [PutThread] 
regionserver.HRegion$BatchOperation(3173): Failed getting lock, row=row0
java.io.InterruptedIOException
at 
org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5637)
at 
org.apache.hadoop.hbase.regionserver.HRegion$BatchOperation.lockRowsAndBuildMiniBatch(HRegion.java:3168)
at 
org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3837)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3810)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3741)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3732)
at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3746)
at org.apache.hadoop.hbase.regionserver.HRegion.doBatchMutate(HRegion.java:4074)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2925)
at 
org.apache.hadoop.hbase.regionserver.TestHRegion$PutThread.run(TestHRegion.java:3891)
Caused by: java.lang.InterruptedException
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:871)
at 
org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5621)
... 9 more{code}
 

There is a loop in the write batch path:
{code:java}
while (!batchOp.isDone()) {
  doMiniBatchMutate(batchOp);
}{code}
 

This loop tries to acquire locks on as many rows in the batch as possible and 
creates a mini-batch of those rows to write. On the next iteration, locks are 
acquired starting from the row for which the previous iteration failed to 
acquire a lock, and so on until the entire batch is written.

The operation was aborted only on a timeout exception. All other exceptions 
were logged and ignored, and the loop resumed creating and writing 
mini-batches for the input batch.

In this particular case, getRowLockInternal() failed with an 
InterruptedIOException caused by surefire (possibly due to a test timeout). The 
exception was ignored and the write proceeded with the rows locked so far. 
This caused continuous calls to doMiniBatchMutate() in a loop, filling up the 
logs.
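
The mini-batch loop and the effect of treating an interrupt as fatal can be sketched with a toy model. This is not the HRegion code or the attached patch: MiniBatchLoopSketch, RowLocker and writeBatch are hypothetical names, and the locker simply returns false when a row is busy instead of blocking the way a real row lock would.

```java
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a mini-batch write loop; all names here are hypothetical. */
final class MiniBatchLoopSketch {

  /** Stand-in for row locking: false means the row is busy right now. */
  interface RowLocker {
    boolean tryLock(String row) throws InterruptedIOException;
  }

  /**
   * Writes rows in mini-batches and returns the number of loop iterations.
   * An InterruptedIOException propagates and ends the whole batch, instead of
   * being logged and retried forever. A real implementation would block or
   * time out on a busy row rather than retry immediately as this toy does.
   */
  static int writeBatch(List<String> rows, RowLocker locker, List<String> written)
      throws InterruptedIOException {
    int next = 0;
    int iterations = 0;
    while (next < rows.size()) {
      iterations++;
      int end = next;
      // Lock as many consecutive rows as possible, starting at the first unwritten row.
      while (end < rows.size() && locker.tryLock(rows.get(end))) {
        end++;
      }
      // "Write" the mini-batch, then resume from the row that could not be locked.
      written.addAll(rows.subList(next, end));
      next = end;
    }
    return iterations;
  }

  public static void main(String[] args) throws InterruptedIOException {
    List<String> written = new ArrayList<>();
    int iterations = writeBatch(Arrays.asList("a", "b", "c"), row -> true, written);
    System.out.println(iterations + " iteration(s), wrote " + written);
  }
}
```

The design point is that the interrupt is allowed to escape the loop: a caller that catches it and continues with the next iteration would recreate the chatty behaviour described above.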

> HRegion#lockRowsAndBuildMiniBatch() is too chatty when interrupted while 
> waiting for a row lock
> ---
>
> Key: HBASE-19988
> URL: https://issues.apache.org/jira/browse/HBASE-19988
> Project: HBase
>  Issue Type: Improvement
>  Components: amv2
>Affects Versions: 2.0.0-beta-1
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
>Priority: Minor
> Fix For: 2.0.0-beta-2
>
> Attachments: hbase-19988.master.001.patch
>
>
> See HBASE-19970, TestHRegionWithInMemoryFlush created 4.2g log file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (HBASE-19767) Master web UI shows negative values for Remaining KVs

2018-02-13 Thread Umesh Agashe (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Umesh Agashe reassigned HBASE-19767:


Assignee: Umesh Agashe

> Master web UI shows negative values for Remaining KVs
> -
>
> Key: HBASE-19767
> URL: https://issues.apache.org/jira/browse/HBASE-19767
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-4
>Reporter: Jean-Marc Spaggiari
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: Screen Shot 2018-01-12 at 12.18.41 PM.png
>
>
> In the Master Web UI, under the compaction tab, the Remaining KVs sometimes 
> shows negative values.




[jira] [Commented] (HBASE-19767) Master web UI shows negative values for Remaining KVs

2018-02-13 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362932#comment-16362932
 ] 

Umesh Agashe commented on HBASE-19767:
--

[~stack], I will pick this up.

> Master web UI shows negative values for Remaining KVs
> -
>
> Key: HBASE-19767
> URL: https://issues.apache.org/jira/browse/HBASE-19767
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-4
>Reporter: Jean-Marc Spaggiari
>Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: Screen Shot 2018-01-12 at 12.18.41 PM.png
>
>
> In the Master Web UI, under the compaction tab, the Remaining KVs sometimes 
> shows negative values.




[jira] [Commented] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362929#comment-16362929
 ] 

stack commented on HBASE-19166:
---

Regarding the description (a WALPlayer from hbase1 trying to read an hbase2 
WAL): just use an hbase2 WALPlayer to do the job.

As for an hbase1 splitting hbase2 logs and failing as per the above, that 
might be OK; it just means we need to add more hbase2-type RegionServers to 
the cluster that can split the logs. We need to plan the rolling upgrade; that 
will tell us whether we need this facility or not. Meantime, moving this out 
of beta-2.

> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2




[jira] [Updated] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19166:
--
Fix Version/s: (was: 2.0.0-beta-2)
   2.0.0


[jira] [Commented] (HBASE-19166) Add translation for handling hbase.regionserver.wal.WALEdit

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362897#comment-16362897
 ] 

stack commented on HBASE-19166:
---

hbase1 complaint is now:

{code}
2018-02-13 10:43:57,589 DEBUG [RS_LOG_REPLAY_OPS-ve0530:16020-0] wal.WALSplitter: Finishing writing output logs and closing down.
2018-02-13 10:43:57,589 INFO  [RS_LOG_REPLAY_OPS-ve0530:16020-0] wal.WALSplitter: Processed 0 edits across 0 regions; edits skipped=0; log file=hdfs://ve0524.halxg.cloudera.com:8020/hbase/WALs/ve0534.halxg.cloudera.com,16020,1518546984742-splitting/ve0534.halxg.cloudera.com%2C16020%2C1518546984742.meta.1518546993545.meta, length=23982, corrupted=false, progress failed=false
2018-02-13 10:43:57,590 WARN  [RS_LOG_REPLAY_OPS-ve0530:16020-0] regionserver.SplitLogWorker: log splitting of WALs/ve0534.halxg.cloudera.com,16020,1518546984742-splitting/ve0534.halxg.cloudera.com%2C16020%2C1518546984742.meta.1518546993545.meta failed, returning error
java.io.IOException: Got unknown writer class: AsyncProtobufLogWriter
  at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initInternal(ProtobufLogReader.java:220)
  at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initReader(ProtobufLogReader.java:169)
  at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:66)
  at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:164)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
  at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:267)
  at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:853)
  at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:777)
  at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:298)
  at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:236)
  at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
  at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
{code}



> Add translation for handling hbase.regionserver.wal.WALEdit
> ---
>
> Key: HBASE-19166
> URL: https://issues.apache.org/jira/browse/HBASE-19166
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Reporter: Ted Yu
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-2
>
>
> For hlog generated by 1.x, using WALPlayer from hbase2 would result in:
> {code}
> 2017-11-02 21:22:40,907 INFO  [main] mapreduce.Job: Task Id : 
> attempt_1509641483571_0003_m_00_0, Status : FAILED
> Error: java.lang.ClassCastException: 
> org.apache.hadoop.hbase.regionserver.wal.WALEdit cannot be cast to 
> org.apache.hadoop.hbase.wal.WALEdit
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALCellMapper.map(WALPlayer.java:143)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
> {code}
> HBASE-16479 relocated WALEdit.
> Chatting with Enis, he mentioned adding translation for handling 
> hbase.regionserver.wal.WALEdit
> This way, WAL from 1.x can be recognized by hbase-2




[jira] [Commented] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362896#comment-16362896
 ] 

Ted Yu commented on HBASE-19989:


In the next patch, please add a comment in place of the previous break, 
explaining why the break is absent.

> READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly
> --
>
> Key: HBASE-19989
> URL: https://issues.apache.org/jira/browse/HBASE-19989
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1, 1.4.1
>Reporter: Ben Lau
>Assignee: Ben Lau
>Priority: Major
> Attachments: HBASE-19989.patch
>
>
> Region state transitions do not work correctly for READY_TO_MERGE/SPLIT.  
> [~thiruvel] and I noticed this is due to break statements being in the wrong 
> place in AssignmentManager.  This allows a race condition in which, for 
> example, one of the regions being merged could be moved concurrently, 
> resulting in the merge transaction failing and then double assignment and/or 
> data loss. This bug appears to affect only branch-1 (for example 1.3 and 1.4) 
> and not branch-2, as the relevant code in AM has since been rewritten.
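
For readers unfamiliar with how a misplaced `break` can skip a state update, the following is a small, hypothetical sketch of intentional switch fall-through. It is not the AssignmentManager code or the attached patch: the enum, the `handle` method, and the step names (`checkPreconditions`, `updateRegionState`) are assumptions chosen only to show the pattern and the kind of comment a reviewer would want where a `break` is deliberately absent.

```java
// Hypothetical sketch of intentional switch fall-through; not HBase code.
class FallThroughSketch {

  enum Transition { READY_TO_SPLIT, SPLIT_PONR, SPLIT, OTHER }

  /** Returns a trace of the steps taken for the given transition code. */
  static String handle(Transition code) {
    StringBuilder steps = new StringBuilder();
    switch (code) {
      case READY_TO_SPLIT:
        steps.append("checkPreconditions;");
        // No break here on purpose: READY_TO_SPLIT must also run the shared
        // state update below. A break at this point would skip it and leave
        // the region state stale.
      case SPLIT_PONR:
      case SPLIT:
        steps.append("updateRegionState;");
        break;
      default:
        steps.append("ignored;");
        break;
    }
    return steps.toString();
  }

  public static void main(String[] args) {
    System.out.println(handle(Transition.READY_TO_SPLIT));
    System.out.println(handle(Transition.OTHER));
  }
}
```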




[jira] [Commented] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362886#comment-16362886
 ] 

Ted Yu commented on HBASE-19989:


Thanks for the update.

Happy New Year, Francis and Ben.


[jira] [Commented] (HBASE-19989) READY_TO_MERGE and READY_TO_SPLIT do not update region state correctly

2018-02-13 Thread Francis Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362878#comment-16362878
 ] 

Francis Liu commented on HBASE-19989:
-

[~yuzhih...@gmail.com] This is a bug in zkless assignment; there used to be 
tests, but they were removed. We'll include the zkless split tests in this 
patch. We've already been running the tests and this patch in prod. We'll work 
on adding back the rest of the zkless tests as part of HBASE-14626.


[jira] [Updated] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19116:
--
Status: Patch Available  (was: Open)

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19116.branch-2.001.patch
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail 
> unless we really have to write one (and it sounds like we could do without 
> it), to see if we can avoid requiring operators to go to the latest branch-1 
> (we may end up needing this, but let's have a really good reason for it if 
> we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when rolling upgrade done to 2.0, first users have to upgrade to 
> some 1.x versions which is having this fix and then to 2.0.. What do you guys 
> think Whether we should avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, lets try to avoid this if we can...




[jira] [Updated] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19116:
--
Attachment: HBASE-19116.branch-2.001.patch


[jira] [Created] (HBASE-19991) lots of hbase-rest test failures against hadoop 3

2018-02-13 Thread Mike Drob (JIRA)

Mike Drob created HBASE-19991:
-

 Summary: lots of hbase-rest test failures against hadoop 3
 Key: HBASE-19991
 URL: https://issues.apache.org/jira/browse/HBASE-19991
 Project: HBase
  Issue Type: Bug
  Components: REST, test
Reporter: Mike Drob
Assignee: Mike Drob
 Fix For: 2.0.0


mvn clean test -pl hbase-rest -Dhadoop.profile=3.0

[ERROR] Tests run: 106, Failures: 95, Errors: 8, Skipped: 1




[jira] [Commented] (HBASE-19930) fix ImmutableMemStoreLAB#forceCopyOfBigCellInto

2018-02-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362743#comment-16362743
 ] 

Hadoop QA commented on HBASE-19930:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  6m  2s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  5s{color} | {color:red} hbase-server: The patch generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m  1s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 20m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}106m 36s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 56s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19930 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910401/HBASE-19930-V05.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 8ed3c1587fd0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / ba402b1e7b |
| maven | version: Apache Maven 3.5.2 (138edd61fd100ec658bfa2d307c43b76940a5d7d; 2017-10-18T07:58:13Z) |
| Default Java | 1.8.0_151 |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/11510/artifact/patchprocess/diff-checkstyle-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/11510/testReport/ |
| Max. process+thread count | 4974 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/11510/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |



[jira] [Updated] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19979:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Nice one [~pankaj2461] Good find.  Pushed to branch-2 and master. Looks like 
[~yuzhih...@gmail.com] pushed to branch-1.4 and branch-1 (Again, please use 
--author param so you can accredit the patch properly Ted Yu).

> ReplicationSyncUp tool may leak Zookeeper connection
> 
>
> Key: HBASE-19979
> URL: https://issues.apache.org/jira/browse/HBASE-19979
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Major
> Fix For: 1.5.0, 2.0.0-beta-2, 1.4.2
>
> Attachments: HBASE-19979-branch-1.001.patch, HBASE-19979.patch
>
>
> ReplicationSyncUp tool may leak Zookeeper connection in the following code 
> snippet,
> {code}
> try {
>   int numberOfOldSource = 1; // default wait once
>   while (numberOfOldSource > 0) {
> Thread.sleep(SLEEP_TIME);
> numberOfOldSource = manager.getOldSources().size();
>   }
> } catch (InterruptedException e) {
>   System.err.println("didn't wait long enough:" + e);
>   return (-1);
> }
> manager.join();
> zkw.close();
> {code}
> ZooKeeperWatcher will not be closed in case of InterruptedException.
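
One way to rule out this kind of leak is to release the watcher with try-with-resources (or a `finally` block), so that the early `return` on interruption still closes it. The sketch below is self-contained and hedged: `FakeWatcher` is a hypothetical stand-in for `ZooKeeperWatcher`, `waitAndClose` is an invented method, and the countdown replaces the real `manager.getOldSources().size()` check. It is not the committed patch.

```java
// Hypothetical stand-in for ZooKeeperWatcher; it only records whether close() ran.
class FakeWatcher implements AutoCloseable {
  boolean closed = false;

  @Override
  public void close() {
    closed = true;
  }
}

class SyncUpCloseSketch {

  /**
   * Waits until pendingSources reaches zero, sleeping between checks.
   * Returns 0 on success and -1 if interrupted; the watcher is closed on
   * both paths.
   */
  static int waitAndClose(FakeWatcher zkw, int pendingSources, long sleepMillis) {
    try (FakeWatcher w = zkw) { // closed on every exit path, including the catch
      int remaining = pendingSources;
      while (remaining > 0) {
        Thread.sleep(sleepMillis);
        remaining--; // the real tool would re-read the number of old sources
      }
      return 0;
    } catch (InterruptedException e) {
      System.err.println("didn't wait long enough: " + e);
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return -1;
    }
  }

  public static void main(String[] args) {
    FakeWatcher zkw = new FakeWatcher();
    Thread.currentThread().interrupt(); // force the InterruptedException path
    int rc = waitAndClose(zkw, 1, 10L);
    Thread.interrupted(); // clear the flag again
    System.out.println("rc=" + rc + " closed=" + zkw.closed);
  }
}
```

With try-with-resources the resource is closed before the `catch` block runs, so the error path cannot return while the connection is still open.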




[jira] [Updated] (HBASE-19979) ReplicationSyncUp tool may leak Zookeeper connection

2018-02-13 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19979:
--
Fix Version/s: 1.4.2


[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-13 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362622#comment-16362622
 ] 

Josh Elser commented on HBASE-19953:


Let me take a look at that. Thanks for the pointer, sir.

> Avoid calling post* hook when procedure fails
> -
>
> Key: HBASE-19953
> URL: https://issues.apache.org/jira/browse/HBASE-19953
> Project: HBase
>  Issue Type: Bug
>  Components: master, proc-v2
>Reporter: Ramesh Mani
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 2.0.0-beta-2
>
>
> Ramesh pointed out a case where I think we're mishandling some post\* 
> MasterObserver hooks. Specifically, I'm looking at the deleteNamespace.
> We synchronously execute the DeleteNamespace procedure. When the user 
> provides a namespace that isn't empty, the procedure does a rollback (which 
> is just a no-op), but this doesn't propagate an exception up to the 
> NonceProcedureRunnable in {{HMaster#deleteNamespace}}. It took Ramesh 
> pointing it out a bit better to me that the code executes a bit differently 
> than we actually expect.
> I think we need to double-check our post hooks and make sure we aren't 
> invoking them when the procedure actually failed. cc/ [~Apache9], [~stack].
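
The check being asked for can be sketched in a few lines. This is a hypothetical, self-contained illustration, not HMaster code: `PostHookSketch`, `ProcedureResult`, and the recorded hook names are assumptions. It models a procedure whose failure is carried in its result rather than thrown, and shows the caller surfacing that failure before the post hook runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these types are HBase classes.
class PostHookSketch {

  /** Hypothetical procedure result: a failure is carried as a field, not thrown. */
  static final class ProcedureResult {
    final Exception failure; // null when the procedure succeeded

    ProcedureResult(Exception failure) {
      this.failure = failure;
    }

    boolean isFailed() {
      return failure != null;
    }
  }

  /** Records which hooks ran, in order. */
  final List<String> hookCalls = new ArrayList<>();

  /** Runs the pre hook, inspects the result, and calls the post hook only on success. */
  void deleteNamespace(String name, ProcedureResult result) throws Exception {
    hookCalls.add("preDeleteNamespace:" + name);
    if (result.isFailed()) {
      // Surface the rolled-back procedure's failure instead of falling
      // through to the post hook.
      throw result.failure;
    }
    hookCalls.add("postDeleteNamespace:" + name);
  }

  public static void main(String[] args) {
    PostHookSketch master = new PostHookSketch();
    ProcedureResult failed =
        new ProcedureResult(new IllegalStateException("namespace not empty"));
    try {
      master.deleteNamespace("ns1", failed);
    } catch (Exception e) {
      System.out.println("failed: " + e.getMessage());
    }
    System.out.println(master.hookCalls);
  }
}
```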




[jira] [Commented] (HBASE-19953) Avoid calling post* hook when procedure fails

2018-02-13 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362607#comment-16362607
 ] 

stack commented on HBASE-19953:
---

Looking at related Master-side Operations, I see them take a latch in the 
NonceProcedureRunnable implementation. When latch is thrown, they call the post 
op. See enableTable, createTable, etc. This delete namespace should do similar? 
Later we should come back and get rid of all these latches (and then we'll have 
to figure how Observer can monitor Procedure).

