[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21388:
---
Attachment: HBASE-21388.master.002.patch

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21388.master.001.patch, 
> HBASE-21388.master.002.patch
>
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-10-28 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1744#comment-1744
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #32 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like; see RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup. Replication has the use-case for "tail"'ing the WAL, which we 
> should provide via our new API. Backup doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}}) should also be looked at to use WAL-APIs only.
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-28 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21395:
---
Attachment: HBASE-21395.branch-2.0.004.patch

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch, 
> HBASE-21395.branch-2.0.002.patch, HBASE-21395.branch-2.0.003.patch, 
> HBASE-21395.branch-2.0.004.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) run at the same time, race conditions 
> between the two kinds of procedures cause serious problems, e.g. the 
> split/merged parent is brought online by the table procedure, or the 
> split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today: this kind of problem is solved in 
> branch-2+, since there is a fence so that only one RTSP can run against a 
> single region at a time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause.
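
The fence described in the issue can be pictured roughly as follows. This is only a sketch under stated assumptions: `TableProcedureFence` and all of its methods are hypothetical names invented for illustration, not HBase classes, and the attached patch may implement the check quite differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration only; not the HBASE-21395 patch. */
final class TableProcedureFence {
  // Table name -> count of table-level procedures currently running.
  private final Map<String, Integer> running = new ConcurrentHashMap<>();

  void tableProcedureStarted(String table) {
    running.merge(table, 1, Integer::sum);
  }

  void tableProcedureFinished(String table) {
    // Returning null from the remapping function removes the entry.
    running.computeIfPresent(table, (t, n) -> n > 1 ? n - 1 : null);
  }

  /** Returns false when a split/merge on this table should abort up front. */
  boolean maySplitOrMerge(String table) {
    return !running.containsKey(table);
  }
}
```

A split/merge procedure would call `maySplitOrMerge` once at the start of its execution and abort if it returns false. The sketch does not close the window between that check and a table procedure starting immediately afterwards; it only shows the "abort early" idea.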





[jira] [Comment Edited] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1724#comment-1724
 ] 

Jingyun Tian edited comment on HBASE-19121 at 10/29/18 4:05 AM:


Sounds like we need to get regions of all problematic states for all tables to 
get a full list? I think adding a tab to the navigation bar that dumps the RIT 
as a table, viewable as txt, could be easier to use?
Maybe I can make a demo and then we can compare which is better.


was (Author: tianjingyun):
Sounds like we need to get regions of all problematic states for all tables to 
get a full list? I think adding a tab to the navigation bar that dumps the RIT 
as a table, viewable as txt, could be easier to use?

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1724#comment-1724
 ] 

Jingyun Tian commented on HBASE-19121:
--

Sounds like we need to get regions of all problematic states for all tables to 
get a full list? I think adding a tab to the navigation bar that dumps the RIT 
as a table, viewable as txt, could be easier to use?

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1712#comment-1712
 ] 

Guanghao Zhang commented on HBASE-21388:


The MemStoreLAB and ChunkCreator are global per process. The minicluster starts 
the master and RS in the same process, so the UT will fail as the RS will create 
a new MemStoreLAB. Will find a way to resolve this later.

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21388.master.001.patch
>
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  





[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21388:
---
Attachment: HBASE-21388.master.001.patch

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21388.master.001.patch
>
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  





[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21388:
---
Status: Patch Available  (was: Open)

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21388.master.001.patch
>
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  





[jira] [Assigned] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reassigned HBASE-21388:
--

Assignee: Guanghao Zhang

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21388.master.001.patch
>
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  





[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21388:
---
Summary: No need to instantiate MemStoreLAB for master which not carry 
table  (was: No need to instantiate MemStore for master which not carry table)

> No need to instantiate MemStoreLAB for master which not carry table
> ---
>
> Key: HBASE-21388
> URL: https://issues.apache.org/jira/browse/HBASE-21388
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Priority: Major
>
> We found this log in our master.
> 2018-10-26,10:00:00,449 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data 
> MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0
> 2018-10-26,10:00:00,452 INFO 
> [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] 
> org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index 
> MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0
>  
> As with HBASE-21290, we don't need to instantiate the MemStore for a master that 
> does not carry tables.
>  





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1702#comment-1702
 ] 

stack commented on HBASE-19121:
---

It's as you state: if in the UI, it is always available to the operator, but yeah, 
if the UI is not up, then the operator is stuck. Perhaps we work on making sure 
the UI is always available?

I was thinking that the operator could click in the UI, in the tables panel, on 
the OPENING count and get a page that lists all the regions in OPENING. Then the 
same for OPEN, CLOSED, CLOSING? I made a start a while back but didn't get far. 
Would be useful for the operator. Could use curl or wget or lynx to get the list.


> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)

2018-10-28 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1689#comment-1689
 ] 

Jingyun Tian commented on HBASE-19121:
--

[~stack] I am planning to build 2 tools, as your doc already mentioned:
# dump a list of stuck procedures as txt.
# dump a list of RIT as txt.

Should we build these tools in the Master UI or in the Canary tools? 
If we build them in the Master UI, they are easier for the operator to use. But 
if the Master UI is not up, they are unavailable (this situation should be 
rare?). Or should we build them in the Canary tools? 
Please let me know your thoughts.

> HBCK for AMv2 (A.K.A HBCK2)
> ---
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Umbrella
>  Components: hbck, hbck2
>Reporter: stack
>Assignee: Umesh Agashe
>Priority: Major
> Fix For: hbck2-1.0.0
>
> Attachments: hbase-19121.master.001.patch
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.





[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-28 Thread mazhenlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mazhenlin updated HBASE-21374:
--
Status: Patch Available  (was: Open)

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: mazhenlin
>Priority: Major
> Attachments: HBASE-21374.branch-1.001.patch, 
> HBASE-21374.branch-1.002.patch
>
>






[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1

2018-10-28 Thread mazhenlin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mazhenlin updated HBASE-21374:
--
Attachment: HBASE-21374.branch-1.002.patch

> Backport HBASE-21342 to branch-1
> 
>
> Key: HBASE-21374
> URL: https://issues.apache.org/jira/browse/HBASE-21374
> Project: HBase
>  Issue Type: Task
>Reporter: Mike Drob
>Assignee: mazhenlin
>Priority: Major
> Attachments: HBASE-21374.branch-1.001.patch, 
> HBASE-21374.branch-1.002.patch
>
>






[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21401:
--
Component/s: regionserver

> Sanity check in BaseDecoder#parseCell
> -
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (checking whether each field's 
> offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> happen when reading the cell's fields, as in HBASE-213, and this kind of bug 
> is hard to debug. 
> An earlier check will help to find this kind of bug.
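
To make the idea of an early bounds check concrete, here is a minimal sketch. It assumes a simplified cell layout of a 4-byte key length, a 4-byte value length, then the key and value bytes; `CellBoundsCheck` and its methods are hypothetical names, and this is not the code from the attached patches.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper; not the actual BaseDecoder#parseCell code. */
final class CellBoundsCheck {
  // Simplified layout assumption: [int keyLen][int valueLen][key][value].
  private static final int HEADER_BYTES = 2 * Integer.BYTES;

  /** Throws IllegalArgumentException if the cell does not fit in the buffer. */
  static void check(ByteBuffer buf, int offset, int length) {
    if (offset < 0 || length < HEADER_BYTES || (long) offset + length > buf.limit()) {
      throw new IllegalArgumentException("cell range outside buffer: offset=" + offset
          + ", length=" + length + ", limit=" + buf.limit());
    }
    // Absolute reads; safe because the range check above passed.
    int keyLen = buf.getInt(offset);
    int valueLen = buf.getInt(offset + Integer.BYTES);
    if (keyLen < 0 || valueLen < 0) {
      throw new IllegalArgumentException("negative field length: key=" + keyLen
          + ", value=" + valueLen);
    }
    // Use long arithmetic so corrupt lengths cannot overflow into a small sum.
    long needed = (long) HEADER_BYTES + keyLen + valueLen;
    if (needed > length) {
      throw new IllegalArgumentException("fields need " + needed
          + " bytes but cell length is " + length);
    }
  }

  /** Convenience wrapper that reports validity instead of throwing. */
  static boolean isValid(ByteBuffer buf, int offset, int length) {
    try {
      check(buf, offset, length);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }
}
```

Failing here with a message that names the offending offset and lengths is easier to diagnose than an ArrayIndexOutOfBoundsException thrown later, when some field is first read.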





[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21401:
--
Priority: Critical  (was: Major)

> Sanity check in BaseDecoder#parseCell
> -
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (checking whether each field's 
> offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> happen when reading the cell's fields, as in HBASE-213, and this kind of bug 
> is hard to debug. 
> An earlier check will help to find this kind of bug.





[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1671#comment-1671
 ] 

Zheng Hu commented on HBASE-21401:
--

For UT TestMobDataBlockEncoding#testDataBlockEncoding, all encodings except 
ROW_INDEX_V1 work fine; need to find out what's wrong with ROW_INDEX_V1...

> Sanity check in BaseDecoder#parseCell
> -
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (checking whether each field's 
> offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> happen when reading the cell's fields, as in HBASE-213, and this kind of bug 
> is hard to debug. 
> An earlier check will help to find this kind of bug.





[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1669#comment-1669
 ] 

Guanghao Zhang commented on HBASE-21325:


Pushed to master and branch-2. Thanks [~Apache9] for reviewing.

And ping [~stack] for branch-2.1 and branch-2.0. Reopen this to backport to 
branch-2.1 and branch-2.0 after you released 2.1.1. Thanks.

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster 
> to DA while the local cluster is still in A, the region server will hang 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we will enter waitOnAllRegionsToClose, and since the WAL is 
> broken (the remote wal directory is gone), we will never succeed. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.
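
As an illustration of what an upper bound on such a wait means, the following sketch polls a condition until it holds or a deadline passes. `BoundedWait`, `waitUntil` and the parameter names are hypothetical; this is not the waitOnAllRegionsToClose implementation and not necessarily how the committed fix works.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration of a bounded wait; not HBase code. */
final class BoundedWait {
  /**
   * Polls {@code done} every {@code pollMs} until it returns true or
   * {@code timeoutMs} has elapsed. Returns true only if the condition held.
   */
  static boolean waitUntil(BooleanSupplier done, long timeoutMs, long pollMs) {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!done.getAsBoolean()) {
      // Subtraction form stays correct even if nanoTime wraps around.
      if (System.nanoTime() - deadline >= 0) {
        return false; // upper bound reached; the caller decides what to do next
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return true;
  }
}
```

With a bound in place, a condition that can never become true (such as regions that cannot close because the WAL is broken) yields a `false` return instead of an infinite loop.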





[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21325:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster 
> to DA while the local cluster is still in A, the region server will hang 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we will enter waitOnAllRegionsToClose, and since the WAL is 
> broken (the remote wal directory is gone), we will never succeed. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.





[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21325:
---
Affects Version/s: 2.1.1
   2.2.0
   3.0.0
   2.0.2

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster 
> to DA while the local cluster is still in A, the region server will hang 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we will enter waitOnAllRegionsToClose, and since the WAL is 
> broken (the remote wal directory is gone), we will never succeed. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.





[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21325:
---
Fix Version/s: 2.2.0
   3.0.0

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster 
> to DA while the local cluster is still in A, the region server will hang 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we will enter waitOnAllRegionsToClose, and since the WAL is 
> broken (the remote wal directory is gone), we will never succeed. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.





[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21325:
---
Release Note: Adds two new configs, hbase.regionserver.abort.timeout and 
hbase.regionserver.abort.timeout.task. If a regionserver abort times out, an 
abort timeout task is scheduled to run. The default abort timeout task is 
SystemExitWhenAbortTimeout, which forcibly terminates the region server when 
the abort times out. You can configure a custom abort timeout task via 
hbase.regionserver.abort.timeout.task.
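
The timeout-plus-task mechanism in the release note can be sketched with a plain `java.util.Timer`. Everything below is an assumption made for illustration: `AbortWatchdog` and its methods are hypothetical, the timeout is passed in directly rather than read from hbase.regionserver.abort.timeout, and the task is an arbitrary `Runnable` rather than SystemExitWhenAbortTimeout.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of an abort timeout task; not HBase code. */
final class AbortWatchdog {
  // Daemon timer thread, so the watchdog itself never keeps the JVM alive.
  private final Timer timer = new Timer("abort-timeout", true);
  private final CountDownLatch fired = new CountDownLatch(1);

  /** Schedules {@code onTimeout} to run once if the abort takes longer than timeoutMs. */
  AbortWatchdog(long timeoutMs, Runnable onTimeout) {
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        try {
          onTimeout.run();
        } finally {
          fired.countDown();
        }
      }
    }, timeoutMs);
  }

  /** Call when the abort completed in time; the pending task will not run. */
  void abortFinished() {
    timer.cancel();
  }

  /** Waits up to waitMs for the timeout task to have run; used for observation only. */
  boolean awaitFired(long waitMs) {
    try {
      return fired.await(waitMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

In this sketch a pluggable task is just whatever `Runnable` the caller passes in; a forced exit would be one possible task, and logging or dumping thread stacks would be others.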

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transit the remote cluster 
> to DA while the local cluster is still in A, the region server will hang 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we will enter waitOnAllRegionsToClose, and since the WAL is 
> broken (the remote wal directory is gone), we will never succeed. This 
> leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.





[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1653#comment-1653
 ] 

Zheng Hu commented on HBASE-21401:
--

Some UTs still fail, such as: 
1. TestTags#testFlushAndCompactionwithCombinations
2. TestMobDataBlockEncoding#testDataBlockEncoding

The timeouts in TestAsyncQuotaAdminApi & 
TestReplicationSyncUpToolWithMultipleAsyncWAL are unrelated to this 
issue. 

> Sanity check in BaseDecoder#parseCell
> -
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (checking whether each field's 
> offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> happen when reading the cell's fields, as in HBASE-213, and this kind of bug 
> is hard to debug. 
> An earlier check will help to find this kind of bug.





[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-28 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1648#comment-1648
 ] 

Allan Yang commented on HBASE-21375:


My mistake, I misunderstood the behavior: the worker will find one executable 
procedure and execute it, leaving the others in the queue for other workers. No 
other concerns then, +1 for the patch.

> Revisit the lock and queue implementation in MasterProcedureScheduler
> -
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, 
> HBASE-21375-v1.patch, HBASE-21375-v2.patch, HBASE-21375.patch
>
>
> The problem with the old implementation is that we only check the first 
> procedure in a queue to see if it can run; if it cannot, we remove the queue 
> from the run queue. So when adding a procedure to the scheduler, we have to 
> try hard to put the procedure that can be executed at the front of the queue. 
> If there are corner cases where we fail to do so, it will likely lead to a 
> deadlock. That is why we have the tricky code for loading procedures and 
> adding them to the scheduler, and also lots of 'if' statements in the doAdd 
> method of MasterProcedureScheduler. But this is still not enough to make 
> things right, so [~allan163] and I finally decided to change the logic in the 
> doPoll method, where we now use a loop to find any procedure that can be 
> executed, not only the first one.
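
The difference between the two polling strategies described above can be sketched as follows. This is an illustration, not the MasterProcedureScheduler code: PollSketch, pollHeadOnly, pollFirstRunnable, and the canRun predicate are hypothetical names, and a plain Deque stands in for the scheduler's queue.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.function.Predicate;

/** Hypothetical stand-in for a scheduler queue; not the actual HBase implementation. */
final class PollSketch {
  /** Old behaviour as described: only the head of the queue is examined. */
  static <T> T pollHeadOnly(Deque<T> queue, Predicate<? super T> canRun) {
    T head = queue.peekFirst();
    if (head != null && canRun.test(head)) {
      return queue.pollFirst();
    }
    // Per the description, the old scheduler would also take the whole queue
    // out of the run queue at this point.
    return null;
  }

  /** New behaviour as described: scan for the first procedure that can run. */
  static <T> T pollFirstRunnable(Deque<T> queue, Predicate<? super T> canRun) {
    for (Iterator<T> it = queue.iterator(); it.hasNext();) {
      T candidate = it.next();
      if (canRun.test(candidate)) {
        it.remove(); // hand out one procedure; the rest stay queued for other workers
        return candidate;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Deque<String> queue = new ArrayDeque<>(Arrays.asList("blocked", "runnable"));
    Predicate<String> canRun = p -> !p.equals("blocked");
    System.out.println("head only: " + pollHeadOnly(queue, canRun));
    System.out.println("first runnable: " + pollFirstRunnable(queue, canRun));
  }
}
```

Because the scan removes and returns a single procedure, the ordering of procedures inside the queue no longer decides whether runnable work is found.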





[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1646#comment-1646
 ] 

Duo Zhang commented on HBASE-21375:
---

I do not think that iterating the queue affects whether a TableQueue can be 
executed by multiple workers. A worker will iterate the queue, but in the end 
it polls one procedure and returns, so other workers can still poll from the 
queue.



[jira] [Commented] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1438#comment-1438
 ] 

Hadoop QA commented on HBASE-21389:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 38s | master passed |
| +1 | compile | 1m 58s | master passed |
| +1 | checkstyle | 1m 16s | master passed |
| +1 | shadedjars | 4m 25s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 7s | master passed |
| +1 | javadoc | 0m 30s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 17s | the patch passed |
| +1 | compile | 1m 57s | the patch passed |
| +1 | javac | 1m 57s | the patch passed |
| +1 | checkstyle | 1m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 26s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 3s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 129m 33s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 173m 7s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21389 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945963/HBASE-21389-v1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux fbde65b694c6 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7cdb525192 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14890/testReport/ |
| Max. process+thread count | 4609 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1417#comment-1417
 ] 

Hadoop QA commented on HBASE-21395:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-2.0 Compile Tests ||
| +1 | mvninstall | 2m 50s | branch-2.0 passed |
| +1 | compile | 1m 42s | branch-2.0 passed |
| +1 | checkstyle | 1m 10s | branch-2.0 passed |
| +1 | shadedjars | 4m 1s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 19s | branch-2.0 passed |
| +1 | javadoc | 0m 32s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 45s | the patch passed |
| +1 | compile | 1m 43s | the patch passed |
| +1 | javac | 1m 43s | the patch passed |
| +1 | checkstyle | 1m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 59s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 8m 19s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 21s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 119m 48s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 155m 16s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21395 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945962/HBASE-21395.branch-2.0.003.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 094956d856f9 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.0 / a3b2686114 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14889/testReport/ |
| Max. process+thread count | 4245 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-28 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1392#comment-1392
 ] 

Allan Yang commented on HBASE-21375:


I have a concern: before, one table's region operations could be executed by 
several workers concurrently, but now, since one worker will iterate over the 
TableQueue, it will execute the procedures one by one. For a very big table, 
the time taken to modify the table may not be tolerable.



[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1388#comment-1388
 ] 

Hadoop QA commented on HBASE-21401:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 30s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 40s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 4s | master passed |
| +1 | compile | 2m 56s | master passed |
| +1 | checkstyle | 1m 54s | master passed |
| +1 | shadedjars | 5m 32s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 3m 3s | master passed |
| +1 | javadoc | 0m 57s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 3s | the patch passed |
| +1 | compile | 2m 48s | the patch passed |
| -1 | javac | 0m 35s | hbase-common generated 2 new + 42 unchanged - 0 fixed = 44 total (was 42) |
| -1 | checkstyle | 0m 30s | hbase-common: The patch generated 30 new + 148 unchanged - 1 fixed = 178 total (was 149) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 14s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 13m 15s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 3m 14s | the patch passed |
| +1 | javadoc | 1m 2s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 41s | hbase-common in the patch passed. |
| -1 | unit | 233m 15s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 296m 14s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncQuotaAdminApi |
| | hadoop.hbase.regionserver.TestTags |
| | hadoop.hbase.mob.TestMobDataBlockEncoding |
| | hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMultipleAsyncWAL |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21401 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945952/HBASE-21401.v2.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 0d9b0f7101ca 3.13.0-153-generic #203-Ubuntu 

[jira] [Updated] (HBASE-21389) Revisit the procedure lock for sync replication

2018-10-28 Thread Duo Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-21389:
--
Attachment: HBASE-21389-v1.patch

> Revisit the procedure lock for sync replication
> ---
>
> Key: HBASE-21389
> URL: https://issues.apache.org/jira/browse/HBASE-21389
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2, Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21389-v1.patch, HBASE-21389-v1.patch, 
> HBASE-21389.patch
>
>






[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler

2018-10-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1384#comment-1384
 ] 

Duo Zhang commented on HBASE-21375:
---

Any other concerns? [~stack] [~allan163] Thanks.



[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere

2018-10-28 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1383#comment-1383
 ] 

Duo Zhang commented on HBASE-21325:
---

+1.

> Force to terminate regionserver when abort hang in somewhere
> 
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
>  Issue Type: Improvement
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Attachments: HBASE-21325.master.001.patch, 
> HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, 
> HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, 
> HBASE-21325.master.005.patch
>
>
> When testing sync replication, I found that if I transition the remote 
> cluster to DA while the local cluster is still in A, the region server hangs 
> on shutdown. As the fsOk flag only tests the local cluster (which is 
> reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken 
> (the remote WAL directory is gone), we never succeed. This leads to an 
> infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound for the wait time in the 
> waitOnAllRegionsToClose method.
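
An upper bound on a wait of this kind could be sketched as below. This is a generic illustration, not the attached patch: BoundedWait, waitUntil, and its parameters are hypothetical names, and the condition supplier stands in for whatever "all regions closed" check the region server performs.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper, not HBase code: polls a condition, but only up to a deadline. */
final class BoundedWait {
  /**
   * Returns true if the condition became true before the timeout expired,
   * false if the timeout expired or the waiting thread was interrupted.
   */
  static boolean waitUntil(BooleanSupplier done, long timeoutMillis, long pollMillis) {
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    final long interval = Math.max(1L, pollMillis); // guard against a non-positive interval
    while (!done.getAsBoolean()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // upper bound reached; the caller decides how to proceed
      }
      long remainingMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
      try {
        Thread.sleep(Math.min(interval, remainingMillis));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // The condition never becomes true here, so the call gives up after about 200 ms.
    boolean closed = waitUntil(() -> false, 200, 50);
    System.out.println("all regions closed: " + closed);
  }
}
```

A false return gives the caller a point at which to stop waiting and take a harsher step, such as the forced termination the issue title mentions.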





[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on

2018-10-28 Thread Allan Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Yang updated HBASE-21395:
---
Attachment: HBASE-21395.branch-2.0.003.patch

> Abort split/merge procedure if there is a table procedure of the same table 
> going on
> 
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0, 2.0.2
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: HBASE-21395.branch-2.0.001.patch, 
> HBASE-21395.branch-2.0.002.patch, HBASE-21395.branch-2.0.003.patch
>
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table 
> procedure (like ModifyTableProcedure) happen at the same time, race 
> conditions between these two kinds of procedures cause some serious 
> problems, e.g. the split/merged parent is brought online by the table 
> procedure, or the split/merged region makes the whole table procedure 
> roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in 
> branch-2+, since there is a fence so that only one RTSP can run against a 
> single region at the same time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety 
> fence in the split/merge procedure: if there is a table procedure going on 
> against the same table, then abort the split/merge procedure. Aborting the 
> split/merge procedure at the beginning of the execution is no big deal 
> compared with the mess it would otherwise cause.
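
The fence described above might be sketched like this. It is an illustration under stated assumptions, not the attached patch: TableProcedureFence and its methods are hypothetical, tables are identified by plain strings, and the check is assumed to run while the scheduler prevents a table procedure from starting concurrently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical fence, not HBase code: tracks tables that have a table procedure running. */
final class TableProcedureFence {
  private final Set<String> tablesWithTableProcedure = ConcurrentHashMap.newKeySet();

  void tableProcedureStarted(String table) {
    tablesWithTableProcedure.add(table);
  }

  void tableProcedureFinished(String table) {
    tablesWithTableProcedure.remove(table);
  }

  /**
   * Intended to be called as the first step of a split/merge. Throws to abort the
   * split/merge before it has changed any state if a table procedure is in progress.
   */
  void checkBeforeSplitOrMerge(String table) {
    if (tablesWithTableProcedure.contains(table)) {
      throw new IllegalStateException(
          "table procedure in progress on " + table + ", aborting split/merge");
    }
  }

  public static void main(String[] args) {
    TableProcedureFence fence = new TableProcedureFence();
    fence.tableProcedureStarted("t1");
    try {
      fence.checkBeforeSplitOrMerge("t1");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Checking at the very beginning keeps the abort cheap, since nothing has to be rolled back.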





[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell

2018-10-28 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21401:
-
Attachment: HBASE-21401.v2.patch
