[jira] [Commented] (HBASE-21616) Port HBASE-21034 (Add new throttle type: read/write capacity unit) to branch-1

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724721#comment-16724721
 ] 

Zheng Hu commented on HBASE-21616:
--

[~apurtell], please note that HBASE-21578 needs to be backported too; it's a 
minor fix.

> Port HBASE-21034 (Add new throttle type: read/write capacity unit) to branch-1
> --
>
> Key: HBASE-21616
> URL: https://issues.apache.org/jira/browse/HBASE-21616
> Project: HBase
>  Issue Type: Task
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
>
> Port HBASE-21034 (Add new throttle type: read/write capacity unit) to branch-1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: HBASE-21565.branch-2.002.patch

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.branch-2.002.patch, HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch, 
> HBASE-21565.master.004.patch, HBASE-21565.master.005.patch, 
> HBASE-21565.master.006.patch, HBASE-21565.master.007.patch, 
> HBASE-21565.master.008.patch, HBASE-21565.master.009.patch, 
> HBASE-21565.master.010.patch
>
>
> Two kinds of SCP can be scheduled for the same server during a cluster 
> restart: one is triggered by the ZK session timeout, the other when a new 
> server reports in and causes the stale one to fail over. The only barrier 
> between these two kinds of SCP is a check of whether the server is in the 
> dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress",
>     serverName);
>   return false;
> }
> {code}
> But the problem is that when the master finishes initialization, it deletes 
> all stale servers from the dead server list. Thus when the SCP for the ZK 
> session timeout comes in, the barrier has already been removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to the problem that regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list 
> before the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
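
The race described above can be reproduced in miniature. The following is a minimal sketch, not HBase code: `DeadServerBarrier`, `expireServer`, `clearOnMasterInit` and `finishProcessing` are hypothetical names, and a plain set stands in for the dead server list. It illustrates that clearing the list while a crash procedure is still running lets a second procedure be scheduled for the same server, whereas removing the entry only when its procedure finishes keeps the barrier in place.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the master's dead server bookkeeping; not the HBase API.
class DeadServerBarrier {
  private final Set<String> deadServers = new HashSet<>();
  private int scheduledProcedures = 0;

  // Mirrors the barrier quoted above: refuse to schedule a second crash
  // procedure while the server is still in the dead server list.
  synchronized boolean expireServer(String serverName) {
    if (deadServers.contains(serverName)) {
      return false; // crash processing already in progress
    }
    deadServers.add(serverName);
    scheduledProcedures++;
    return true;
  }

  // The problematic step: dropping every entry when master initialization
  // finishes, even if a crash procedure is still running.
  synchronized void clearOnMasterInit() {
    deadServers.clear();
  }

  // The safer alternative: remove the entry only once its procedure is done.
  synchronized void finishProcessing(String serverName) {
    deadServers.remove(serverName);
  }

  synchronized int scheduledProcedures() {
    return scheduledProcedures;
  }
}
```

With this sketch, calling `expireServer` twice with `clearOnMasterInit` in between schedules two procedures for one server, which matches the pid=9 / pid=444 pair in the logs above.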





[jira] [Updated] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache

2018-12-18 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21498:
---
Issue Type: Bug  (was: Improvement)

> Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a 
> new BlockCache
> --
>
> Key: HBASE-21498
> URL: https://issues.apache.org/jira/browse/HBASE-21498
> Project: HBase
>  Issue Type: Bug
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21498.master.001.patch, 
> HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, 
> HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, 
> HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, 
> HBASE-21498.master.007.patch, HBASE-21498.master.007.patch
>
>
> In our cluster, we use a small heap/offheap config for the master. After 
> HBASE-21290, the master doesn't instantiate a BlockCache when it does not 
> carry tables. But it still creates a new CacheConfig in the 
> SplitTableRegionProcedure.splitStoreFiles method, which instantiates a new 
> BlockCache if one was not initialized before and makes the master OOM.
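
As a rough illustration of the failure mode, the sketch below contrasts a config factory that silently creates a process-wide cache with one that only reuses a cache that already exists. All names (`BlockCacheSketch`, `CacheConfigSketch`, `createAndInstantiate`, `createWithoutInstantiating`) are hypothetical and are not the HBase API; this is not meant to represent the actual patch.

```java
import java.util.Optional;

// Hypothetical illustration; these classes are not the HBase CacheConfig/BlockCache.
class BlockCacheSketch {
  final long capacityBytes;

  BlockCacheSketch(long capacityBytes) {
    this.capacityBytes = capacityBytes;
  }
}

class CacheConfigSketch {
  private static BlockCacheSketch globalCache; // shared, process-wide instance
  private final Optional<BlockCacheSketch> blockCache;

  private CacheConfigSketch(Optional<BlockCacheSketch> blockCache) {
    this.blockCache = blockCache;
  }

  // Risky pattern: building a config silently creates the cache if it is absent.
  static synchronized CacheConfigSketch createAndInstantiate(long capacityBytes) {
    if (globalCache == null) {
      globalCache = new BlockCacheSketch(capacityBytes);
    }
    return new CacheConfigSketch(Optional.of(globalCache));
  }

  // Safer pattern: reuse the cache only if something else already created it.
  static synchronized CacheConfigSketch createWithoutInstantiating() {
    return new CacheConfigSketch(Optional.ofNullable(globalCache));
  }

  static synchronized boolean isGlobalCacheInstantiated() {
    return globalCache != null;
  }

  Optional<BlockCacheSketch> getBlockCache() {
    return blockCache;
  }
}
```

A process with a small heap that never wanted a cache would stay safe with the second factory, while the first one allocates the cache on the first call from any code path.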





[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: (was: HBASE-21565.branch-2.002.patch)

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.master.001.patch, HBASE-21565.master.002.patch, 
> HBASE-21565.master.003.patch, HBASE-21565.master.004.patch, 
> HBASE-21565.master.005.patch, HBASE-21565.master.006.patch, 
> HBASE-21565.master.007.patch, HBASE-21565.master.008.patch, 
> HBASE-21565.master.009.patch, HBASE-21565.master.010.patch
>
>





[jira] [Resolved] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache

2018-12-18 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang resolved HBASE-21498.

   Resolution: Fixed
Fix Version/s: 2.0.4
   2.1.2

Pushed to branch-2.1 and branch-2.0. Thanks [~stack] for reviewing.

> Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a 
> new BlockCache
> --
>
> Key: HBASE-21498
> URL: https://issues.apache.org/jira/browse/HBASE-21498
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.2, 2.0.4
>
> Attachments: HBASE-21498.master.001.patch, 
> HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, 
> HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, 
> HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, 
> HBASE-21498.master.007.patch, HBASE-21498.master.007.patch
>
>





[jira] [Comment Edited] (HBASE-21535) Zombie Master detector is not working

2018-12-18 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724718#comment-16724718
 ] 

Pankaj Kumar edited comment on HBASE-21535 at 12/19/18 6:19 AM:


Thank you, [~stack] Sir. This fix is applicable to branch-2/2.1/2.0 as well. 
I have already attached HBASE-21535.branch-2.patch; kindly apply it.


was (Author: pankaj2461):
Thank you, Stack. This fix is applicable to branch-2/2.1/2.0 as well. I have 
already attached HBASE-21535.branch-2.patch; kindly apply it.

> Zombie Master detector is not working
> -
>
> Key: HBASE-21535
> URL: https://issues.apache.org/jira/browse/HBASE-21535
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21535.branch-2.patch, HBASE-21535.branch-2.patch, 
> HBASE-21535.patch, HBASE-21535.v2.patch
>
>
> We have an InitializationMonitor thread in HMaster which detects a zombie 
> HMaster based on _hbase.master.initializationmonitor.timeout_ and halts if 
> _hbase.master.initializationmonitor.haltontimeout_ is set to _true_.
> After HBASE-19694, the HMaster initialization order was corrected. HMaster is 
> set active after initializing the ZK system trackers, as follows:
> {noformat}
>  status.setStatus("Initializing ZK system trackers");
>  initializeZKBasedSystemTrackers();
>  status.setStatus("Loading last flushed sequence id of regions");
>  try {
>    this.serverManager.loadLastFlushedSequenceIds();
>  } catch (IOException e) {
>    LOG.debug("Failed to load last flushed sequence id of regions"
>      + " from file system", e);
>  }
>  // Set ourselves as active Master now our claim has succeeded up in zk.
>  this.activeMaster = true;
> {noformat}
> But the zombie detector thread is started at the beginning of 
> finishActiveMasterInitialization():
> {noformat}
>  private void finishActiveMasterInitialization(MonitoredTask status) throws IOException,
>      InterruptedException, KeeperException, ReplicationException {
>    Thread zombieDetector = new Thread(new InitializationMonitor(this),
>      "ActiveMasterInitializationMonitor-" + System.currentTimeMillis());
>    zombieDetector.setDaemon(true);
>    zombieDetector.start();
> {noformat}
> When zombieDetector runs, "master.isActiveMaster()" is still false, so 
> it doesn't wait and can't detect a zombie master.
> {noformat}
>  @Override
>  public void run() {
>    try {
>      while (!master.isStopped() && master.isActiveMaster()) {
>        Thread.sleep(timeout);
>        if (master.isInitialized()) {
>          LOG.debug("Initialization completed within allotted tolerance. Monitor exiting.");
>        } else {
>          LOG.error("Master failed to complete initialization after " + timeout + "ms. Please"
>            + " consider submitting a bug report including a thread dump of this process.");
>          if (haltOnTimeout) {
>            LOG.error("Zombie Master exiting. Thread dump to stdout");
>            Threads.printThreadInfo(System.out, "Zombie HMaster");
>            System.exit(-1);
>          }
>        }
>      }
>    } catch (InterruptedException ie) {
>      LOG.trace("InitMonitor thread interrupted. Existing.");
>    }
>  }
> }
> {noformat}
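
The ordering problem can be shown with a small, self-contained sketch. `InitMonitorSketch` and `runOnce` are hypothetical names, and the method is a simplified single pass rather than the real loop; it only assumes the same guard condition as the quoted `run()` method. If the monitor starts before the active flag is set, the guard is false on entry and the timeout check never happens.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical miniature of the monitor; not the HBase InitializationMonitor.
class InitMonitorSketch {
  final AtomicBoolean stopped = new AtomicBoolean(false);
  final AtomicBoolean activeMaster = new AtomicBoolean(false);
  final AtomicBoolean initialized = new AtomicBoolean(false);

  // Uses the same guard as the quoted run(). Returns true if the timeout
  // elapsed without initialization completing (a zombie was detected).
  boolean runOnce(long timeoutMs) {
    if (!stopped.get() && activeMaster.get()) {
      try {
        Thread.sleep(timeoutMs);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return false;
      }
      return !initialized.get();
    }
    // Guard was false on entry: the monitor exits without ever checking.
    return false;
  }
}
```

One way to read the sketch: detection only works if the monitor is started after the active flag is set, or if the guard does not depend on that flag at all.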





[jira] [Commented] (HBASE-21535) Zombie Master detector is not working

2018-12-18 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724718#comment-16724718
 ] 

Pankaj Kumar commented on HBASE-21535:
--

Thank you, Stack. This fix is applicable to branch-2/2.1/2.0 as well. I have 
already attached HBASE-21535.branch-2.patch; kindly apply it.

> Zombie Master detector is not working
> -
>
> Key: HBASE-21535
> URL: https://issues.apache.org/jira/browse/HBASE-21535
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21535.branch-2.patch, HBASE-21535.branch-2.patch, 
> HBASE-21535.patch, HBASE-21535.v2.patch
>
>





[jira] [Updated] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21514:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Pushed to master and branch-2. Thanks all for reviewing.

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and the mob file cache as member variables of 
> HRegionServer. Each RS has one block cache and one mob file cache.
> # Move the global cache instances from CacheConfig to BlockCacheFactory. 
> Only keep config stuff in CacheConfig. The CacheConfig still holds a 
> reference to the RegionServer's block cache. A block is cached only if the 
> block cache is present and the related config is true.
> # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance. After the mob file cache moves to the RegionServer, it is no 
> longer used.
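
The ownership model in the list above can be sketched roughly as follows. Every name here (`BlockCacheLike`, `SimpleBlockCache`, `CacheSettings`, `RegionServerSketch`) is hypothetical and stands in for the real classes; the sketch only assumes what the description states, namely that the server owns the single cache instance, the config object keeps a reference to it, and a block is cached only when the cache is present and the related flag is true.

```java
import java.util.Optional;

// Hypothetical names throughout; this is not the HBase API.
interface BlockCacheLike {}

class SimpleBlockCache implements BlockCacheLike {}

// Holds only configuration plus a reference to a cache owned by someone else.
class CacheSettings {
  private final boolean cacheDataOnRead;
  private final Optional<BlockCacheLike> blockCache;

  CacheSettings(boolean cacheDataOnRead, BlockCacheLike blockCacheOrNull) {
    this.cacheDataOnRead = cacheDataOnRead;
    this.blockCache = Optional.ofNullable(blockCacheOrNull);
  }

  // A block is cached only if a cache is present and the config allows it.
  boolean shouldCacheDataOnRead() {
    return blockCache.isPresent() && cacheDataOnRead;
  }
}

// The server owns exactly one cache instance and hands it to each config.
class RegionServerSketch {
  private final BlockCacheLike blockCache;

  RegionServerSketch(boolean cacheEnabled) {
    this.blockCache = cacheEnabled ? new SimpleBlockCache() : null;
  }

  CacheSettings newCacheSettings(boolean cacheDataOnRead) {
    return new CacheSettings(cacheDataOnRead, blockCache);
  }
}
```

Keeping the instance on the server and out of static state means a process that never creates a cache cannot have one created behind its back by a config object.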





[jira] [Commented] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724709#comment-16724709
 ] 

Hadoop QA commented on HBASE-21565:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 27s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 56s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 46s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  5s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 52s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 18s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 43s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21565 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952273/HBASE-21565.branch-2.002.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 130dd0b19c45 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / fc7ca8a2ef |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/15323/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/15323/testReport/ |
| Max. process+thread count | 4256 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15323/console 

[jira] [Updated] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang updated HBASE-21514:
---
Fix Version/s: 2.2.0

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>





[jira] [Commented] (HBASE-21401) Sanity check when constructing the KeyValue

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724706#comment-16724706
 ] 

Zheng Hu commented on HBASE-21401:
--

bq. KeyValue does checks of lengths to make sure they are not insane (e.g. 
KeyValue#checkParameters). How much overlap between your check and these? Will 
we be doing double checking? Thanks.

Good question. I looked at the existing length checks, such as 
KeyValue#createEmptyByteArray and KeyValue#checkParameters. I think there is no 
overlap between my check and those checks, because my sanity check only applies 
when we construct a KeyValue from a complete byte[] (no other params). We 
use this kind of constructor when we read bytes from a socket, WAL, or HFile. 

The existing length checks only apply when we construct a KV from a given row, 
cf, cq, ts, type and so on. We use those constructors when we already know the 
cf/cq/ts/type etc. (not reading a byte stream from IO) and construct the KV to 
do further work (such as an optimized seek on the server side or constructing 
the client-side request). So, in theory, there is no overlap between them. Thanks.

> Sanity check when constructing the KeyValue
> ---
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch, 
> HBASE-21401.v3.patch, HBASE-21401.v4.patch, HBASE-21401.v4.patch, 
> HBASE-21401.v5.patch, HBASE-21401.v6.patch, HBASE-21401.v7.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (checking whether each field's 
> offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> happen when reading the cell's fields, as in HBASE-21379. This kind of bug 
> is hard to debug. 
> An earlier check will help to find such bugs.
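
A minimal sketch of such an up-front check is shown below. It assumes the usual serialized KeyValue layout (4-byte key length, 4-byte value length, then a key made of 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp and 1-byte type, followed by the value) and ignores tags. The class and method names (`KeyValueSanitySketch`, `check`, `encode`) are hypothetical and are not taken from the patch.

```java
import java.nio.ByteBuffer;

// Hypothetical checker; names are illustrative only and tags are ignored.
class KeyValueSanitySketch {
  static final int LENGTHS_SIZE = 8;        // key length (4) + value length (4)
  static final int KEY_FIXED_OVERHEAD = 12; // row len (2) + family len (1) + timestamp (8) + type (1)

  // Returns null if the buffer looks sane, else a description of the problem.
  static String check(byte[] buf) {
    if (buf.length < LENGTHS_SIZE) {
      return "buffer too short for key/value lengths";
    }
    ByteBuffer bb = ByteBuffer.wrap(buf); // big-endian by default
    int keyLen = bb.getInt();
    int valueLen = bb.getInt();
    if (keyLen < KEY_FIXED_OVERHEAD || valueLen < 0) {
      return "invalid key or value length";
    }
    // Use long arithmetic so oversized lengths cannot overflow.
    if ((long) LENGTHS_SIZE + keyLen + valueLen > buf.length) {
      return "key + value exceed the buffer";
    }
    int rowLen = bb.getShort();
    if (rowLen < 0 || 2L + rowLen + 1 > keyLen) {
      return "row length exceeds the key";
    }
    int familyLen = buf[LENGTHS_SIZE + 2 + rowLen] & 0xFF;
    if ((long) KEY_FIXED_OVERHEAD + rowLen + familyLen > keyLen) {
      return "family length exceeds the key";
    }
    return null;
  }

  // Builds a well-formed cell with an empty qualifier, for the usage example.
  static byte[] encode(byte[] row, byte[] family, byte[] value) {
    int keyLen = KEY_FIXED_OVERHEAD + row.length + family.length;
    ByteBuffer bb = ByteBuffer.allocate(LENGTHS_SIZE + keyLen + value.length);
    bb.putInt(keyLen).putInt(value.length);
    bb.putShort((short) row.length).put(row);
    bb.put((byte) family.length).put(family);
    bb.putLong(0L).put((byte) 4); // timestamp, then type code 4 (Put)
    bb.put(value);
    return bb.array();
  }
}
```

Running `check` once at construction time turns a late ArrayIndexOutOfBoundsException into an early, descriptive failure at the point where the corrupt bytes enter the system.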





[jira] [Commented] (HBASE-21610) numOpenConnections metric is set to -1 when zero server channel exist

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724704#comment-16724704
 ] 

stack commented on HBASE-21610:
---

+1 on patch. Retrying to get a clean build

> numOpenConnections metric is set to -1 when zero server channel exist
> -
>
> Key: HBASE-21610
> URL: https://issues.apache.org/jira/browse/HBASE-21610
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21610.patch, HBASE-21610.patch, HBASE-21610.patch
>
>
> In NettyRpcServer, the numOpenConnections metric is set to -1 when no server 
> channel exists.
> {code}
> @Override
> public int getNumOpenConnections() {
>   // allChannels also contains the server channel, so exclude that from the count.
>   return allChannels.size() - 1;
> }
> {code}
>  
>  We should not decrease the channel count by 1 when no server channel exists.
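
One possible guard is sketched below. `OpenConnectionCounter` and its methods are hypothetical, with a plain set standing in for the channel group; the attached patch may take a different approach.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the channel group; not the Netty or HBase API.
class OpenConnectionCounter {
  private final Set<String> allChannels = new HashSet<>();

  void register(String channelId) {
    allChannels.add(channelId);
  }

  // Original arithmetic: yields -1 once the group is empty.
  int naiveCount() {
    return allChannels.size() - 1;
  }

  // Guarded variant: only subtract the server channel when the group is non-empty.
  int guardedCount() {
    int size = allChannels.size();
    return size > 0 ? size - 1 : 0;
  }
}
```

The guarded variant never reports a negative value, and it still excludes the server channel once that channel is registered.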





[jira] [Updated] (HBASE-21610) numOpenConnections metric is set to -1 when zero server channel exist

2018-12-18 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21610:
--
Attachment: HBASE-21610.patch

> numOpenConnections metric is set to -1 when zero server channel exist
> -
>
> Key: HBASE-21610
> URL: https://issues.apache.org/jira/browse/HBASE-21610
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21610.patch, HBASE-21610.patch, HBASE-21610.patch
>
>





[jira] [Reopened] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache

2018-12-18 Thread Guanghao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guanghao Zhang reopened HBASE-21498:


Reopened for branch-2.0 and branch-2.1.

> Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a 
> new BlockCache
> --
>
> Key: HBASE-21498
> URL: https://issues.apache.org/jira/browse/HBASE-21498
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21498.master.001.patch, 
> HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, 
> HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, 
> HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, 
> HBASE-21498.master.007.patch, HBASE-21498.master.007.patch
>
>
> In our cluster, we use a small heap/offheap config for the master. After 
> HBASE-21290, the master doesn't instantiate a BlockCache when it does not carry 
> tables. But it still creates a new CacheConfig in the 
> SplitTableRegionProcedure.splitStoreFiles method, which instantiates a new 
> BlockCache if one was not initialized before and makes the master OOM.





[jira] [Commented] (HBASE-21535) Zombie Master detector is not working

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724702#comment-16724702
 ] 

stack commented on HBASE-21535:
---

I pushed to master branch. I tried cherry-pick back but the zombieMaster init 
is in a different location in branch-2.  You want to move its location in 
branch-2 or just leave stuff as is [~pankaj2461] ? Thanks!

> Zombie Master detector is not working
> -
>
> Key: HBASE-21535
> URL: https://issues.apache.org/jira/browse/HBASE-21535
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21535.branch-2.patch, HBASE-21535.branch-2.patch, 
> HBASE-21535.patch, HBASE-21535.v2.patch
>
>
> We have an InitializationMonitor thread in HMaster which detects a zombie HMaster 
> based on _hbase.master.initializationmonitor.timeout_ and halts if 
> _hbase.master.initializationmonitor.haltontimeout_ is set to _true_.
> After HBASE-19694, the HMaster initialization order was corrected. HMaster is set 
> active after initializing the ZK system trackers, as follows:
> {noformat}
>  status.setStatus("Initializing ZK system trackers");
>  initializeZKBasedSystemTrackers();
>  status.setStatus("Loading last flushed sequence id of regions");
>  try {
>    this.serverManager.loadLastFlushedSequenceIds();
>  } catch (IOException e) {
>    LOG.debug("Failed to load last flushed sequence id of regions"
>        + " from file system", e);
>  }
>  // Set ourselves as active Master now our claim has succeeded up in zk.
>  this.activeMaster = true;
> {noformat}
> But the zombie detector thread is started at the beginning of 
> finishActiveMasterInitialization():
> {noformat}
>  private void finishActiveMasterInitialization(MonitoredTask status)
>      throws IOException, InterruptedException, KeeperException, ReplicationException {
>    Thread zombieDetector = new Thread(new InitializationMonitor(this),
>        "ActiveMasterInitializationMonitor-" + System.currentTimeMillis());
>    zombieDetector.setDaemon(true);
>    zombieDetector.start();
> {noformat}
> While zombieDetector runs, "master.isActiveMaster()" is still false, so 
> the loop exits without waiting and cannot detect a zombie master.
> {noformat}
>  @Override
>  public void run() {
>    try {
>      while (!master.isStopped() && master.isActiveMaster()) {
>        Thread.sleep(timeout);
>        if (master.isInitialized()) {
>          LOG.debug("Initialization completed within allotted tolerance. Monitor exiting.");
>        } else {
>          LOG.error("Master failed to complete initialization after " + timeout + "ms. Please"
>              + " consider submitting a bug report including a thread dump of this process.");
>          if (haltOnTimeout) {
>            LOG.error("Zombie Master exiting. Thread dump to stdout");
>            Threads.printThreadInfo(System.out, "Zombie HMaster");
>            System.exit(-1);
>          }
>        }
>      }
>    } catch (InterruptedException ie) {
>      LOG.trace("InitMonitor thread interrupted. Existing.");
>    }
>  }
> }
> {noformat}
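
The effect of this ordering can be reproduced in isolation. The sketch below is a simplified, hypothetical stand-in (InitMonitorSketch and monitorWaits are not HBase names): it keeps only the loop condition from the quoted run() method, checks once instead of looping, and shows that a monitor started before the active flag is set returns without ever sleeping.

```java
final class InitMonitorSketch {
  /**
   * Returns true if the monitor actually waited out one timeout,
   * false if the loop condition failed before the first sleep.
   */
  static boolean monitorWaits(boolean stopped, boolean activeAtStart, long timeoutMs) {
    boolean waited = false;
    try {
      // Same shape as the quoted condition: !isStopped() && isActiveMaster().
      while (!stopped && activeAtStart && !waited) {
        Thread.sleep(timeoutMs);
        waited = true; // a real monitor would now check isInitialized() and maybe halt
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
    return waited;
  }
}
```

Calling monitorWaits(false, false, 10) returns false, which corresponds to the detector thread exiting immediately; monitorWaits(false, true, 10) returns true.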





[jira] [Commented] (HBASE-20917) MetaTableMetrics#stop references uninitialized requestsMap for non-meta region

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724693#comment-16724693
 ] 

stack commented on HBASE-20917:
---

What should I do here [~busbey]? Or will I push it out of 2.0.4?

> MetaTableMetrics#stop references uninitialized requestsMap for non-meta region
> --
>
> Key: HBASE-20917
> URL: https://issues.apache.org/jira/browse/HBASE-20917
> Project: HBase
>  Issue Type: Bug
>  Components: meta, metrics
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.2.0, 2.0.4
>
> Attachments: 20917.addendum, 20917.v1.txt, 20917.v2.txt
>
>
> I noticed the following in test output:
> {code}
> 2018-07-21 15:54:43,181 ERROR [RS_CLOSE_REGION-regionserver/172.17.5.4:0-1] executor.EventHandler(186): Caught throwable while processing event M_RS_CLOSE_REGION
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.coprocessor.MetaTableMetrics.stop(MetaTableMetrics.java:329)
>   at org.apache.hadoop.hbase.coprocessor.BaseEnvironment.shutdown(BaseEnvironment.java:91)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionEnvironment.shutdown(RegionCoprocessorHost.java:165)
>   at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.shutdown(CoprocessorHost.java:290)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$4.postEnvCall(RegionCoprocessorHost.java:559)
>   at org.apache.hadoop.hbase.coprocessor.CoprocessorHost.execOperation(CoprocessorHost.java:622)
>   at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.postClose(RegionCoprocessorHost.java:551)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1678)
>   at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1484)
>   at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
>   at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> {code}
> {{requestsMap}} is only initialized for the meta region.
> However, the check for the meta region is absent in the stop method:
> {code}
>   public void stop(CoprocessorEnvironment e) throws IOException {
>     // since meta region can move around, clear stale metrics when stop.
>     for (String meterName : requestsMap.keySet()) {
> {code}
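
One way to avoid the NullPointerException is to guard stop() against a map that was never created. The sketch below only illustrates that idea; MetaMetricsSketch, its start(boolean) method and the meter name are hypothetical, and the actual patch may solve this differently (for example by checking for the meta region).

```java
import java.util.HashMap;
import java.util.Map;

final class MetaMetricsSketch {
  // Stays null unless start() ran for the meta region, as described in the report.
  private Map<String, Object> requestsMap;

  void start(boolean isMetaRegion) {
    if (isMetaRegion) {
      requestsMap = new HashMap<>();
      requestsMap.put("example_meter", new Object()); // placeholder meter
    }
  }

  /** Returns the number of meters cleared; 0 when the map was never initialized. */
  int stop() {
    if (requestsMap == null) {
      return 0; // non-meta region: nothing to clean up
    }
    int cleared = requestsMap.size();
    requestsMap.clear();
    return cleared;
  }
}
```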





[jira] [Commented] (HBASE-21401) Sanity check when constructing the KeyValue

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724687#comment-16724687
 ] 

stack commented on HBASE-21401:
---

You have a point [~openinx] that doing checks inline will bulk up regular 
processing some. The basis though is that we are obscene when it comes to how 
many times we do offset and length calculations (We should fix this! Smile).

KeyValue does checks of lengths to make sure they are not insane (e.g. 
KeyValue#checkParameters). How much overlap between your check and these? Will 
we be doing double checking? Thanks.

> Sanity check when constructing the KeyValue
> ---
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch, 
> HBASE-21401.v3.patch, HBASE-21401.v4.patch, HBASE-21401.v4.patch, 
> HBASE-21401.v5.patch, HBASE-21401.v6.patch, HBASE-21401.v7.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (i.e. without checking whether each 
> field's offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException 
> may happen when reading the cell's fields, as in HBASE-21379. This kind of bug 
> is hard to debug. 
> An earlier check will help to find such bugs.
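
To illustrate what such an early check might look like, here is a minimal sketch. It assumes a simplified cell layout (a 4-byte key length, a 4-byte value length, then the key and value bytes) and is not the check from the attached patches; CellBoundsSketch and isWellFormed are hypothetical names.

```java
import java.nio.ByteBuffer;

final class CellBoundsSketch {
  private static final int INT_SIZE = 4;

  /** Returns true if the declared key and value lengths fit inside buf[offset, offset + length). */
  static boolean isWellFormed(byte[] buf, int offset, int length) {
    // The slice itself must lie inside the array and hold both length fields.
    if (buf == null || offset < 0 || length < 2 * INT_SIZE || offset > buf.length - length) {
      return false;
    }
    ByteBuffer bb = ByteBuffer.wrap(buf, offset, length);
    int keyLen = bb.getInt();
    int valueLen = bb.getInt();
    if (keyLen < 0 || valueLen < 0) {
      return false;
    }
    // Use long arithmetic so corrupt, huge lengths cannot overflow the sum.
    // Trailing bytes after the value are tolerated in this sketch.
    return (long) keyLen + valueLen <= length - 2 * INT_SIZE;
  }
}
```

Rejecting a malformed buffer at construction time reports the corruption where it enters, instead of as an out-of-bounds read much later.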





[jira] [Commented] (HBASE-21498) Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a new BlockCache

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724677#comment-16724677
 ] 

stack commented on HBASE-21498:
---

+1 for branch-2.0 please [~zghaobac] 

> Master OOM when SplitTableRegionProcedure new CacheConfig and instantiate a 
> new BlockCache
> --
>
> Key: HBASE-21498
> URL: https://issues.apache.org/jira/browse/HBASE-21498
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21498.master.001.patch, 
> HBASE-21498.master.002.patch, HBASE-21498.master.003.patch, 
> HBASE-21498.master.004.patch, HBASE-21498.master.005.patch, 
> HBASE-21498.master.006.patch, HBASE-21498.master.006.patch, 
> HBASE-21498.master.007.patch, HBASE-21498.master.007.patch
>
>
> In our cluster, we use a small heap/offheap config for the master. After 
> HBASE-21290, the master doesn't instantiate a BlockCache when it does not carry 
> tables. But it still creates a new CacheConfig in the 
> SplitTableRegionProcedure.splitStoreFiles method, which instantiates a new 
> BlockCache if one was not initialized before and makes the master OOM.





[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724678#comment-16724678
 ] 

stack commented on HBASE-21514:
---

Thanks [~zghaobac] I put a +1 over on the HBASE-21498. Good find.

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and mob file cache as member variables of HRegionServer. 
> Each RS has one block cache and one mob file cache.
>  # Move the global cache instances from CacheConfig to BlockCacheFactory. 
> Only keep config stuff in CacheConfig. CacheConfig still holds a 
> reference to the RegionServer's block cache. A block is cached only if the 
> block cache is present and the related config is true.
>  # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance. After moving the mob file cache to RegionServer, it is not used 
> anymore.





[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724670#comment-16724670
 ] 

Guanghao Zhang commented on HBASE-21514:


{quote}There is risk with a patch of this size.
{quote}
Yes. It is a little big for branch-2.1 and branch-2.0. I think it is OK to only 
merge this to master and branch-2. But there is another issue, HBASE-21498, which 
I think should be merged to branch-2.1 and branch-2.0. Please take a look 
at HBASE-21498 [~stack]

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and mob file cache as member variables of HRegionServer. 
> Each RS has one block cache and one mob file cache.
>  # Move the global cache instances from CacheConfig to BlockCacheFactory. 
> Only keep config stuff in CacheConfig. CacheConfig still holds a 
> reference to the RegionServer's block cache. A block is cached only if the 
> block cache is present and the related config is true.
>  # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance. After moving the mob file cache to RegionServer, it is not used 
> anymore.





[jira] [Commented] (HBASE-21020) Determine WAL API changes for replication

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724660#comment-16724660
 ] 

Hadoop QA commented on HBASE-21020:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} |
|| || || || {color:brown} HBASE-20952 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 54s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 27s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  8s{color} | {color:green} HBASE-20952 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 51s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  2m  0s{color} | {color:blue} hbase-server in HBASE-20952 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 15s{color} | {color:green} HBASE-20952 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 25s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  5m 25s{color} | {color:red} root generated 1 new + 1148 unchanged - 1 fixed = 1149 total (was 1149) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m  2s{color} | {color:red} root: The patch generated 4 new + 191 unchanged - 13 fixed = 195 total (was 204) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 50s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m  9s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 42s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  6s{color} | {color:green} hbase-server generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 28s{color} | {color:red} hbase-server generated 10 new + 2 unchanged - 0 fixed = 12 total (was 2) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  2m 26s{color} | {color:red} root generated 10 new + 6 unchanged - 0 fixed = 16 total (was 6) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}293m 42s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 23s{color} | {color:green} The 

[jira] [Commented] (HBASE-21617) HBase Bytes.putBigDecimal error

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724645#comment-16724645
 ] 

Zheng Hu commented on HBASE-21617:
--

Oh, it seems putBigDecimal writes the bytes of the big decimal into another newly 
created byte array. I think the useless putBigDecimal method can be removed now, 
or we can fix this bug and add a UT. 

> HBase Bytes.putBigDecimal error
> ---
>
> Key: HBASE-21617
> URL: https://issues.apache.org/jira/browse/HBASE-21617
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 2.1.0, 2.0.0, 2.1.1
> Environment: JDK 1.8
>Reporter: apcahephoenix
>Priority: Major
>
> *hbase-common/*
> *org.apache.hadoop.hbase.util.Bytes:*
> public static int putBigDecimal(byte[] bytes, int offset, BigDecimal val) {
>   if (bytes == null){
>     return offset;
>   }
>   byte[] valueBytes = val.unscaledValue().toByteArray();
>   byte[] result = new byte[valueBytes.length + SIZEOF_INT];
>   offset = putInt(result, offset, val.scale());
> {color:#d04437}return putBytes(result, offset, valueBytes, 0, valueBytes.length); // this one, bytes is not used{color}
>  }
> *Test:*
>  byte[] bytes = new byte[64];
>  BigDecimal bigDecimal = new BigDecimal("100.10");
>  Bytes.putBigDecimal(bytes, 4, bigDecimal);
>  System.out.println(Arrays.toString(bytes)); // invalid
> *Suggest:*
>  public static int putBigDecimal(byte[] bytes, int offset, BigDecimal val) {
>   byte[] valueBytes = toBytes(val);
>   return putBytes(bytes, offset, valueBytes, 0, valueBytes.length);
>  }
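
For anyone who wants to try the behaviour without an HBase checkout, below is a self-contained sketch of a variant that writes into the caller's array, together with a reader that could back a unit test. It assumes the encoding is a 4-byte big-endian scale followed by the unscaled value's bytes; BigDecimalBytesSketch and readBigDecimal are hypothetical helpers, not part of org.apache.hadoop.hbase.util.Bytes.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

final class BigDecimalBytesSketch {
  static final int SIZEOF_INT = 4;

  // Writes val big-endian into bytes[offset..offset+3]; returns the next free offset.
  static int putInt(byte[] bytes, int offset, int val) {
    for (int i = offset + SIZEOF_INT - 1; i >= offset; i--) {
      bytes[i] = (byte) val;
      val >>>= 8;
    }
    return offset + SIZEOF_INT;
  }

  // Writes scale + unscaled bytes into the caller's array, not into a temporary one.
  static int putBigDecimal(byte[] bytes, int offset, BigDecimal val) {
    if (bytes == null) {
      return offset;
    }
    byte[] valueBytes = val.unscaledValue().toByteArray();
    offset = putInt(bytes, offset, val.scale());
    System.arraycopy(valueBytes, 0, bytes, offset, valueBytes.length);
    return offset + valueBytes.length;
  }

  // Reads back a value written by putBigDecimal; length covers scale and unscaled bytes.
  static BigDecimal readBigDecimal(byte[] bytes, int offset, int length) {
    int scale = 0;
    for (int i = offset; i < offset + SIZEOF_INT; i++) {
      scale = (scale << 8) | (bytes[i] & 0xFF);
    }
    byte[] unscaled = Arrays.copyOfRange(bytes, offset + SIZEOF_INT, offset + length);
    return new BigDecimal(new BigInteger(unscaled), scale);
  }
}
```

A round trip such as putBigDecimal(bytes, 4, new BigDecimal("100.10")) followed by readBigDecimal(bytes, 4, end - 4) is the kind of assertion a UT for this fix could make.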





[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: HBASE-21565.branch-2.002.patch

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.branch-2.002.patch, HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch, 
> HBASE-21565.master.004.patch, HBASE-21565.master.005.patch, 
> HBASE-21565.master.006.patch, HBASE-21565.master.007.patch, 
> HBASE-21565.master.008.patch, HBASE-21565.master.009.patch, 
> HBASE-21565.master.010.patch
>
>
> Two kinds of SCP can be scheduled for the same server during a cluster 
> restart: one is triggered by the ZK session timeout, the other when a new server 
> reports in and causes the stale one to fail over. The only barrier between these 
> two kinds of SCP is a check of whether the server is in the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> But the problem is that when the master finishes initialization, it deletes all 
> stale servers from the dead server list. Thus when the SCP for the ZK session 
> timeout comes in, the barrier has already been removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to the problem that regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list before 
> the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
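
The race described above can be modelled with a very small sketch. It is only an illustration under simplifying assumptions: DeadServerBarrierSketch and its methods are hypothetical, server names are plain strings, and the real ServerManager and DeadServer classes do considerably more.

```java
import java.util.HashSet;
import java.util.Set;

final class DeadServerBarrierSketch {
  private final Set<String> deadServers = new HashSet<>();
  private int scheduledScps = 0;

  /** Mirrors the quoted check: refuse a second SCP while the server is still listed as dead. */
  boolean expireServer(String serverName) {
    if (deadServers.contains(serverName)) {
      return false; // crash processing already in progress
    }
    deadServers.add(serverName);
    scheduledScps++; // stands in for submitting a ServerCrashProcedure
    return true;
  }

  /** Models the cleanup after master initialization that the report identifies as premature. */
  void removeFromDeadList(String serverName) {
    deadServers.remove(serverName);
  }

  int scheduledScps() {
    return scheduledScps;
  }
}
```

If removeFromDeadList runs between two expireServer calls for the same server, both calls succeed and two SCPs are counted, which matches the pid=9 and pid=444 procedures in the logs.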





[jira] [Commented] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Jingyun Tian (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724646#comment-16724646
 ] 

Jingyun Tian commented on HBASE-21565:
--

These failed UTs are flaky; they all passed on my own desktop. I'll trigger 
the test again.
bq. +1 for branch-2.0 when it passes tests. 
Do you mean port this patch to branch-2.0? That may take some extra effort, 
since other patches may need to be ported as well. Should I open a new issue for 
this? [~stack]

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.branch-2.002.patch, HBASE-21565.master.001.patch, 
> HBASE-21565.master.002.patch, HBASE-21565.master.003.patch, 
> HBASE-21565.master.004.patch, HBASE-21565.master.005.patch, 
> HBASE-21565.master.006.patch, HBASE-21565.master.007.patch, 
> HBASE-21565.master.008.patch, HBASE-21565.master.009.patch, 
> HBASE-21565.master.010.patch
>
>
> Two kinds of SCP can be scheduled for the same server during a cluster 
> restart: one is triggered by the ZK session timeout, the other when a new server 
> reports in and causes the stale one to fail over. The only barrier between these 
> two kinds of SCP is a check of whether the server is in the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> But the problem is that when the master finishes initialization, it deletes all 
> stale servers from the dead server list. Thus when the SCP for the ZK session 
> timeout comes in, the barrier has already been removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to the problem that regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list before 
> the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  





[jira] [Created] (HBASE-21617) HBase Bytes.putBigDecimal error

2018-12-18 Thread apcahephoenix (JIRA)
apcahephoenix created HBASE-21617:
-

 Summary: HBase Bytes.putBigDecimal error
 Key: HBASE-21617
 URL: https://issues.apache.org/jira/browse/HBASE-21617
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 2.1.1, 2.0.0, 2.1.0
 Environment: JDK 1.8
Reporter: apcahephoenix


*hbase-common/*

*org.apache.hadoop.hbase.util.Bytes:*

public static int putBigDecimal(byte[] bytes, int offset, BigDecimal val) {
  if (bytes == null){
    return offset;
  }
  byte[] valueBytes = val.unscaledValue().toByteArray();
  byte[] result = new byte[valueBytes.length + SIZEOF_INT];
  offset = putInt(result, offset, val.scale());
{color:#d04437}return putBytes(result, offset, valueBytes, 0, valueBytes.length); // this one, bytes is not used{color}
 }

*Test:*
 byte[] bytes = new byte[64];
 BigDecimal bigDecimal = new BigDecimal("100.10");
 Bytes.putBigDecimal(bytes, 4, bigDecimal);
 System.out.println(Arrays.toString(bytes)); // invalid

*Suggest:*
 public static int putBigDecimal(byte[] bytes, int offset, BigDecimal val) {
  byte[] valueBytes = toBytes(val);
  return putBytes(bytes, offset, valueBytes, 0, valueBytes.length);
 }





[jira] [Commented] (HBASE-21563) HBase Get Encounters java.lang.IndexOutOfBoundsException

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724638#comment-16724638
 ] 

Zheng Hu commented on HBASE-21563:
--

I suspect something goes wrong during FastDiff compression or 
decompression. It does not seem related to HBASE-21379. I'm checking the FastDiff 
code carefully...

> HBase Get Encounters java.lang.IndexOutOfBoundsException
> 
>
> Key: HBASE-21563
> URL: https://issues.apache.org/jira/browse/HBASE-21563
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 1.2.0
>Reporter: William Shen
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 67a04bc049be4f58afecdcc0a3ba62ca.tar.gz
>
>
> We've recently encountered an issue retrieving data from our HBase cluster, and 
> have not had much luck troubleshooting the issue. We narrowed down our issue 
> to a single GET, which appears to be caused by FastDiffDeltaEncoder.java 
> running into java.lang.IndexOutOfBoundsException. 
> Perhaps there is a bug on a corner case for FastDiffDeltaEncoder? 
> We are running 1.2.0-cdh5.9.2, and the GET in question is:
> {noformat}
> hbase(main):004:0> get 'qa2.ADGROUPS', 
> "\x05\x80\x00\x00\x00\x00\x1F\x54\x9C\x80\x00\x00\x00\x00\x1C\x7D\x45\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00\x4A\x64\x6F\x80\x00\x00\x00\x01\xD9\xDB\xCE"
> COLUMN  CELL
>   
>
> ERROR: java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
> Caused by: java.lang.IndexOutOfBoundsException
> at java.nio.Buffer.checkBounds(Buffer.java:567)
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:149)
> at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:465)
> at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:516)
> at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:618)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.next(HFileReaderV2.java:1277)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:180)
> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:588)
> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5706)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5865)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5643)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5620)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5606)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6801)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6779)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2029)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> ... 3 more {noformat}
> Likewise, when running {{ hbase hfile -f -p }} on the specific hfile, a subset 
> of the KV pairs was printed before the program hit the following exception 
> and crashed:
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: Unknown code 65
> at org.apache.hadoop.hbase.KeyValue$Type.codeToType(KeyValue.java:259)
> at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:1246)
> at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$ClonedSeekerState.toString(BufferedDataBlockEncoder.java:506)
> at java.lang.String.valueOf(String.java:2994)
> at java.lang.StringBuilder.append(StringBuilder.java:131)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:382)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:316)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:255)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> 

[jira] [Created] (HBASE-21616) Port HBASE-21034 (Add new throttle type: read/write capacity unit) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-21616:
--

 Summary: Port HBASE-21034 (Add new throttle type: read/write 
capacity unit) to branch-1
 Key: HBASE-21616
 URL: https://issues.apache.org/jira/browse/HBASE-21616
 Project: HBase
  Issue Type: Task
Reporter: Andrew Purtell
 Fix For: 1.5.0


Port HBASE-21034 (Add new throttle type: read/write capacity unit) to branch-1



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21401) Sanity check when constructing the KeyValue

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724637#comment-16724637
 ] 

Zheng Hu commented on HBASE-21401:
--

bq.  I looked at the patch and I still see double-parse, no? (Once to check 
byte array contains a wholesome KV and then the usual parse that happens as 
part of KV usage?). Was thinking we could check wholesomeness inline with use?

Yes, it's a double parse now: once to check that the KV is wholesome, then again 
to parse the specific fields such as row/family/qualifier/ts/type. I did not 
move the wholesomeness check inline with use, because in the upper layers 
cell.getRowOffset() and cell.getRowLength() are called many times. Take scan 
processing as an example: 
step.1  load the block from the hfile, and let the cell reference the block; 
step.2  compare the row part with the scan's startRow or stopRow, calling 
cell.getRowOffset() and cell.getRowLength(); 
step.3  merge with other hfiles, which again compares the row part and calls 
cell.getRowOffset() and cell.getRowLength(); 
step.4  apply filters, comparing the row/family/qualifier/value; 
step.5  merge with other stores, comparing the row part again ... 

I mean that getRowOffset() and getRowLength() (or 
getFamilyOffset()/getFamilyLength() ...) are used so many times in the upper 
layers. If we move the row sanity check into getRowOffset() and 
getRowLength(), and the family sanity check into getFamilyOffset() and 
getFamilyLength(), the sanity check will parse the relevant fields just as 
many times. The cost would be even larger than the double parse, so I think the 
double parse is better in our case.

Please correct me if I misunderstood or missed something.
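
To make the trade-off concrete, here is a minimal sketch of the "check once at 
construction, then read without re-checking" approach. It is not the code from 
the attached patches: the class name CheckedKeyValueSketch, its helpers, and 
the simplified layout (keyLen, valueLen, rowLen, row, famLen, family, 
qualifier, timestamp, type, value, with tags omitted) are assumptions made 
for illustration only.

```java
import java.nio.ByteBuffer;

/** Hypothetical, simplified cell: the layout is validated once, then read many times. */
final class CheckedKeyValueSketch {
  // Assumed layout (tags omitted):
  // keyLen(4) valueLen(4) rowLen(2) row famLen(1) family qualifier timestamp(8) type(1) value
  private static final int HEADER = 8;          // keyLen + valueLen
  private static final int FIXED_KEY_PART = 12; // rowLen(2) + famLen(1) + timestamp(8) + type(1)

  private final byte[] buf;
  private final int offset;

  CheckedKeyValueSketch(byte[] buf, int offset, int length) {
    checkLayout(buf, offset, length); // the one extra parse, paid at construction
    this.buf = buf;
    this.offset = offset;
  }

  /** Rejects buffers whose declared field lengths do not fit inside the cell. */
  static void checkLayout(byte[] buf, int offset, int length) {
    if (offset < 0 || length < HEADER || (long) offset + length > buf.length) {
      throw new IllegalArgumentException("cell exceeds the backing array");
    }
    ByteBuffer bb = ByteBuffer.wrap(buf, offset, length);
    int keyLen = bb.getInt();
    int valueLen = bb.getInt();
    if (keyLen < FIXED_KEY_PART || valueLen < 0
        || (long) HEADER + keyLen + valueLen != length) {
      throw new IllegalArgumentException("bad key/value length");
    }
    int rowLen = bb.getShort();
    if (rowLen < 0 || FIXED_KEY_PART + rowLen > keyLen) {
      throw new IllegalArgumentException("row exceeds the key");
    }
    int famLen = buf[offset + HEADER + 2 + rowLen] & 0xff;
    if (FIXED_KEY_PART + rowLen + famLen > keyLen) {
      throw new IllegalArgumentException("family exceeds the key");
    }
  }

  // Hot-path accessors: no bounds checks here, they rely on the constructor's check.
  int getRowOffset() {
    return offset + HEADER + 2;
  }

  int getRowLength() {
    return ((buf[offset + HEADER] & 0xff) << 8) | (buf[offset + HEADER + 1] & 0xff);
  }

  /** Test helper that serializes a cell in the assumed layout. */
  static byte[] encode(byte[] row, byte[] family, byte[] qualifier,
                       long ts, byte type, byte[] value) {
    int keyLen = FIXED_KEY_PART + row.length + family.length + qualifier.length;
    ByteBuffer bb = ByteBuffer.allocate(HEADER + keyLen + value.length);
    bb.putInt(keyLen).putInt(value.length);
    bb.putShort((short) row.length).put(row);
    bb.put((byte) family.length).put(family).put(qualifier);
    bb.putLong(ts).put(type).put(value);
    return bb.array();
  }
}
```

With this shape the accessors stay as cheap as before, and a corrupted length 
is rejected when the cell is built rather than on some later read.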

> Sanity check when constructing the KeyValue
> ---
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch, 
> HBASE-21401.v3.patch, HBASE-21401.v4.patch, HBASE-21401.v4.patch, 
> HBASE-21401.v5.patch, HBASE-21401.v6.patch, HBASE-21401.v7.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (that is, without checking whether 
> each field's offset exceeds the byte buffer), so an 
> ArrayIndexOutOfBoundsException may occur when reading the cell's fields, as in 
> HBASE-21379. This kind of bug is hard to debug. 
> An earlier check will help to find such bugs.





[jira] [Resolved] (HBASE-21615) Port HBASE-19994 (Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-21615.

Resolution: Fixed

Oops, here I am dup-ing my own previous work. 

> Port HBASE-19994 (Create a new class for RPC throttling exception, make it 
> retryable) to branch-1
> -
>
> Key: HBASE-21615
> URL: https://issues.apache.org/jira/browse/HBASE-21615
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
>
> Backport the change from HBASE-19994 to branch-1 but make the new behavior 
> configurable. Still changes the interfaces. Could be ok for a minor release 
> (1.5.0)





[jira] [Updated] (HBASE-21615) Port HBASE-19994 (Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21615:
---
Fix Version/s: (was: 1.5.0)

> Port HBASE-19994 (Create a new class for RPC throttling exception, make it 
> retryable) to branch-1
> -
>
> Key: HBASE-21615
> URL: https://issues.apache.org/jira/browse/HBASE-21615
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
>
> Backport the change from HBASE-19994 to branch-1 but make the new behavior 
> configurable. Still changes the interfaces. Could be ok for a minor release 
> (1.5.0)





[jira] [Commented] (HBASE-21615) Port HBASE-19994 (Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724631#comment-16724631
 ] 

Andrew Purtell commented on HBASE-21615:


The essence of the compatibility story:
{code:java}
+  useRetryableThrottlingException = rsServices.getConfiguration()
+      .getBoolean(QuotaUtil.QUOTA_RETRYABLE_THROTTING_EXCEPTION_CONF_KEY,
+          QuotaUtil.QUOTA_RETRYABLE_THROTTING_EXCEPTION_DEFAULT);
{code}
Then:
{code:java}
+  // Depending on whether we are supposed to throw a retryable IO exception or not, choose
+  // the correct exception type to (re)throw
+  if (e instanceof ThrottlingException) {
+    if (useRetryableThrottlingException) {
+      throw new RpcThrottlingException(e.getMessage());
+    } else {
+      throw e;
+    }
+  } else if (e instanceof RpcThrottlingException) {
+    if (useRetryableThrottlingException) {
+      throw e;
+    } else {
+      throw new ThrottlingException(e.getMessage());
+    }
+  } else {
+    LOG.warn("Unexpected exception from quota check", e);
+    throw e;
+  }
{code}
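
For readers without the patch at hand, the same selection logic can be sketched 
as a self-contained class. The names below (ThrottleCompatSketch, 
LegacyThrottlingException, RetryableThrottlingException, translate) are 
hypothetical stand-ins, not HBase classes, and the sketch returns the exception 
instead of throwing it so the choice is easy to test.

```java
import java.io.IOException;

/** Hypothetical stand-in for the configurable exception translation shown above. */
final class ThrottleCompatSketch {
  /** Stand-in for the older, non-retryable throttling exception. */
  static class LegacyThrottlingException extends IOException {
    LegacyThrottlingException(String msg) {
      super(msg);
    }
  }

  /** Stand-in for the newer, retryable throttling exception. */
  static class RetryableThrottlingException extends IOException {
    RetryableThrottlingException(String msg) {
      super(msg);
    }
  }

  /** Picks the exception type matching the configured behavior. */
  static IOException translate(IOException e, boolean useRetryable) {
    if (e instanceof LegacyThrottlingException) {
      return useRetryable ? new RetryableThrottlingException(e.getMessage()) : e;
    }
    if (e instanceof RetryableThrottlingException) {
      return useRetryable ? e : new LegacyThrottlingException(e.getMessage());
    }
    return e; // anything else passes through unchanged
  }
}
```

Whichever type the quota check produces, callers only ever see the type that 
the configuration asks for, which is what keeps old clients working.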



[jira] [Updated] (HBASE-21615) Port HBASE-19994 (Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21615:
---
Summary: Port HBASE-19994 (Create a new class for RPC throttling exception, 
make it retryable) to branch-1  (was: Port HBASE-19994 ((Create a new class for 
RPC throttling exception, make it retryable) to branch-1)



[jira] [Updated] (HBASE-21615) Port HBASE-19994 ((Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21615:
---
Description: Backport the change from HBASE-19994 to branch-1 but make the 
new behavior configurable. Still changes the interfaces. Could be ok for a 
minor release (1.5.0)

> Port HBASE-19994 ((Create a new class for RPC throttling exception, make it 
> retryable) to branch-1
> --
>
> Key: HBASE-21615
> URL: https://issues.apache.org/jira/browse/HBASE-21615
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0
>
>
> Backport the change from HBASE-19994 to branch-1 but make the new behavior 
> configurable. Still changes the interfaces. Could be ok for a minor release 
> (1.5.0)





[jira] [Created] (HBASE-21615) Port HBASE-19994 ((Create a new class for RPC throttling exception, make it retryable) to branch-1

2018-12-18 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-21615:
--

 Summary: Port HBASE-19994 ((Create a new class for RPC throttling 
exception, make it retryable) to branch-1
 Key: HBASE-21615
 URL: https://issues.apache.org/jira/browse/HBASE-21615
 Project: HBase
  Issue Type: Sub-task
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 1.5.0








[jira] [Updated] (HBASE-21492) CellCodec Written To WAL Before It's Verified

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21492:
---
Fix Version/s: 1.4.10
   1.5.0

> CellCodec Written To WAL Before It's Verified
> -
>
> Key: HBASE-21492
> URL: https://issues.apache.org/jira/browse/HBASE-21492
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.2.7, 2.0.2
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Critical
> Fix For: 3.0.0, 1.5.0, 2.2.0, 2.1.2, 2.0.4, 1.4.10
>
> Attachments: HBASE-21492-branch-1.patch, HBASE-21492.1.patch, 
> HBASE-21492.2.patch, HBASE-21492.2.patch
>
>
> The cell codec class name is written into the WAL file, but the cell codec 
> class is not actually verified to exist.  Therefore, users can inadvertently 
> configure an invalid class name and it will be recorded into the WAL file.  
> At that point, the WAL file becomes unreadable and blocks processing of all 
> other WAL files.
> {code:java|title=AbstractProtobufLogWriter.java}
>   private WALHeader buildWALHeader0(Configuration conf, WALHeader.Builder builder) {
>     if (!builder.hasWriterClsName()) {
>       builder.setWriterClsName(getWriterClassName());
>     }
>     if (!builder.hasCellCodecClsName()) {
>       builder.setCellCodecClsName(WALCellCodec.getWALCellCodecClass(conf));
>     }
>     return builder.build();
>   }
> {code}
> https://github.com/apache/hbase/blob/025ddce868eb06b4072b5152c5ffae5a01e7ae30/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractProtobufLogWriter.java#L78-L86
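
One way to picture the missing verification is a small guard that resolves the 
configured class name before it is recorded anywhere. This is only a sketch 
under assumptions, not the attached patch: CodecClassCheckSketch and 
verifiedClassName are hypothetical names, and the expected super type is passed 
in by the caller.

```java
/** Hypothetical guard: resolve a configured class name before persisting it. */
final class CodecClassCheckSketch {
  /**
   * Returns the class name only if it loads and is a subtype of expectedSuperType;
   * otherwise fails fast, before anything has been written.
   */
  static String verifiedClassName(String className, Class<?> expectedSuperType) {
    final Class<?> cls;
    try {
      cls = Class.forName(className);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException("Configured class not found: " + className, e);
    }
    if (!expectedSuperType.isAssignableFrom(cls)) {
      throw new IllegalArgumentException(
          className + " is not a " + expectedSuperType.getName());
    }
    return cls.getName();
  }
}
```

Failing at configuration time turns an unreadable WAL file into an immediate, 
local error for whoever supplied the bad class name.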





[jira] [Updated] (HBASE-21492) CellCodec Written To WAL Before It's Verified

2018-12-18 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21492:
---
Attachment: HBASE-21492-branch-1.patch



[jira] [Commented] (HBASE-20984) Add/Modify test case to check custom hbase.wal.dir outside hdfs filesystem

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724594#comment-16724594
 ] 

Hadoop QA commented on HBASE-20984:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 47s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m  6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 48s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 30s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}276m  7s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}313m 40s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestAssignmentManagerMetrics |
|   | hadoop.hbase.client.TestFromClientSide3 |
|   | hadoop.hbase.regionserver.TestRegionReplicaFailover |
|   | hadoop.hbase.client.TestSnapshotDFSTemporaryDirectory |
|   | hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
|   | hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas |
|   | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20984 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952081/hbase-20984.master.003.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux dcacffd79718 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / c448604ceb |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| 

[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724572#comment-16724572
 ] 

Hudson commented on HBASE-21592:


Results for branch branch-2
[build #1565 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1565/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1565//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1565//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1565//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 3. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1565//artifact/output-integration/hadoop-3.log]. (note that this means we didn't check the Hadoop 3 shaded client)


> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to throw 
> an NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123")));
> {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more
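
The trace points at a size calculation dereferencing a result that carries no 
cells. A null-tolerant version of such a calculation might look like the sketch 
below. It is an illustration only: FakeResult and its cellSizes field are 
hypothetical stand-ins, not the HBase Result API, and this is not necessarily 
how the attached patches fix the problem.

```java
/** Hypothetical sketch of a result-size calculation that tolerates empty results. */
final class ResultSizeSketch {
  /** Stand-in for a Get result; cellSizes may be null for an existence-only check. */
  static final class FakeResult {
    final long[] cellSizes;

    FakeResult(long[] cellSizes) {
      this.cellSizes = cellSizes;
    }
  }

  /** Sums the cell sizes, treating a missing result or missing cells as size 0. */
  static long calculateResultSize(FakeResult result) {
    if (result == null || result.cellSizes == null) {
      return 0L;
    }
    long size = 0L;
    for (long cellSize : result.cellSizes) {
      size += cellSize;
    }
    return size;
  }
}
```

Counting an existence-only answer as zero bytes keeps the quota accounting 
running instead of failing the whole RPC.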





[jira] [Updated] (HBASE-21614) RIT recovery with ServerCrashProcedure is broken in multiple ways

2018-12-18 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21614:
-
Description: 
Master is restarting after a previous master crashed while recovering some 
regions from a dead server.

Master recovers RIT for the region, however the RIT has no location (logged, at 
least) in CONFIRM_CLOSE state. That is a potential problem #0.5 - confirm 
where? But that should be covered by meta, so not a big deal, right. As such it 
doesn't seem to add the region to server map anywhere
{noformat}
2018-12-17 14:51:14,606 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Attach pid=38015, 
state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to 
rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT
{noformat}

However, in this case ServerCrashProcedure for the server kicks off BEFORE meta 
is loaded.
That seems to be a problem #1 - it immediately gets regions to later recover, 
so in this case it gets nothing.
I've grepped our logs for successful cases of SCP interacting with region 
transition at startup, and in all cases the meta was loaded before SCP.
Seems like a race condition.
{noformat}
2018-12-17 14:51:14,625 INFO  [master/:17000:becomeActiveMaster] 
master.RegionServerTracker: Starting RegionServerTracker; 0 have existing 
ServerCrashProcedures, 103 possibly 'live' servers, and 1 'splitting'.
2018-12-17 14:51:20,770 INFO  [master/:17000:becomeActiveMaster] 
master.ServerManager: Processing expiration of server1,17020,1544636616174 on 
,17000,1545087053243
2018-12-17 14:51:20,921 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Added server1,17020,1544636616174 to dead servers 
which carryingMeta=false, submitted ServerCrashProcedure pid=111298
2018-12-17 14:51:30,728 INFO  [PEWorker-13] procedure.ServerCrashProcedure: 
Start pid=111298, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; 
ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, 
meta=false
{noformat}
Meta is only loaded 11-12 seconds later.
If one looks at meta-loading code however, there is one more problem #2 - the 
region is in CLOSING state, so the {{addRegionToServer}} is not going to be 
called - it's only called for OPENED regions. 
Expanding on the above, I've only seen SCP unblock stuck region transition at 
startup when region started out in meta as OPEN.
{noformat}
2018-12-17 14:51:42,403 INFO  [master/:17000:becomeActiveMaster] 
assignment.RegionStateStore: Load hbase:meta entry region=region1, 
regionState=CLOSING, lastHost=server1,17020,1544636616174, 
regionLocation=server1,17020,1544636616174, openSeqNum=629131
{noformat}
SCP predictably finishes without doing anything; no other logs for this pid
{noformat}
2018-12-17 14:52:19,046 INFO  [PEWorker-2] procedure2.ProcedureExecutor: 
Finished pid=111298, state=SUCCESS, hasLock=false; ServerCrashProcedure 
server=server1,17020,1544636616174, splitWal=true, meta=false in 58.0010sec
{noformat}
After that, region is still stuck trying to be closed in 
TransitRegionStateProcedure; it's in the same state for hours including across 
master restarts.
{noformat}
2018-12-17 15:09:35,216 WARN  [PEWorker-14] 
assignment.TransitRegionStateProcedure: Failed transition, suspend 604secs 
pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE; rit=CLOSING, 
location=server1,17020,1544636616174; waiting on rectified condition fixed by 
other Procedure or operator intervention
{noformat}
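
The two problems described above (SCP taking its region list before meta is 
loaded, and the meta load registering only OPEN regions against a server) can 
be reproduced in a toy model. Everything in the sketch below is hypothetical: 
ScpOrderingSketch, loadMetaEntry and regionsToRecover are illustrative names 
that mimic the described behavior, not the master's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the server-to-regions map consulted by a crash procedure. */
final class ScpOrderingSketch {
  enum State { OPEN, CLOSING }

  private final Map<String, List<String>> serverToRegions = new HashMap<>();

  /** Mimics the described meta load: only OPEN regions are registered against a server. */
  void loadMetaEntry(String region, State state, String server) {
    if (state == State.OPEN) {
      serverToRegions.computeIfAbsent(server, s -> new ArrayList<>()).add(region);
    }
  }

  /** Mimics the crash procedure snapshotting its region list once, at start. */
  List<String> regionsToRecover(String server) {
    return new ArrayList<>(
        serverToRegions.getOrDefault(server, Collections.emptyList()));
  }
}
```

In this model a snapshot taken before any loadMetaEntry call is empty, and a 
region loaded in CLOSING state never appears in a later snapshot either, which 
matches the stuck transition in the logs.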



  was:
Master is restarting after a previous master crashed while recovering some 
regions from a dead server.

Master recovers RIT for the region, however the RIT has no location (logged, at 
least) in CONFIRM_CLOSE state. That is a potential problem #0.5 - confirm 
where? But that should be covered by meta, so not a big deal, right.
{noformat}
2018-12-17 14:51:14,606 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Attach pid=38015, 
state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to 
rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT
{noformat}

However, in this case ServerCrashProcedure for the server kicks off BEFORE meta 
is loaded.
That seems to be a problem #1 - it immediately gets regions to later recover, 
so in this case it gets nothing.
I've grepped our logs for successful cases of SCP interacting with region 
transition at startup, and in all cases the meta was loaded before SCP.
Seems like a race condition.
{noformat}
2018-12-17 14:51:14,625 INFO  [master/:17000:becomeActiveMaster] 
master.RegionServerTracker: Starting RegionServerTracker; 0 have existing 
ServerCrashProcedures, 103 possibly 'live' 

[jira] [Updated] (HBASE-21614) RIT recovery with ServerCrashProcedure is broken in multiple ways

2018-12-18 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21614:
-
Description: 
Master is restarting after a previous master crashed while recovering some 
regions from a dead server.

Master recovers RIT for the region, however the RIT has no location (logged, at 
least) in CONFIRM_CLOSE state. That is a potential problem #0.5 - confirm 
where? But that should be covered by meta, so not a big deal, right.
{noformat}
2018-12-17 14:51:14,606 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Attach pid=38015, 
state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to 
rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT
{noformat}

However, in this case ServerCrashProcedure for the server kicks off BEFORE meta 
is loaded.
That seems to be a problem #1 - it immediately gets regions to later recover, 
so in this case it gets nothing.
I've grepped our logs for successful cases of SCP interacting with region 
transition at startup, and in all cases the meta was loaded before SCP.
Seems like a race condition.
{noformat}
2018-12-17 14:51:14,625 INFO  [master/:17000:becomeActiveMaster] 
master.RegionServerTracker: Starting RegionServerTracker; 0 have existing 
ServerCrashProcedures, 103 possibly 'live' servers, and 1 'splitting'.
2018-12-17 14:51:20,770 INFO  [master/:17000:becomeActiveMaster] 
master.ServerManager: Processing expiration of server1,17020,1544636616174 on 
,17000,1545087053243
2018-12-17 14:51:20,921 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Added server1,17020,1544636616174 to dead servers 
which carryingMeta=false, submitted ServerCrashProcedure pid=111298
2018-12-17 14:51:30,728 INFO  [PEWorker-13] procedure.ServerCrashProcedure: 
Start pid=111298, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; 
ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, 
meta=false
{noformat}
Meta is only loaded 11-12 seconds later.
If one looks at meta-loading code however, there is one more problem #2 - the 
region is in CLOSING state, so the {{addRegionToServer}} is not going to be 
called - it's only called for OPENED regions. 
Expanding on the above, I've only seen SCP unblock stuck region transition at 
startup when region started out in meta as OPEN.
{noformat}
2018-12-17 14:51:42,403 INFO  [master/:17000:becomeActiveMaster] 
assignment.RegionStateStore: Load hbase:meta entry region=region1, 
regionState=CLOSING, lastHost=server1,17020,1544636616174, 
regionLocation=server1,17020,1544636616174, openSeqNum=629131
{noformat}
SCP predictably finishes without doing anything; no other logs for this pid
{noformat}
2018-12-17 14:52:19,046 INFO  [PEWorker-2] procedure2.ProcedureExecutor: 
Finished pid=111298, state=SUCCESS, hasLock=false; ServerCrashProcedure 
server=server1,17020,1544636616174, splitWal=true, meta=false in 58.0010sec
{noformat}
After that, region is still stuck trying to be closed in 
TransitRegionStateProcedure; it's in the same state for hours including across 
master restarts.
{noformat}
2018-12-17 15:09:35,216 WARN  [PEWorker-14] 
assignment.TransitRegionStateProcedure: Failed transition, suspend 604secs 
pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE; rit=CLOSING, 
location=server1,17020,1544636616174; waiting on rectified condition fixed by 
other Procedure or operator intervention
{noformat}




[jira] [Updated] (HBASE-21614) RIT recovery with ServerCrashProcedure is broken in multiple ways

2018-12-18 Thread Sergey Shelukhin (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21614:
-
Description: 
Master is restarting after a previous master crashed while recovering some 
regions from a dead server.

Master recovers RIT for the region, however the RIT has no location (logged, at 
least) in CONFIRM_CLOSE state. That is a potential problem #1 - confirm where? 
But that should be covered by meta, so not a big deal, right.
{noformat}
2018-12-17 14:51:14,606 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Attach pid=38015, 
state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to 
rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT
{noformat}

However, in this case ServerCrashProcedure for the server kicks off BEFORE meta 
is loaded.
That seems to be a problem #2 - it immediately gets regions to later recover, 
so in this case it gets nothing.
I've grepped our logs for successful cases of SCP interacting with region 
transition at startup, and in all cases the meta was loaded before SCP.
Seems like a race condition.
{noformat}
2018-12-17 14:51:14,625 INFO  [master/:17000:becomeActiveMaster] 
master.RegionServerTracker: Starting RegionServerTracker; 0 have existing 
ServerCrashProcedures, 103 possibly 'live' servers, and 1 'splitting'.
2018-12-17 14:51:20,770 INFO  [master/:17000:becomeActiveMaster] 
master.ServerManager: Processing expiration of server1,17020,1544636616174 on 
,17000,1545087053243
2018-12-17 14:51:20,921 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Added server1,17020,1544636616174 to dead servers 
which carryingMeta=false, submitted ServerCrashProcedure pid=111298
2018-12-17 14:51:30,728 INFO  [PEWorker-13] procedure.ServerCrashProcedure: 
Start pid=111298, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; 
ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, 
meta=false
{noformat}
Meta is only loaded 11-12 seconds later.
If one looks at meta-loading code however, there is one more problem - the 
region is in CLOSING state, so the {{addRegionToServer}} is not going to be 
called - it's only called for OPENED regions. 
Expanding on the above, I've only seen SCP unblock stuck region transition at 
startup when region started out in meta as OPEN.
{noformat}
2018-12-17 14:51:42,403 INFO  [master/:17000:becomeActiveMaster] 
assignment.RegionStateStore: Load hbase:meta entry region=region1, 
regionState=CLOSING, lastHost=server1,17020,1544636616174, 
regionLocation=server1,17020,1544636616174, openSeqNum=629131
{noformat}
SCP predictably finishes without doing anything; no other logs for this pid
{noformat}
2018-12-17 14:52:19,046 INFO  [PEWorker-2] procedure2.ProcedureExecutor: 
Finished pid=111298, state=SUCCESS, hasLock=false; ServerCrashProcedure 
server=server1,17020,1544636616174, splitWal=true, meta=false in 58.0010sec
{noformat}
After that, region is still stuck trying to be closed in 
TransitRegionStateProcedure; it's in the same state for hours including across 
master restarts.
{noformat}
2018-12-17 15:09:35,216 WARN  [PEWorker-14] 
assignment.TransitRegionStateProcedure: Failed transition, suspend 604secs 
pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE; rit=CLOSING, 
location=server1,17020,1544636616174; waiting on rectified condition fixed by 
other Procedure or operator intervention
{noformat}




[jira] [Created] (HBASE-21614) RIT recovery with ServerCrashProcedure is broken in multiple ways

2018-12-18 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HBASE-21614:


 Summary: RIT recovery with ServerCrashProcedure is broken in multiple ways
 Key: HBASE-21614
 URL: https://issues.apache.org/jira/browse/HBASE-21614
 Project: HBase
 Issue Type: Bug
 Reporter: Sergey Shelukhin


Master is restarting after a previous master crashed while recovering some 
regions from a dead server.

Master recovers RIT for the region, however the RIT has no location (logged, at 
least) in CONFIRM_CLOSE state. That is a potential problem #1 - confirm where? 
But that should be covered by meta, so not a big deal, right.
{noformat}
2018-12-17 14:51:14,606 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Attach pid=38015, 
state=RUNNABLE:REGION_STATE_TRANSITION_CONFIRM_CLOSED, hasLock=false; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE to 
rit=OFFLINE, location=null, table=t1, region=region1 to restore RIT
{noformat}

However, in this case ServerCrashProcedure for the server kicks off BEFORE meta 
is loaded.
That seems to be a problem #2 - it immediately gets regions to later recover, 
so in this case it gets nothing.
I've grepped our logs for successful cases of SCP interacting with 
TRANSITION_CONFIRM_CLOSED, and in all cases the meta was loaded before SCP.
Seems like a race condition.
{noformat}
2018-12-17 14:51:14,625 INFO  [master/:17000:becomeActiveMaster] 
master.RegionServerTracker: Starting RegionServerTracker; 0 have existing 
ServerCrashProcedures, 103 possibly 'live' servers, and 1 'splitting'.
2018-12-17 14:51:20,770 INFO  [master/:17000:becomeActiveMaster] 
master.ServerManager: Processing expiration of server1,17020,1544636616174 on 
,17000,1545087053243
2018-12-17 14:51:20,921 INFO  [master/:17000:becomeActiveMaster] 
assignment.AssignmentManager: Added server1,17020,1544636616174 to dead servers 
which carryingMeta=false, submitted ServerCrashProcedure pid=111298
2018-12-17 14:51:30,728 INFO  [PEWorker-13] procedure.ServerCrashProcedure: 
Start pid=111298, state=RUNNABLE:SERVER_CRASH_START, hasLock=true; 
ServerCrashProcedure server=server1,17020,1544636616174, splitWal=true, 
meta=false
{noformat}
Meta is only loaded 11-12 seconds later.
If one looks at meta-loading code however, there is one more problem - the 
region is in CLOSING state, so the {{addRegionToServer}} is not going to be 
called - it's only called for OPENED regions. 
Expanding on the above, I've only seen SCP unblock stuck 
TRANSITION_CONFIRM_CLOSED when region started out in meta as OPEN.
{noformat}
2018-12-17 14:51:42,403 INFO  [master/:17000:becomeActiveMaster] 
assignment.RegionStateStore: Load hbase:meta entry region=region1, 
regionState=CLOSING, lastHost=server1,17020,1544636616174, 
regionLocation=server1,17020,1544636616174, openSeqNum=629131
{noformat}
SCP predictably finishes without doing anything; no other logs for this pid
{noformat}
2018-12-17 14:52:19,046 INFO  [PEWorker-2] procedure2.ProcedureExecutor: 
Finished pid=111298, state=SUCCESS, hasLock=false; ServerCrashProcedure 
server=server1,17020,1544636616174, splitWal=true, meta=false in 58.0010sec
{noformat}
After that, region is still stuck trying to be closed in 
TransitRegionStateProcedure; it's in the same state for hours including across 
master restarts.
{noformat}
2018-12-17 15:09:35,216 WARN  [PEWorker-14] 
assignment.TransitRegionStateProcedure: Failed transition, suspend 604secs 
pid=38015, state=RUNNABLE:REGION_STATE_TRANSITION_CLOSE, hasLock=true; 
TransitRegionStateProcedure table=t1, region=region1, REOPEN/MOVE; rit=CLOSING, 
location=server1,17020,1544636616174; waiting on rectified condition fixed by 
other Procedure or operator intervention
{noformat}
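
To make the two problems described above easier to see, here is a toy Java model of the startup ordering. It is only a sketch under stated assumptions: {{ScpRaceSketch}}, {{loadMetaEntry}} and {{snapshotRegionsToRecover}} are made-up names for illustration, not HBase classes or methods, and the model only mirrors the behavior reported in this issue (the server-to-regions index is filled during meta load, and only for OPEN regions).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the reported startup race; none of these types are HBase classes. */
class ScpRaceSketch {
    enum RegionState { OPEN, CLOSING }

    /** In-memory server -> regions index; it stays empty until meta entries are loaded. */
    private final Map<String, List<String>> regionsByServer = new HashMap<>();

    /** Mirrors the reported behavior: only OPEN regions are indexed under their server. */
    void loadMetaEntry(String region, RegionState state, String server) {
        if (state == RegionState.OPEN) {
            regionsByServer.computeIfAbsent(server, s -> new ArrayList<>()).add(region);
        }
    }

    /** What a crash procedure would get if it copied the region list at this moment. */
    List<String> snapshotRegionsToRecover(String server) {
        return new ArrayList<>(regionsByServer.getOrDefault(server, Collections.emptyList()));
    }

    public static void main(String[] args) {
        String server = "server1,17020,1544636616174";
        ScpRaceSketch model = new ScpRaceSketch();

        // The crash procedure snapshots before meta is loaded: nothing to recover.
        System.out.println("before meta load: " + model.snapshotRegionsToRecover(server));

        // Meta is loaded later, but the region is CLOSING, so it is never indexed.
        model.loadMetaEntry("region1", RegionState.CLOSING, server);
        System.out.println("after meta load:  " + model.snapshotRegionsToRecover(server));
    }
}
```

In this model the snapshot is empty in both cases, which matches the log above where the procedure finishes without touching region1. Either fixing the ordering alone or indexing CLOSING regions alone would not be enough for this region in the toy model.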







[jira] [Updated] (HBASE-21020) Determine WAL API changes for replication

2018-12-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21020:
--
Attachment: HBASE-21020.HBASE-20952.003.patch

> Determine WAL API changes for replication
> -
>
> Key: HBASE-21020
> URL: https://issues.apache.org/jira/browse/HBASE-21020
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Assignee: Ankit Singhal
> Priority: Major
> Fix For: HBASE-20952
>
> Attachments: HBASE-21020.HBASE-20952.001.patch, 
> HBASE-21020.HBASE-20952.002.patch, HBASE-21020.HBASE-20952.003.patch
>
>
> Spin-off of HBASE-20952.
> Ankit has started working on what he thinks a WAL API specifically for 
> Replication should look like. In his own words:
> {quote}
> At a high level, it looks,
>  * Need to abstract WAL name under WalInfo instead of Paths
>  * Abstract the WalEntryStream for FileSystem and Streaming system.
>  * Build WalStorage APIs to abstract operation on Wal.
>  * Provide the implementation of all above through corresponding WalProvider
> {quote}





[jira] [Updated] (HBASE-21020) Determine WAL API changes for replication

2018-12-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21020:
--
Attachment: (was: HBASE-21246.master.003.patch)

> Determine WAL API changes for replication
> -
>
> Key: HBASE-21020
> URL: https://issues.apache.org/jira/browse/HBASE-21020
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Assignee: Ankit Singhal
> Priority: Major
> Fix For: HBASE-20952
>
> Attachments: HBASE-21020.HBASE-20952.001.patch, 
> HBASE-21020.HBASE-20952.002.patch, HBASE-21020.HBASE-20952.003.patch
>
>
> Spin-off of HBASE-20952.
> Ankit has started working on what he thinks a WAL API specifically for 
> Replication should look like. In his own words:
> {quote}
> At a high level, it looks,
>  * Need to abstract WAL name under WalInfo instead of Paths
>  * Abstract the WalEntryStream for FileSystem and Streaming system.
>  * Build WalStorage APIs to abstract operation on Wal.
>  * Provide the implementation of all above through corresponding WalProvider
> {quote}





[jira] [Commented] (HBASE-14939) Document bulk loaded hfile replication

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724483#comment-16724483
 ] 

Hadoop QA commented on HBASE-14939:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 22s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 58s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  4m 54s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 15s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m  1s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-14939 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952244/HBASE-14939.master.001.patch |
| Optional Tests |  dupname  asflicense  refguide  |
| uname | Linux 70732a00d2d1 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / c448604ceb |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| refguide | https://builds.apache.org/job/PreCommit-HBASE-Build/15321/artifact/patchprocess/branch-site/book.html |
| refguide | https://builds.apache.org/job/PreCommit-HBASE-Build/15321/artifact/patchprocess/patch-site/book.html |
| Max. process+thread count | 97 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15321/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Document bulk loaded hfile replication
> --
>
> Key: HBASE-14939
> URL: https://issues.apache.org/jira/browse/HBASE-14939
> Project: HBase
> Issue Type: Task
> Components: documentation
> Reporter: Ashish Singhi
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: HBASE-14939.master.001.patch
>
>
> After HBASE-13153 is committed we need to add that information under the 
> Cluster Replication section in HBase book.





[jira] [Commented] (HBASE-21610) numOpenConnections metric is set to -1 when zero server channel exist

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724480#comment-16724480
 ] 

Hadoop QA commented on HBASE-21610:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 20s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  8s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}140m 11s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}180m 59s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRecoveredEdits |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21610 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952230/HBASE-21610.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux de9be0cc24c4 4.4.0-139-generic #165~14.04.1-Ubuntu SMP Wed Oct 31 10:55:11 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / c448604ceb |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/15317/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | 

[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724478#comment-16724478
 ] 

Hudson commented on HBASE-21592:


Results for branch branch-1.2
[build #593 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/593/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/593//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/593//JDK7_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/593//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
> Issue Type: Bug
> Reporter: xuqinya
> Assignee: xuqinya
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to throw an
> NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more
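
The stack trace above ends in a size calculation over the Get result. As a hedged illustration only, the following Java sketch shows a null-safe size computation. It assumes (this is not stated in the issue) that an existence-only lookup yields a result with no cell list at all; {{NullSafeResultSizeSketch}} and {{resultSize}} are hypothetical names, and the cells are modeled as plain byte arrays rather than HBase types.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of a null-safe result size; not the HBase QuotaUtil implementation. */
class NullSafeResultSizeSketch {

    /**
     * Sums the payload sizes of the given cells.
     * A missing cell list (assumed here to be what an existence-only
     * check produces) is treated as an empty result of size 0.
     */
    static long resultSize(List<byte[]> cells) {
        if (cells == null) {
            return 0L; // nothing was returned to the client, so nothing to account for
        }
        long size = 0L;
        for (byte[] cell : cells) {
            if (cell != null) {
                size += cell.length;
            }
        }
        return size;
    }

    public static void main(String[] args) {
        // Existence-only lookup modeled as a null cell list: no exception is thrown.
        System.out.println(resultSize(null));
        // Ordinary lookup with two cells of 3 and 5 bytes.
        System.out.println(resultSize(Arrays.asList(new byte[3], new byte[5])));
    }
}
```

An alternative with the same effect in this model would be to skip the quota accounting call entirely when the result is null, rather than guarding inside the size calculation.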





[jira] [Updated] (HBASE-14939) Document bulk loaded hfile replication

2018-12-18 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-14939:

Status: Patch Available  (was: Open)

> Document bulk loaded hfile replication
> --
>
> Key: HBASE-14939
> URL: https://issues.apache.org/jira/browse/HBASE-14939
> Project: HBase
> Issue Type: Task
> Components: documentation
> Reporter: Ashish Singhi
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: HBASE-14939.master.001.patch
>
>
> After HBASE-13153 is committed we need to add that information under the 
> Cluster Replication section in HBase book.





[jira] [Updated] (HBASE-14939) Document bulk loaded hfile replication

2018-12-18 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HBASE-14939:

Attachment: HBASE-14939.master.001.patch

> Document bulk loaded hfile replication
> --
>
> Key: HBASE-14939
> URL: https://issues.apache.org/jira/browse/HBASE-14939
> Project: HBase
> Issue Type: Task
> Components: documentation
> Reporter: Ashish Singhi
> Assignee: Wei-Chiu Chuang
> Priority: Major
> Attachments: HBASE-14939.master.001.patch
>
>
> After HBASE-13153 is committed we need to add that information under the 
> Cluster Replication section in HBase book.





[jira] [Commented] (HBASE-14939) Document bulk loaded hfile replication

2018-12-18 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724475#comment-16724475
 ] 

Wei-Chiu Chuang commented on HBASE-14939:
-

h3. 73.5. Bulk Loading Replication
HBASE-13153 adds replication support for bulk loaded HFiles, available since 
HBase 1.3/2.0. This feature is enabled by setting 
{{hbase.replication.bulkload.enabled}} to {{true}} (default is {{false}}). You 
also need to copy the source cluster configuration files to the destination 
cluster.
Additional configurations are required too:
 # {{hbase.replication.source.fs.conf.provider}}
This defines the class which loads the source cluster file system client 
configuration in the destination cluster. This should be configured for all the 
RS in the destination cluster. Default is 
{{org.apache.hadoop.hbase.replication.regionserver.DefaultSourceFSConfigurationProvider}}.

 # {{hbase.replication.conf.dir}}
This represents the base directory where the file system client configurations 
of the source cluster are copied to the destination cluster. This should be 
configured for all the RS in the destination cluster. Default is 
{{$HBASE_CONF_DIR}}.

 # {{hbase.replication.cluster.id}}
This configuration is required in the cluster where replication for bulk loaded 
data is enabled. The destination cluster uses this id to uniquely identify the 
source cluster. This should be configured in the source cluster configuration 
file for all the RS.
For example: If source cluster FS client configurations are copied to the 
destination cluster under directory {{/home/user/dc1/}}, then 
{{hbase.replication.cluster.id}} should be configured as {{dc1}} and 
{{hbase.replication.conf.dir}} as {{/home/user}}.
| |{{DefaultSourceFSConfigurationProvider}} supports only {{xml}} type files. 
It loads the source cluster FS client configuration only once, so if the source 
cluster FS client configuration files are updated, the RS of every peer cluster 
must be restarted to reload the configuration.|

> Document bulk loaded hfile replication
> --
>
> Key: HBASE-14939
> URL: https://issues.apache.org/jira/browse/HBASE-14939
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Ashish Singhi
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HBASE-14939.master.001.patch
>
>
> After HBASE-13153 is committed we need to add that information under the 
> Cluster Replication section in HBase book.





[jira] [Commented] (HBASE-21020) Determine WAL API changes for replication

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724433#comment-16724433
 ] 

Hadoop QA commented on HBASE-21020:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} | {color:red} HBASE-21020 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.8.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-21020 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952239/HBASE-21246.master.003.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/15320/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Determine WAL API changes for replication
> -
>
> Key: HBASE-21020
> URL: https://issues.apache.org/jira/browse/HBASE-21020
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Reporter: Josh Elser
> Assignee: Ankit Singhal
> Priority: Major
> Fix For: HBASE-20952
>
> Attachments: HBASE-21020.HBASE-20952.001.patch, 
> HBASE-21020.HBASE-20952.002.patch, HBASE-21246.master.003.patch
>
>
> Spin-off of HBASE-20952.
> Ankit has started working on what he thinks a WAL API specifically for 
> Replication should look like. In his own words:
> {quote}
> At a high level, it looks,
>  * Need to abstract WAL name under WalInfo instead of Paths
>  * Abstract the WalEntryStream for FileSystem and Streaming system.
>  * Build WalStorage APIs to abstract operation on Wal.
>  * Provide the implementation of all above through corresponding WalProvider
> {quote}





[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724427#comment-16724427
 ] 

Hudson commented on HBASE-21592:


Results for branch branch-1.4
[build #595 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/595/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/595//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/595//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/595//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to throw an NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more
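
The trace above fails inside QuotaUtil.calculateResultSize when the Get only asks whether a row exists. A minimal sketch of a null-tolerant size calculation follows, assuming the existence-only result carries no cell array. The class ResultSizeSketch and the use of List<byte[]> as a stand-in for a Result's cells are hypothetical; this is not the code from the attached patches.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for illustration; not HBase's QuotaUtil. */
class ResultSizeSketch {

  /**
   * Sums the sizes of the given cells. A null list (assumed here to model an
   * existence-only result) counts as zero bytes instead of throwing an NPE.
   */
  static long calculateResultSize(List<byte[]> cells) {
    if (cells == null) {
      return 0L;
    }
    long size = 0L;
    for (byte[] cell : cells) {
      if (cell != null) {
        size += cell.length;
      }
    }
    return size;
  }

  public static void main(String[] args) {
    System.out.println(calculateResultSize(null));
    System.out.println(calculateResultSize(Arrays.asList(new byte[3], new byte[5])));
  }
}
```

Returning zero keeps the quota accounting running for calls that transfer no cell data; whether such calls should be charged differently is a separate design question.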





[jira] [Updated] (HBASE-21020) Determine WAL API changes for replication

2018-12-18 Thread Ankit Singhal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankit Singhal updated HBASE-21020:
--
Attachment: HBASE-21246.master.003.patch

> Determine WAL API changes for replication
> -
>
> Key: HBASE-21020
> URL: https://issues.apache.org/jira/browse/HBASE-21020
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Reporter: Josh Elser
>Assignee: Ankit Singhal
>Priority: Major
> Fix For: HBASE-20952
>
> Attachments: HBASE-21020.HBASE-20952.001.patch, 
> HBASE-21020.HBASE-20952.002.patch, HBASE-21246.master.003.patch
>
>
> Spin-off of HBASE-20952.
> Ankit has started working on what he thinks a WAL API specifically for 
> Replication should look like. In his own words:
> {quote}
> At a high level, it looks,
>  * Need to abstract WAL name under WalInfo instead of Paths
>  * Abstract the WalEntryStream for FileSystem and Streaming system.
>  * Build WalStorage APIs to abstract operation on Wal.
>  * Provide the implementation of all above through corresponding WalProvider
> {quote}





[jira] [Commented] (HBASE-20984) Add/Modify test case to check custom hbase.wal.dir outside hdfs filesystem

2018-12-18 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724420#comment-16724420
 ] 

Sean Busbey commented on HBASE-20984:
-

let me try rerunning the precommit bot. maybe we've hit a hiccup in our flaky 
list.

> Add/Modify test case to check custom hbase.wal.dir outside hdfs filesystem
> --
>
> Key: HBASE-20984
> URL: https://issues.apache.org/jira/browse/HBASE-20984
> Project: HBase
>  Issue Type: Bug
>  Components: test, wal
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Minor
> Attachments: hbase-20984.master.001.patch, 
> hbase-20984.master.002.patch, hbase-20984.master.003.patch
>
>
> The current setup in TestWALFactory tries to create custom WAL directory 
> outside hdfs but ends up creating a custom WAL directory inside hdfs. In 
> TestWALFactory.java:
> {code:java}
> public static void setUpBeforeClass() throws Exception {
>   // A local filesystem WAL is attempted
>   CommonFSUtils.setWALRootDir(TEST_UTIL.getConfiguration(), new Path("file:///tmp/wal"));
>   ...
>   hbaseDir = TEST_UTIL.createRootDir();
>   // But a directory inside hdfs is created here using HBaseTestingUtility#getNewDataTestDirOnTestFS
>   hbaseWALDir = TEST_UTIL.createWALRootDir();
> }
> {code}
> The change was made in HBASE-20723





[jira] [Comment Edited] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-18 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724297#comment-16724297
 ] 

Sergey Shelukhin edited comment on HBASE-21564 at 12/18/18 6:45 PM:


[~Apache9] does this patch make sense to you after the changes? I can commit 
if I have a +1 :)


was (Author: sershe):
[~Apache9] does this patch make sense to you after the changes?

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch, 
> HBASE-21564.master.004.patch, HBASE-21564.master.005.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.
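
One way to picture an atomic acknowledgment is a compare-and-set on the request flag before the roll starts, so that a request arriving during the roll is not wiped out afterwards. The sketch below is a simplified model under that assumption; RollRequestSketch and its methods are hypothetical names and do not mirror LogRoller or the attached patches.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical model of a roll-request flag; not HBase's LogRoller. */
class RollRequestSketch {
  private final AtomicBoolean rollRequested = new AtomicBoolean(false);
  private int rolls = 0;

  /** Called by the WAL side whenever it wants a roll. */
  void requestRoll() {
    rollRequested.set(true);
  }

  /**
   * Called by the roller loop. The flag is cleared atomically before the work
   * starts, so a request made while the roll is running stays pending instead
   * of being reset in a finally block.
   */
  boolean rollIfRequested(Runnable duringRoll) {
    if (!rollRequested.compareAndSet(true, false)) {
      return false;
    }
    rolls++;
    duringRoll.run(); // stands in for replacing the writer, archiving logs, etc.
    return true;
  }

  boolean pending() {
    return rollRequested.get();
  }

  int rolls() {
    return rolls;
  }
}
```

With this shape, a request issued while the first roll is in progress is still visible on the next loop iteration, which is the property the description asks for.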





[jira] [Commented] (HBASE-21612) Add developer debug options in HBase Config for REST server

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724360#comment-16724360
 ] 

Hadoop QA commented on HBASE-21612:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue}  0m  
1s{color} | {color:blue} Shelldocs was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
 1s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  0m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21612 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12952233/HBASE-21612.patch |
| Optional Tests |  dupname  asflicense  shellcheck  shelldocs  |
| uname | Linux 147c17f96ced 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / c448604ceb |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| shellcheck | v0.4.4 |
| Max. process+thread count | 47 (vs. ulimit of 1) |
| modules | C: . U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/15318/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Add developer debug options in  HBase Config for REST server
> 
>
> Key: HBASE-21612
> URL: https://issues.apache.org/jira/browse/HBASE-21612
> Project: HBase
>  Issue Type: Wish
>  Components: REST
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21612.patch
>
>
> Add developer debug options in  HBase Config for REST server.
> Currently we have,
> {noformat}
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
> {noformat}





[jira] [Updated] (HBASE-21612) Add developer debug options in HBase Config for REST server

2018-12-18 Thread Pankaj Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-21612:
-
Fix Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Add developer debug options in  HBase Config for REST server
> 
>
> Key: HBASE-21612
> URL: https://issues.apache.org/jira/browse/HBASE-21612
> Project: HBase
>  Issue Type: Wish
>  Components: REST
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21612.patch
>
>
> Add developer debug options in  HBase Config for REST server.
> Currently we have,
> {noformat}
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
> {noformat}





[jira] [Updated] (HBASE-21612) Add developer debug options in HBase Config for REST server

2018-12-18 Thread Pankaj Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-21612:
-
Attachment: HBASE-21612.patch

> Add developer debug options in  HBase Config for REST server
> 
>
> Key: HBASE-21612
> URL: https://issues.apache.org/jira/browse/HBASE-21612
> Project: HBase
>  Issue Type: Wish
>  Components: REST
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Attachments: HBASE-21612.patch
>
>
> Add developer debug options in  HBase Config for REST server.
> Currently we have,
> {noformat}
> # export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"
> # export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"
> # export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"
> # export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug 
> -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"
> {noformat}





[jira] [Commented] (HBASE-21595) Print thread's information and stack traces when RS is aborting forcibly

2018-12-18 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724341#comment-16724341
 ] 

Pankaj Kumar commented on HBASE-21595:
--

Test case failure is not relevant, passing locally.

> Print thread's information and stack traces when RS is aborting forcibly
> 
>
> Key: HBASE-21595
> URL: https://issues.apache.org/jira/browse/HBASE-21595
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver
>Affects Versions: 3.0.0, 2.2.0
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21595.patch
>
>
> After HBASE-21325, the RS terminates forcibly on abort timeout.
> We should print the thread info before terminating; it will be useful for analyzing 
> the RS abort timeout problem.
>  





[jira] [Commented] (HBASE-21535) Zombie Master detector is not working

2018-12-18 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724329#comment-16724329
 ] 

Pankaj Kumar commented on HBASE-21535:
--

TestLockManager is passing locally.

> Zombie Master detector is not working
> -
>
> Key: HBASE-21535
> URL: https://issues.apache.org/jira/browse/HBASE-21535
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Critical
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21535.branch-2.patch, HBASE-21535.branch-2.patch, 
> HBASE-21535.patch, HBASE-21535.v2.patch
>
>
> HMaster has an InitializationMonitor thread which detects a zombie HMaster 
> based on _hbase.master.initializationmonitor.timeout_ and halts it if 
> _hbase.master.initializationmonitor.haltontimeout_ is set to _true_.
> After HBASE-19694, the HMaster initialization order was corrected. HMaster is set 
> active after initializing the ZK system trackers, as follows:
> {noformat}
>  status.setStatus("Initializing ZK system trackers");
>  initializeZKBasedSystemTrackers();
>  status.setStatus("Loading last flushed sequence id of regions");
>  try {
>  this.serverManager.loadLastFlushedSequenceIds();
>  } catch (IOException e) {
>  LOG.debug("Failed to load last flushed sequence id of regions"
>  + " from file system", e);
>  }
>  // Set ourselves as active Master now our claim has succeeded up in zk.
>  this.activeMaster = true;
> {noformat}
> But the zombie detector thread is started at the beginning of 
> finishActiveMasterInitialization(),
> {noformat}
>  private void finishActiveMasterInitialization(MonitoredTask status)
>      throws IOException, InterruptedException, KeeperException, ReplicationException {
>    Thread zombieDetector = new Thread(new InitializationMonitor(this),
>        "ActiveMasterInitializationMonitor-" + System.currentTimeMillis());
>    zombieDetector.setDaemon(true);
>    zombieDetector.start();
> {noformat}
> While zombieDetector runs, "master.isActiveMaster()" is still false, so the loop 
> is never entered: the thread does not wait and can't detect a zombie master.
> {noformat}
>  @Override
>  public void run() {
>    try {
>      while (!master.isStopped() && master.isActiveMaster()) {
>        Thread.sleep(timeout);
>        if (master.isInitialized()) {
>          LOG.debug("Initialization completed within allotted tolerance. Monitor exiting.");
>        } else {
>          LOG.error("Master failed to complete initialization after " + timeout + "ms. Please"
>              + " consider submitting a bug report including a thread dump of this process.");
>          if (haltOnTimeout) {
>            LOG.error("Zombie Master exiting. Thread dump to stdout");
>            Threads.printThreadInfo(System.out, "Zombie HMaster");
>            System.exit(-1);
>          }
>        }
>      }
>    } catch (InterruptedException ie) {
>      LOG.trace("InitMonitor thread interrupted. Existing.");
>    }
>  }
> }
> {noformat}
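
The effect of the entry condition can be modelled in a few lines. The sketch below is hypothetical (InitMonitorSketch, detectsZombie and the gateOnActive switch are invented for illustration) and only shows why a monitor gated on the active flag never reaches its timeout check when it starts before the flag is set; it is not necessarily the direction the attached patches take.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical model of the monitor's entry condition; not HBase's InitializationMonitor. */
class InitMonitorSketch {

  /**
   * Returns true when the timeout elapses without initialization completing,
   * i.e. when a zombie would be reported. With gateOnActive set, the check is
   * skipped entirely unless the master is already marked active.
   */
  static boolean detectsZombie(BooleanSupplier stopped, BooleanSupplier active,
      BooleanSupplier initialized, boolean gateOnActive, long timeoutMs) {
    boolean enter = !stopped.getAsBoolean() && (!gateOnActive || active.getAsBoolean());
    if (!enter) {
      return false; // monitor exits immediately without waiting
    }
    try {
      Thread.sleep(timeoutMs);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !initialized.getAsBoolean();
  }

  public static void main(String[] args) {
    // Master not yet active and never initialized: the gated monitor misses it.
    System.out.println(detectsZombie(() -> false, () -> false, () -> false, true, 10));
    // Without the gate, the same situation is reported after the timeout.
    System.out.println(detectsZombie(() -> false, () -> false, () -> false, false, 10));
  }
}
```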





[jira] [Commented] (HBASE-21610) numOpenConnections metric is set to -1 when zero server channel exist

2018-12-18 Thread Pankaj Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724326#comment-16724326
 ] 

Pankaj Kumar commented on HBASE-21610:
--

Thanks [~brfrn169] for reviewing the patch. Failures are not relevant, these 
are passing locally.
Reattached same patch for QA run.

> numOpenConnections metric is set to -1 when zero server channel exist
> -
>
> Key: HBASE-21610
> URL: https://issues.apache.org/jira/browse/HBASE-21610
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21610.patch, HBASE-21610.patch
>
>
> In NettyRpcServer, the numOpenConnections metric is set to -1 when no server 
> channel exists.
> {code}
> @Override
>  public int getNumOpenConnections() {
>  // allChannels also contains the server channel, so exclude that from the count.
>  return allChannels.size() - 1;
>  }
> {code}
>  
>  We should not decrease the channel count by 1 when no server channel exists.
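
A minimal sketch of the guarded count follows, assuming the only requirement is that the metric never drops below zero. OpenConnectionsSketch is a hypothetical stand-alone class that takes the channel-group size as a plain int; it is not the code from the attached patch.

```java
/** Hypothetical stand-alone model; not HBase's NettyRpcServer. */
class OpenConnectionsSketch {

  /**
   * The channel group normally includes the server channel, so one is
   * subtracted, but an empty group reports 0 rather than -1.
   */
  static int numOpenConnections(int allChannelsSize) {
    return Math.max(0, allChannelsSize - 1);
  }

  public static void main(String[] args) {
    System.out.println(numOpenConnections(0));
    System.out.println(numOpenConnections(3));
  }
}
```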





[jira] [Updated] (HBASE-21610) numOpenConnections metric is set to -1 when zero server channel exist

2018-12-18 Thread Pankaj Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pankaj Kumar updated HBASE-21610:
-
Attachment: HBASE-21610.patch

> numOpenConnections metric is set to -1 when zero server channel exist
> -
>
> Key: HBASE-21610
> URL: https://issues.apache.org/jira/browse/HBASE-21610
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 2.1.1, 2.0.3
>Reporter: Pankaj Kumar
>Assignee: Pankaj Kumar
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-21610.patch, HBASE-21610.patch
>
>
> In NettyRpcServer, the numOpenConnections metric is set to -1 when no server 
> channel exists.
> {code}
> @Override
>  public int getNumOpenConnections() {
>  // allChannels also contains the server channel, so exclude that from the count.
>  return allChannels.size() - 1;
>  }
> {code}
>  
>  We should not decrease the channel count by 1 when no server channel exists.





[jira] [Commented] (HBASE-21563) HBase Get Encounters java.lang.IndexOutOfBoundsException

2018-12-18 Thread William Shen (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724306#comment-16724306
 ] 

William Shen commented on HBASE-21563:
--

Thanks for looking into this [~openinx]. Please let me know if I can help in 
any way.

> HBase Get Encounters java.lang.IndexOutOfBoundsException
> 
>
> Key: HBASE-21563
> URL: https://issues.apache.org/jira/browse/HBASE-21563
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 1.2.0
>Reporter: William Shen
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 67a04bc049be4f58afecdcc0a3ba62ca.tar.gz
>
>
> We've recently encountered an issue retrieving data from our HBase cluster, and 
> have not had much luck troubleshooting it. We narrowed the issue down 
> to a single GET, which appears to be caused by FastDiffDeltaEncoder.java 
> running into java.lang.IndexOutOfBoundsException. 
> Perhaps there is a bug in a corner case of FastDiffDeltaEncoder? 
> We are running 1.2.0-cdh5.9.2, and the GET in question is:
> {noformat}
> hbase(main):004:0> get 'qa2.ADGROUPS', 
> "\x05\x80\x00\x00\x00\x00\x1F\x54\x9C\x80\x00\x00\x00\x00\x1C\x7D\x45\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00\x4A\x64\x6F\x80\x00\x00\x00\x01\xD9\xDB\xCE"
> COLUMN  CELL
> ERROR: java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
> Caused by: java.lang.IndexOutOfBoundsException
> at java.nio.Buffer.checkBounds(Buffer.java:567)
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:149)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:465)
> at 
> org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:516)
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:618)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.next(HFileReaderV2.java:1277)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:180)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:588)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5706)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5865)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5643)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5620)
> at 
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5606)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6801)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6779)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2029)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> ... 3 more {noformat}
> Likewise, when running {{ hbase hfile -f -p }} on the specific hfile, a subset of 
> kv pairs was printed before the program hit the following exception and 
> crashed:
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: Unknown code 65
> at org.apache.hadoop.hbase.KeyValue$Type.codeToType(KeyValue.java:259)
> at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:1246)
> at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$ClonedSeekerState.toString(BufferedDataBlockEncoder.java:506)
> at java.lang.String.valueOf(String.java:2994)
> at java.lang.StringBuilder.append(StringBuilder.java:131)
> at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:382)
> at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:316)
> at 
> org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:255)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> 

[jira] [Commented] (HBASE-21564) race condition in WAL rolling resulting in size-based rolling getting stuck

2018-12-18 Thread Sergey Shelukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724297#comment-16724297
 ] 

Sergey Shelukhin commented on HBASE-21564:
--

[~Apache9] does this patch make sense to you after the changes?

> race condition in WAL rolling resulting in size-based rolling getting stuck
> ---
>
> Key: HBASE-21564
> URL: https://issues.apache.org/jira/browse/HBASE-21564
> Project: HBase
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>Priority: Major
> Attachments: HBASE-21564.master.001.patch, 
> HBASE-21564.master.002.patch, HBASE-21564.master.003.patch, 
> HBASE-21564.master.004.patch, HBASE-21564.master.005.patch
>
>
> Manifests at least with AsyncFsWriter.
> There's a window after LogRoller replaces the writer in the WAL, but before 
> it sets the rollLog boolean to false in the finally, where the WAL class can 
> request another log roll (it can happen in particular when the logs are 
> getting archived in the LogRoller thread, and there's high write volume 
> causing the logs to roll quickly).
> LogRoller will blindly reset the rollLog flag in finally and "forget" about 
> this request.
> AsyncWAL in turn never requests it again because its own rollRequested field 
> is set and it expects a callback. Logs don't get rolled until a periodic roll 
> is triggered after that.
> The acknowledgment of roll requests by LogRoller should be atomic.





[jira] [Commented] (HBASE-21401) Sanity check when constructing the KeyValue

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724291#comment-16724291
 ] 

stack commented on HBASE-21401:
---

Pardon me [~openinx] I missed this. So, I looked at the patch and I still see 
double-parse, no? (Once to check byte array contains a wholesome KV and then 
the usual parse that happens as part of KV usage?).  Was thinking we could 
check wholesomeness inline with use? (Pardon me if I'm not following along the 
conversation -- I've switched out the context here).

> Sanity check when constructing the KeyValue
> ---
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch, 
> HBASE-21401.v3.patch, HBASE-21401.v4.patch, HBASE-21401.v4.patch, 
> HBASE-21401.v5.patch, HBASE-21401.v6.patch, HBASE-21401.v7.patch
>
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to 
> initialize the Cell without a sanity check (i.e. without checking whether each 
> field's offset exceeds the byte buffer), so an ArrayIndexOutOfBoundsException may 
> occur when reading the cell's fields, as in HBASE-21379. This kind of bug is hard 
> to debug. 
> An earlier check will help to find such bugs.
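
The kind of up-front check described above can be sketched as follows. It assumes the basic serialized KeyValue layout without tags: a 4-byte key length, a 4-byte value length, then the key (2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type) and the value. KeyValueSanitySketch and isWellFormed are hypothetical names, and the checks in the attached patches may differ.

```java
import java.nio.ByteBuffer;

/** Hypothetical bounds check for a serialized KeyValue; not HBase's own code. */
class KeyValueSanitySketch {
  // Fixed key overhead: row length (2) + family length (1) + timestamp (8) + type (1).
  private static final int MIN_KEY_LENGTH = 12;

  /** Returns true only if every length field stays inside buf[offset, offset + length). */
  static boolean isWellFormed(byte[] buf, int offset, int length) {
    if (offset < 0 || length < 8 || (long) offset + length > buf.length) {
      return false;
    }
    ByteBuffer bb = ByteBuffer.wrap(buf, offset, length);
    int keyLen = bb.getInt();
    int valLen = bb.getInt();
    if (keyLen < MIN_KEY_LENGTH || valLen < 0 || 8L + keyLen + valLen > length) {
      return false;
    }
    int rowLen = bb.getShort();
    if (rowLen < 0 || (long) rowLen + MIN_KEY_LENGTH > keyLen) {
      return false;
    }
    bb.position(bb.position() + rowLen); // skip the row bytes
    int famLen = bb.get() & 0xff;
    // Whatever remains in the key after row and family is the qualifier.
    return (long) rowLen + famLen + MIN_KEY_LENGTH <= keyLen;
  }
}
```

Failing here, at construction time, points at the corrupt buffer directly instead of surfacing later as an out-of-bounds read in an unrelated field accessor.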





[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724282#comment-16724282
 ] 

Hudson commented on HBASE-21592:


Results for branch branch-2.0
[build #1176 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1176/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1176//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1176//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/1176//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to throw an NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more





[jira] [Commented] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724236#comment-16724236
 ] 

stack commented on HBASE-21565:
---

+1 for branch-2.0 when it passes tests. This is a nice bug fix. Thanks 
[~tianjingyun]

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.master.001.patch, HBASE-21565.master.002.patch, 
> HBASE-21565.master.003.patch, HBASE-21565.master.004.patch, 
> HBASE-21565.master.005.patch, HBASE-21565.master.006.patch, 
> HBASE-21565.master.007.patch, HBASE-21565.master.008.patch, 
> HBASE-21565.master.009.patch, HBASE-21565.master.010.patch
>
>
> Two kinds of SCP for the same server can be scheduled during cluster 
> restart: one is triggered by the ZK session timeout, the other when a new server 
> reports in, which causes the stale one to fail over. The only barrier between these 
> 2 kinds of SCP is the check whether the server is in the dead server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> The problem is that when the master finishes initialization, it deletes all 
> stale servers from the dead server list. Thus, when the SCP for the ZK 
> session timeout comes in, the barrier has already been removed.
> The following logs show how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to regions being assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list 
> before the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
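The ordering described above can be replayed with a small model. This is only a sketch under stated assumptions: the class DeadServerBarrierSketch, its methods, and the use of a plain set are hypothetical stand-ins for the master's dead server tracking, not HBase code. It shows that removing the entry between the two expiration calls lets a second procedure through, while keeping the entry blocks it.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the dead-server barrier; not the HBase implementation. */
final class DeadServerBarrierSketch {
    private final Set<String> deadServers = new HashSet<>();
    private int scheduledProcedures = 0;

    /** Mirrors the quoted check: refuse to schedule if the server is already listed as dead. */
    synchronized boolean expireServer(String serverName) {
        if (deadServers.contains(serverName)) {
            return false; // crash processing already in progress
        }
        deadServers.add(serverName);
        scheduledProcedures++; // stands in for submitting a ServerCrashProcedure
        return true;
    }

    /** Models the cleanup of stale servers when master initialization finishes. */
    synchronized void removeFromDeadList(String serverName) {
        deadServers.remove(serverName);
    }

    synchronized int scheduledProcedures() {
        return scheduledProcedures;
    }

    /** Replays the reported ordering and returns how many procedures were scheduled. */
    static int replay(boolean cleanupBeforeSecondExpire) {
        DeadServerBarrierSketch master = new DeadServerBarrierSketch();
        String server = "c4-hadoop-tst-st27.bj,29100,1544153846859";
        master.expireServer(server);           // first trigger: new server reports in
        if (cleanupBeforeSecondExpire) {
            master.removeFromDeadList(server); // master finishes initialization
        }
        master.expireServer(server);           // second trigger: ZK session timeout
        return master.scheduledProcedures();
    }

    public static void main(String[] args) {
        System.out.println("early cleanup:  " + replay(true) + " procedure(s)");
        System.out.println("entry retained: " + replay(false) + " procedure(s)");
    }
}
```

In this model the barrier only works while the entry stays in the set, which matches the conclusion that the dead server should not be deleted from the list immediately.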





[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724227#comment-16724227
 ] 

stack commented on HBASE-21514:
---

Too late for next 2.0.4 and 2.1.2 RCs. This is a very nice patch though. There 
is risk with a patch of this size.  Should we just put it in branch-2.1 and not 
branch-2.1? You fellows need it in 2.0? Thanks [~zghaobac]

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and the mob file cache as member variables of 
> HRegionServer. Each RegionServer has one block cache and one mob file cache.
>  # Move the global cache instances from CacheConfig to BlockCacheFactory and 
> keep only configuration in CacheConfig. CacheConfig still holds a reference 
> to the RegionServer's block cache. A block is cached only if the block cache 
> is present and the related config is true.
>  # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance, so it is no longer needed once the mob file cache moves to the 
> RegionServer.
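The second item's rule (cache a block only when a block cache is present and the related config is true) can be sketched as follows. This is a minimal illustration, not the patch: CacheConfigSketch, BlockCacheLike, and shouldCacheData are hypothetical names chosen for the example, and the real CacheConfig carries many more settings.

```java
/** Hypothetical sketch of a config object that references, but does not own, a cache. */
final class CacheConfigSketch {
    /** Hypothetical placeholder for the RegionServer's block cache. */
    interface BlockCacheLike { }

    private final boolean cacheDataOnRead;   // the "related config"
    private final BlockCacheLike blockCache; // null when the server has no block cache

    CacheConfigSketch(boolean cacheDataOnRead, BlockCacheLike blockCache) {
        this.cacheDataOnRead = cacheDataOnRead;
        this.blockCache = blockCache;
    }

    /** A block is cached only if a cache exists and the config enables caching. */
    boolean shouldCacheData() {
        return blockCache != null && cacheDataOnRead;
    }

    public static void main(String[] args) {
        BlockCacheLike cache = new BlockCacheLike() { };
        System.out.println(new CacheConfigSketch(true, cache).shouldCacheData());
        System.out.println(new CacheConfigSketch(true, null).shouldCacheData());
        System.out.println(new CacheConfigSketch(false, cache).shouldCacheData());
    }
}
```

Passing the cache in through the constructor, rather than reading a global instance, is what lets each RegionServer own exactly one cache in this sketch.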





[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724172#comment-16724172
 ] 

Hadoop QA commented on HBASE-21225:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
40s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  8m 
18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
38s{color} | {color:red} hbase-client: The patch generated 4 new + 17 unchanged 
- 0 fixed = 21 total (was 17) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
26s{color} | {color:red} hbase-server: The patch generated 6 new + 2 unchanged 
- 1 fixed = 8 total (was 3) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
44s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  0s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
2m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
42s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
28s{color} | {color:green} hbase-protocol in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
44s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}260m 36s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}341m 37s{color} | 
{color:black} {color} |
\\
\\
|| 

[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724170#comment-16724170
 ] 

Hadoop QA commented on HBASE-21225:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
49s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
31s{color} | {color:red} hbase-client: The patch generated 1 new + 17 unchanged 
- 0 fixed = 18 total (was 17) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
3s{color} | {color:red} hbase-server: The patch generated 6 new + 3 unchanged - 
0 fixed = 9 total (was 3) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
46s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 29s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
22s{color} | {color:green} hbase-protocol in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
21s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}266m 26s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
34s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}328m 18s{color} | 
{color:black} {color} |
\\
\\
|| 

[jira] [Commented] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724126#comment-16724126
 ] 

Hadoop QA commented on HBASE-21565:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  4m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
24s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
29s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
12s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
13s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 11s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}273m  1s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}323m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestFromClientSide3 |
|   | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.master.procedure.TestServerCrashProcedureWithReplicas |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21565 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12952158/HBASE-21565.branch-2.001.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 1e74a0f0bd11 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | branch-2 / 99de534cc4 |
| maven | version: Apache Maven 3.5.4 
(1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| unit | 

[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread ChenKai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724122#comment-16724122
 ] 

ChenKai commented on HBASE-21594:
-

I just tried both 0.98 and branch-1.2, and my result differs from yours. Here 
is my screenshot:

!image-2018-12-18-22-40-20-930.png|width=280,height=100!

> Requested block is out of range when reading hfile
> --
>
> Key: HBASE-21594
> URL: https://issues.apache.org/jira/browse/HBASE-21594
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.98.10
>Reporter: ChenKai
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 2e4db6a7f8d14f2bb846781b07750dc1_SeqId_84_.bak, 
> image-2018-12-13-20-11-00-818.png, image-2018-12-18-22-40-20-930.png
>
>
> My HFiles are generated by Spark HBaseBulkLoad. When I read a few of 
> them (or when HBase runs a compaction), I encounter the following exception.
>  
> {code:java}
> Exception in thread "main" java.io.IOException: Requested block is out of 
> range: 77329641, lastDataBlockOffset: 77329641, 
> trailer.getLoadOnOpenDataOffset: 77329641
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:396)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:734)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.isNextBlock(HFileReaderV2.java:859)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.positionForNextBlock(HFileReaderV2.java:854)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2._next(HFileReaderV2.java:871)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:891)
> at io.patamon.hbase.test.read.TestHFileRead.main(TestHFileRead.java:49)
> {code}
> It looks like `lastDataBlockOffset` is equal to 
> `trailer.getLoadOnOpenDataOffset`. Could anyone help me? Thanks very much.
>  
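For readers following the exception message, here is a rough sketch of the kind of range check that would produce it. It assumes the reader rejects any requested offset at or beyond the load-on-open offset; that condition is an assumption made for illustration, and BlockRangeCheckSketch and its methods are hypothetical names, not a copy of HFileReaderV2.readBlock.

```java
import java.io.IOException;

/** Hypothetical sketch of a data-block offset range check; not HBase code. */
final class BlockRangeCheckSketch {
    /** True when the offset falls outside [0, loadOnOpenDataOffset). */
    static boolean isOutOfRange(long dataBlockOffset, long loadOnOpenDataOffset) {
        return dataBlockOffset < 0 || dataBlockOffset >= loadOnOpenDataOffset;
    }

    /** Throws with a message shaped like the one in the report when the check fails. */
    static void checkOffset(long dataBlockOffset, long lastDataBlockOffset,
                            long loadOnOpenDataOffset) throws IOException {
        if (isOutOfRange(dataBlockOffset, loadOnOpenDataOffset)) {
            throw new IOException("Requested block is out of range: " + dataBlockOffset
                + ", lastDataBlockOffset: " + lastDataBlockOffset
                + ", trailer.getLoadOnOpenDataOffset: " + loadOnOpenDataOffset);
        }
    }

    public static void main(String[] args) {
        long offset = 77329641L; // the value reported for all three fields
        try {
            checkOffset(offset, offset, offset);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, a file whose last data block offset equals the load-on-open offset would fail the check as soon as the scanner requests that block, which is consistent with the three identical values in the message.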





[jira] [Updated] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread ChenKai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChenKai updated HBASE-21594:

Attachment: image-2018-12-18-22-40-20-930.png

> Requested block is out of range when reading hfile
> --
>
> Key: HBASE-21594
> URL: https://issues.apache.org/jira/browse/HBASE-21594
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.98.10
>Reporter: ChenKai
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 2e4db6a7f8d14f2bb846781b07750dc1_SeqId_84_.bak, 
> image-2018-12-13-20-11-00-818.png, image-2018-12-18-22-40-20-930.png
>
>
> My HFiles are generated by Spark HBaseBulkLoad. When I read a few of 
> them (or when HBase runs a compaction), I encounter the following exception.
>  
> {code:java}
> Exception in thread "main" java.io.IOException: Requested block is out of 
> range: 77329641, lastDataBlockOffset: 77329641, 
> trailer.getLoadOnOpenDataOffset: 77329641
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:396)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:734)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.isNextBlock(HFileReaderV2.java:859)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.positionForNextBlock(HFileReaderV2.java:854)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2._next(HFileReaderV2.java:871)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:891)
> at io.patamon.hbase.test.read.TestHFileRead.main(TestHFileRead.java:49)
> {code}
> It looks like `lastDataBlockOffset` is equal to 
> `trailer.getLoadOnOpenDataOffset`. Could anyone help me? Thanks very much.
>  





[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724114#comment-16724114
 ] 

Hadoop QA commented on HBASE-21514:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  4m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
12s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
20s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m 26s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}284m 41s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}341m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
|   | hadoop.hbase.client.TestSnapshotTemporaryDirectoryWithRegionReplicas |
|   | hadoop.hbase.client.TestConnectionImplementation |
|   | hadoop.hbase.replication.TestReplicationSyncUpTool |
|   | hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
|   | hadoop.hbase.regionserver.TestRecoveredEdits |
|   | hadoop.hbase.client.TestFromClientSide |
|   | hadoop.hbase.client.TestFromClientSideWithCoprocessor |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21514 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12952154/HBASE-21514.master.addendum.patch
 |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  
shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 531c24b35f5a 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HBASE-21588) Procedure v2 wal splitting implementation

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724087#comment-16724087
 ] 

Hadoop QA commented on HBASE-21588:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 49s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 12s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  4m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 19s{color} | {color:red} hbase-server: The patch generated 19 new + 294 unchanged - 0 fixed = 313 total (was 294) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  7s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 34s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 30s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 23s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}141m 44s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}206m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestRecoveredEdits |
|   | hadoop.hbase.master.TestMasterWalManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | 

[jira] [Commented] (HBASE-21512) Introduce an AsyncClusterConnection and replace the usage of ClusterConnection

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724086#comment-16724086
 ] 

Hudson commented on HBASE-21512:


Results for branch HBASE-21512
[build #21 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/21/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/21//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/21//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-21512/21//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}


> Introduce an AsyncClusterConnection and replace the usage of ClusterConnection
> --
>
> Key: HBASE-21512
> URL: https://issues.apache.org/jira/browse/HBASE-21512
> Project: HBase
>  Issue Type: Umbrella
>Reporter: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>
> At least for the RSProcedureDispatcher, with CompletableFuture we do not need 
> to set a delay and use a thread pool any more, which could reduce the 
> resource usage and also the latency.
> Once this is done, I think we can remove the ClusterConnection completely, 
> and start to rewrite the old sync client based on the async client, which 
> could reduce the code base a lot for our client.
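
The point about CompletableFuture can be illustrated with a minimal sketch. It is not HBase code: `AsyncDispatchSketch`, `sendRemoteCall` and `dispatch` are hypothetical names, and the "remote call" completes immediately. It only shows how a continuation attached to the future replaces a delay-and-poll loop on a thread pool.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch, not HBase code: it only illustrates chaining a
// continuation on a CompletableFuture instead of polling after a delay.
final class AsyncDispatchSketch {

  // Stand-in for an asynchronous RPC; here it completes immediately.
  static CompletableFuture<String> sendRemoteCall(String serverName) {
    return CompletableFuture.completedFuture("ack from " + serverName);
  }

  // The continuation runs when the response arrives, so the caller needs
  // neither a timer nor a dedicated thread waiting on the result.
  static CompletableFuture<String> dispatch(String serverName) {
    return sendRemoteCall(serverName)
        .thenApply(response -> "dispatched: " + response)
        .exceptionally(error -> "failed: " + error.getMessage());
  }

  public static void main(String[] args) {
    System.out.println(dispatch("rs1.example.org").join());
  }
}
```

In a real dispatcher the future would be completed by the RPC layer when the response arrives; the chaining shown here would stay the same.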





[jira] [Commented] (HBASE-21563) HBase Get Encounters java.lang.IndexOutOfBoundsException

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724069#comment-16724069
 ] 

Zheng Hu commented on HBASE-21563:
--

I tried parsing the HFile and, yes, it threw an IndexOutOfBoundsException. Let 
me dig into it.

> HBase Get Encounters java.lang.IndexOutOfBoundsException
> 
>
> Key: HBASE-21563
> URL: https://issues.apache.org/jira/browse/HBASE-21563
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 1.2.0
>Reporter: William Shen
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 67a04bc049be4f58afecdcc0a3ba62ca.tar.gz
>
>
> We've recently encountered an issue retrieving data from our HBase cluster, and 
> have not had much luck troubleshooting it. We narrowed the problem down 
> to a single GET, which appears to fail because FastDiffDeltaEncoder.java 
> runs into a java.lang.IndexOutOfBoundsException. 
> Perhaps there is a bug in a corner case of FastDiffDeltaEncoder? 
> We are running 1.2.0-cdh5.9.2, and the GET in question is:
> {noformat}
> hbase(main):004:0> get 'qa2.ADGROUPS', 
> "\x05\x80\x00\x00\x00\x00\x1F\x54\x9C\x80\x00\x00\x00\x00\x1C\x7D\x45\x00\x04\x80\x00\x00\x00\x00\x1D\x0F\x19\x80\x00\x00\x00\x00\x4A\x64\x6F\x80\x00\x00\x00\x01\xD9\xDB\xCE"
> COLUMN  CELL
>
> ERROR: java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2215)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:109)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:185)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:165)
> Caused by: java.lang.IndexOutOfBoundsException
> at java.nio.Buffer.checkBounds(Buffer.java:567)
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:149)
> at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decode(FastDiffDeltaEncoder.java:465)
> at org.apache.hadoop.hbase.io.encoding.FastDiffDeltaEncoder$1.decodeNext(FastDiffDeltaEncoder.java:516)
> at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.next(BufferedDataBlockEncoder.java:618)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.next(HFileReaderV2.java:1277)
> at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:180)
> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:108)
> at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:588)
> at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:147)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.populateResult(HRegion.java:5706)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:5865)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:5643)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5620)
> at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:5606)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6801)
> at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:6779)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2029)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33644)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2170)
> ... 3 more {noformat}
> Likewise, when we ran {{ hbase hfile -f -p }} on the specific HFile, a subset of 
> KV pairs was printed before the program hit the following exception and 
> crashed:
> {noformat}
> Exception in thread "main" java.lang.RuntimeException: Unknown code 65
> at org.apache.hadoop.hbase.KeyValue$Type.codeToType(KeyValue.java:259)
> at org.apache.hadoop.hbase.KeyValue.keyToString(KeyValue.java:1246)
> at org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$ClonedSeekerState.toString(BufferedDataBlockEncoder.java:506)
> at java.lang.String.valueOf(String.java:2994)
> at java.lang.StringBuilder.append(StringBuilder.java:131)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.scanKeysValues(HFilePrettyPrinter.java:382)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.processFile(HFilePrettyPrinter.java:316)
> at org.apache.hadoop.hbase.io.hfile.HFilePrettyPrinter.run(HFilePrettyPrinter.java:255)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at 
> 

[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724068#comment-16724068
 ] 

Hadoop QA commented on HBASE-21514:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 37 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 17s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 55s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 25s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 32s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 12s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 46s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 46s{color} | {color:red} hbase-server generated 5 new + 183 unchanged - 5 fixed = 188 total (was 188) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 19s{color} | {color:green} hbase-server: The patch generated 0 new + 1045 unchanged - 61 fixed = 1045 total (was 1106) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 16s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m  6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}245m 27s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}288m 26s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21514 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12952153/HBASE-21514.branch-2.003.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 6958436eb88f 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / ee214cbe40 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/15311/artifact/patchprocess/diff-compile-javac-hbase-server.txt |
| unit | 

[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724051#comment-16724051
 ] 

Zheng Hu commented on HBASE-21594:
--

I tried both 0.98 and branch-1 and got the same exception as above.

> Requested block is out of range when reading hfile
> --
>
> Key: HBASE-21594
> URL: https://issues.apache.org/jira/browse/HBASE-21594
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Affects Versions: 0.98.10
>Reporter: ChenKai
>Assignee: Zheng Hu
>Priority: Major
> Attachments: 2e4db6a7f8d14f2bb846781b07750dc1_SeqId_84_.bak, 
> image-2018-12-13-20-11-00-818.png
>
>
> My HFiles are generated by Spark HBaseBulkLoad. When I read a few of 
> them (or when HBase runs a compaction), I encounter the following exception.
>  
> {code:java}
> Exception in thread "main" java.io.IOException: Requested block is out of range: 77329641, lastDataBlockOffset: 77329641, trailer.getLoadOnOpenDataOffset: 77329641
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:396)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.readNextDataBlock(HFileReaderV2.java:734)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.isNextBlock(HFileReaderV2.java:859)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.positionForNextBlock(HFileReaderV2.java:854)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2._next(HFileReaderV2.java:871)
> at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.next(HFileReaderV2.java:891)
> at io.patamon.hbase.test.read.TestHFileRead.main(TestHFileRead.java:49)
> {code}
> It looks like `lastDataBlockOffset` is equal to 
> `trailer.getLoadOnOpenDataOffset`. Could anyone help me? Thanks very much.
>  
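
The observation that all three offsets are equal can be made concrete with a small sketch. It is not the HBase source: `BlockRangeCheckSketch` and `checkDataBlockOffset` are hypothetical names, and the rule that a data block offset is valid only when `0 <= offset < loadOnOpenDataOffset` is an assumption inferred from the exception message. Under that assumption, an offset equal to the load-on-open offset is rejected.

```java
import java.io.IOException;

// Hypothetical sketch, not the HBase source: it assumes the reader accepts a
// data block offset only when 0 <= offset < loadOnOpenDataOffset.
final class BlockRangeCheckSketch {

  static void checkDataBlockOffset(long requested, long lastDataBlockOffset,
      long loadOnOpenDataOffset) throws IOException {
    if (requested < 0 || requested >= loadOnOpenDataOffset) {
      throw new IOException("Requested block is out of range: " + requested
          + ", lastDataBlockOffset: " + lastDataBlockOffset
          + ", trailer.getLoadOnOpenDataOffset: " + loadOnOpenDataOffset);
    }
  }

  public static void main(String[] args) {
    long offset = 77329641L; // value taken from the report above
    try {
      checkDataBlockOffset(offset, offset, offset);
      System.out.println("offset accepted");
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

If the assumption holds, a trailer whose last data block offset equals the load-on-open offset describes a final data block with no room before the load-on-open section, which would point at the file writer rather than the reader.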





[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread ChenKai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724028#comment-16724028
 ] 

ChenKai commented on HBASE-21594:
-

Strange. What's your HBase version? Mine is 0.98.x or 1.2.x.



[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread ChenKai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724006#comment-16724006
 ] 

ChenKai commented on HBASE-21594:
-

What puzzles me is that this happens only very rarely in our production 
environment. I tested my bulk load code locally beforehand, and it worked fine.



[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16724008#comment-16724008
 ] 

Zheng Hu commented on HBASE-21594:
--

I ran the test using your code and HFile, and it doesn't seem to be the same problem:
{code}
Exception in thread "main" org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile Trailer from file file:/home/openinx/Downloads/2e4db6a7f8d14f2bb846781b07750dc1_SeqId_84_.bak
at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:528)
at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:576)
at org.apache.hadoop.hbase.io.TestHFile.main(TestHFile.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Caused by: java.lang.IllegalArgumentException: Invalid HFile version: 5593857 (expected to be between 2 and 3)
at org.apache.hadoop.hbase.io.hfile.HFile.checkFormatVersion(HFile.java:931)
at org.apache.hadoop.hbase.io.hfile.FixedFileTrailer.readFromStream(FixedFileTrailer.java:402)
at org.apache.hadoop.hbase.io.hfile.HFile.openReader(HFile.java:513)
... 7 more
{code}
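
For readers unfamiliar with the "Invalid HFile version" failure, here is a minimal sketch of a trailer version check. It is not the HBase implementation: `TrailerVersionSketch` and its methods are hypothetical, and the packing of the major version into the low three bytes and the minor version into the top byte of the trailing integer is an assumption. Only the accepted range of 2 to 3 and the message text come from the trace above.

```java
// Hypothetical sketch, not the HBase source. Assumed layout of the trailing
// 4-byte version field: major version in the low three bytes, minor version
// in the top byte.
final class TrailerVersionSketch {
  static final int MIN_FORMAT_VERSION = 2; // range taken from the trace above
  static final int MAX_FORMAT_VERSION = 3;

  static int pack(int major, int minor) {
    return (major & 0x00ffffff) | (minor << 24);
  }

  static int majorVersion(int serialized) {
    return serialized & 0x00ffffff;
  }

  static int minorVersion(int serialized) {
    return serialized >>> 24;
  }

  // Rejects any major version outside the supported range.
  static void checkFormatVersion(int version) {
    if (version < MIN_FORMAT_VERSION || version > MAX_FORMAT_VERSION) {
      throw new IllegalArgumentException("Invalid HFile version: " + version
          + " (expected to be between " + MIN_FORMAT_VERSION + " and "
          + MAX_FORMAT_VERSION + ")");
    }
  }

  public static void main(String[] args) {
    checkFormatVersion(majorVersion(pack(3, 0))); // a well-formed trailer passes
    try {
      checkFormatVersion(majorVersion(5593857)); // value from the trace above
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

A value such as 5593857 is far outside the accepted range, which suggests that the bytes read as the trailer were not an HFile trailer at all, for example because the attached file differs from the one that produced the original exception.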



[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723995#comment-16723995
 ] 

Zheng Hu commented on HBASE-21594:
--

Hi, I'm working on this now. Could you share the bulk load code that generated the 
mis-encoded HFile?



[jira] [Commented] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread ChenKai (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723986#comment-16723986
 ] 

ChenKai commented on HBASE-21594:
-

Do you have any suggestions for how I could patch this in our environment? Thanks.



[jira] [Assigned] (HBASE-21594) Requested block is out of range when reading hfile

2018-12-18 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu reassigned HBASE-21594:


Assignee: Zheng Hu



[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723962#comment-16723962
 ] 

Anoop Sam John commented on HBASE-21514:


Thanks for the nice cleanup/improvement. Something similar is required at the MSLAB 
chunk and pool level too. I will be doing that soon.
cc [~ram_krish]

> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and the mob file cache as member variables of HRegionServer. Each 
> RS has one block cache and one mob file cache.
>  # Move the global cache instances from CacheConfig to BlockCacheFactory. 
> Only keep configuration in CacheConfig. CacheConfig still holds a 
> reference to the RegionServer's block cache. A block is cached only if the 
> block cache is present and the related config is true.
>  # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance. After the mob file cache moves to the RegionServer, it is no longer 
> used.
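
The rule in the second item (cache a block only when a block cache is present and the related config is true) can be sketched as follows. This is an illustration, not the patch: `CacheConfigSketch`, its nested `BlockCache` interface and `shouldCacheBlock` are hypothetical names.

```java
import java.util.Optional;

// Hypothetical sketch, not the HBASE-21514 patch: a config object that keeps
// only settings plus an optional reference to a cache owned by the server.
final class CacheConfigSketch {

  // Stand-in for the region server's block cache.
  interface BlockCache {}

  private final boolean cacheDataOnRead;
  private final BlockCache blockCache; // null when no cache is configured

  CacheConfigSketch(boolean cacheDataOnRead, BlockCache blockCache) {
    this.cacheDataOnRead = cacheDataOnRead;
    this.blockCache = blockCache;
  }

  Optional<BlockCache> getBlockCache() {
    return Optional.ofNullable(blockCache);
  }

  // Both conditions must hold: a cache exists and the setting enables caching.
  boolean shouldCacheBlock() {
    return blockCache != null && cacheDataOnRead;
  }

  public static void main(String[] args) {
    BlockCache serverCache = new BlockCache() {};
    System.out.println(new CacheConfigSketch(true, serverCache).shouldCacheBlock());
    System.out.println(new CacheConfigSketch(true, null).shouldCacheBlock());
  }
}
```

Keeping the cache instance outside the config object means the server owns its lifecycle, while the config only answers whether caching applies.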





[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723959#comment-16723959
 ] 

Hudson commented on HBASE-21592:


SUCCESS: Integrated in Jenkins build HBase-1.2-IT #1193 (See [https://builds.apache.org/job/HBase-1.2-IT/1193/])
HBASE-21592 quota.addGetResult(r) throw NPE (openinx: rev cdc40767eb3d486d350d857d449e6ac847e2bdb8)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestQuotaThrottle.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java


> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> After setting an RPC quota, table.exists(Get) causes quota.addGetResult(r) to throw an 
> NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more
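
The trace above shows a null reaching the result-size calculation on the Get path. As a minimal sketch of one defensive pattern (not the committed patch), the size calculation can treat a missing result as empty. The class and method names below are hypothetical stand-ins that only mirror the names in the trace; this is not the HBase API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a quota accounting object; NOT the HBase API. */
final class QuotaSizeSketch {
  private long readSize;

  /**
   * Sums the cell sizes of a result. A null result (assumed here to be what an
   * existence-only Get produces) is treated as empty instead of dereferenced.
   */
  static long calculateResultSize(List<byte[]> cells) {
    if (cells == null) {
      return 0L;
    }
    long size = 0L;
    for (byte[] cell : cells) {
      if (cell != null) {
        size += cell.length;
      }
    }
    return size;
  }

  /** Accounts a Get result against the quota; safe to call with null. */
  void addGetResult(List<byte[]> cells) {
    readSize += calculateResultSize(cells);
  }

  long getReadSize() {
    return readSize;
  }

  public static void main(String[] args) {
    QuotaSizeSketch quota = new QuotaSizeSketch();
    quota.addGetResult(null); // existence-only call: nothing to account
    quota.addGetResult(Arrays.asList(new byte[3], new byte[5]));
    System.out.println("read size = " + quota.getReadSize());
  }
}
```

An equivalent guard could sit at the call site instead (skip the accounting call when there is no result); which place is better depends on how many callers share the size calculation.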





[jira] [Updated] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Zheng Hu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated HBASE-21592:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to 
> throw an NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more





[jira] [Commented] (HBASE-21592) quota.addGetResult(r) throw NPE

2018-12-18 Thread Zheng Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723940#comment-16723940
 ] 

Zheng Hu commented on HBASE-21592:
--

Pushed to branch-1/branch-1.2/branch-1.4/branch-2/branch-2.1/branch-2.2/master. 
Thanks [~xu qinya] for the contribution. 

> quota.addGetResult(r)  throw  NPE
> -
>
> Key: HBASE-21592
> URL: https://issues.apache.org/jira/browse/HBASE-21592
> Project: HBase
>  Issue Type: Bug
>Reporter: xuqinya
>Assignee: xuqinya
>Priority: Major
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.2.10, 1.4.10, 2.1.3, 2.0.5
>
> Attachments: HBASE-21592.branch-1.0001.patch, 
> HBASE-21592.branch-2.0001.patch, HBASE-21592.master.0001.patch, 
> HBASE-21592.master.0002.patch, HBASE-21592.master.0003.patch, 
> HBASE-21592.master.0004.patch
>
>
> With an RPC quota set, table.exists(Get) causes quota.addGetResult(r) to 
> throw an NPE.
> {code:java}
> set_quota TYPE => THROTTLE, NAMESPACE => 'ns1', LIMIT => '1000req/sec'
> {code}
> {code:java}
> Connection conn = ConnectionFactory.createConnection(config);
> Table htable = conn.getTable(TableName.valueOf("ns1:t1"));
> boolean exists = htable.exists(new Get(Bytes.toBytes("123"))); {code}
> log:
> java.io.IOException: java.io.IOException
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2183)
>  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
>  at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.lang.NullPointerException
>  at 
> org.apache.hadoop.hbase.quotas.QuotaUtil.calculateResultSize(QuotaUtil.java:282)
>  at 
> org.apache.hadoop.hbase.quotas.DefaultOperationQuota.addGetResult(DefaultOperationQuota.java:99)
>  at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:1907)
>  at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32381)
>  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2135)
>  ... 4 more





[jira] [Commented] (HBASE-21514) Refactor CacheConfig

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723947#comment-16723947
 ] 

Hudson commented on HBASE-21514:


Results for branch master
[build #669 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/669/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Refactor CacheConfig
> 
>
> Key: HBASE-21514
> URL: https://issues.apache.org/jira/browse/HBASE-21514
> Project: HBase
>  Issue Type: Improvement
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21514.branch-2.001.patch, 
> HBASE-21514.branch-2.002.patch, HBASE-21514.branch-2.003.patch, 
> HBASE-21514.master.001.patch, HBASE-21514.master.002.patch, 
> HBASE-21514.master.003.patch, HBASE-21514.master.004.patch, 
> HBASE-21514.master.005.patch, HBASE-21514.master.006.patch, 
> HBASE-21514.master.007.patch, HBASE-21514.master.008.patch, 
> HBASE-21514.master.009.patch, HBASE-21514.master.010.patch, 
> HBASE-21514.master.011.patch, HBASE-21514.master.011.patch, 
> HBASE-21514.master.012.patch, HBASE-21514.master.013.patch, 
> HBASE-21514.master.013.patch, HBASE-21514.master.014.patch, 
> HBASE-21514.master.addendum.patch
>
>
> # Add the block cache and the mob file cache as member variables of 
> HRegionServer. Each RS has one block cache and one mob file cache.
>  # Move the global cache instances from CacheConfig to BlockCacheFactory. 
> Only keep config stuff in CacheConfig. CacheConfig still holds a reference to 
> the RegionServer's block cache. A block is cached only if the block cache is 
> present and the related config is true.
>  # Remove MobCacheConfig. It was only used for the global mob file cache 
> instance. After the mob file cache moves to the RegionServer, it is no longer 
> used.
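
The caching rule in point 2 of the description can be sketched as follows. This is an illustration under stated assumptions, not the refactored class: `CacheConfigSketch`, its nested `BlockCache` marker interface, and the `cacheDataOnRead` flag are hypothetical names chosen for the example.

```java
import java.util.Optional;

/** Hypothetical config holder; it keeps settings plus a reference to a cache it does not own. */
final class CacheConfigSketch {
  /** Marker for whatever block cache the region server owns (assumption for this sketch). */
  interface BlockCache {}

  private final Optional<BlockCache> blockCache;
  private final boolean cacheDataOnRead;

  CacheConfigSketch(BlockCache blockCache, boolean cacheDataOnRead) {
    // The cache may be absent, e.g. when block caching is disabled for the server.
    this.blockCache = Optional.ofNullable(blockCache);
    this.cacheDataOnRead = cacheDataOnRead;
  }

  /** A block is cached only if a cache is present AND the related config is true. */
  boolean shouldCacheBlock() {
    return blockCache.isPresent() && cacheDataOnRead;
  }

  public static void main(String[] args) {
    BlockCache cache = new BlockCache() {};
    System.out.println(new CacheConfigSketch(cache, true).shouldCacheBlock());
    System.out.println(new CacheConfigSketch(null, true).shouldCacheBlock());
  }
}
```

Keeping the presence check inside the config object means callers never need to null-check the cache themselves before asking whether to cache.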





[jira] [Commented] (HBASE-21589) TestCleanupMetaWAL fails

2018-12-18 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723945#comment-16723945
 ] 

Hudson commented on HBASE-21589:


Results for branch master
[build #669 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/669/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/669//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> TestCleanupMetaWAL fails
> 
>
> Key: HBASE-21589
> URL: https://issues.apache.org/jira/browse/HBASE-21589
> Project: HBase
>  Issue Type: Bug
>  Components: test, wal
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.1.2, 2.0.4
>
> Attachments: HBASE-21589.branch-2.0.001.patch, 
> org.apache.hadoop.hbase.regionserver.TestCleanupMetaWAL-output.txt
>
>
> This test fails near all-the-time. Sunk two RCs. Fix. Made it a blocker.





[jira] [Updated] (HBASE-21588) Procedure v2 wal splitting implementation

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21588:
-
Attachment: HBASE-21588.master.003.patch

> Procedure v2 wal splitting implementation
> -
>
> Key: HBASE-21588
> URL: https://issues.apache.org/jira/browse/HBASE-21588
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21588.master.003.patch
>
>
> create a sub task to submit the implementation of procedure v2 wal splitting





[jira] [Updated] (HBASE-21588) Procedure v2 wal splitting implementation

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21588:
-
Attachment: (was: HBASE-21588.master.001.patch)

> Procedure v2 wal splitting implementation
> -
>
> Key: HBASE-21588
> URL: https://issues.apache.org/jira/browse/HBASE-21588
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
>
> create a sub task to submit the implementation of procedure v2 wal splitting





[jira] [Updated] (HBASE-21588) Procedure v2 wal splitting implementation

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21588:
-
Attachment: (was: HBASE-21588.master.002.patch)

> Procedure v2 wal splitting implementation
> -
>
> Key: HBASE-21588
> URL: https://issues.apache.org/jira/browse/HBASE-21588
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Major
> Attachments: HBASE-21588.master.001.patch
>
>
> create a sub task to submit the implementation of procedure v2 wal splitting





[jira] [Updated] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-12-18 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sakthi updated HBASE-21225:
---
Attachment: hbase-21225.master.003.patch

> Having RPC & Space quota on a table/Namespace doesn't allow space quota to be 
> removed using 'NONE'
> --
>
> Key: HBASE-21225
> URL: https://issues.apache.org/jira/browse/HBASE-21225
> Project: HBase
>  Issue Type: Bug
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-21225.master.001.patch, 
> hbase-21225.master.002.patch, hbase-21225.master.003.patch
>
>
> A part of HBASE-20705 is still unresolved. In that Jira it was assumed that 
> the problem is: when a table having both rpc & space quotas is dropped (with 
> hbase.quota.remove.on.table.delete set to true), the rpc quota is not set to 
> be dropped along with table-drops, and the space quota could not be removed 
> completely because of the "EMPTY" row that the rpc quota left even after 
> removal. 
> The proposed solution for that was to make sure that the rpc quota didn't 
> leave empty rows after removal of the quota, and to set up automatic removal 
> of the rpc quota with table drops. That made sure that space quotas can be 
> recreated/removed.
> But all this was under the assumption that hbase.quota.remove.on.table.delete 
> is set to true. When it is set to false, the same issue can be reproduced. 
> Also, the steps shown below can be used to reproduce the issue without 
> table-drops.
> {noformat}
> hbase(main):005:0> create 't2','cf'
> Created table t2
> Took 0.7619 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.0514 seconds
> hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0162 seconds
> hbase(main):008:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, 
> LIMIT => 10M/sec, SCOPE =>
>MACHINE
>  TABLE => t2   TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, 
> VIOLATION_POLICY => NO_WRIT
>ES
> 2 row(s)
> Took 0.0716 seconds
> hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE
> Took 0.0082 seconds
> hbase(main):010:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0254 seconds
> hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0082 seconds
> hbase(main):012:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0411 seconds
> {noformat}





[jira] [Commented] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-12-18 Thread Sakthi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16723892#comment-16723892
 ] 

Sakthi commented on HBASE-21225:


I have uploaded a patch that removes the "remove" attribute from the SpaceQuota 
protobuf. Waiting for the unit test reports from 'Hadoop QA'.

> Having RPC & Space quota on a table/Namespace doesn't allow space quota to be 
> removed using 'NONE'
> --
>
> Key: HBASE-21225
> URL: https://issues.apache.org/jira/browse/HBASE-21225
> Project: HBase
>  Issue Type: Bug
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-21225.master.001.patch, 
> hbase-21225.master.002.patch, hbase-21225.master.003.patch
>
>
> A part of HBASE-20705 is still unresolved. In that Jira it was assumed that 
> the problem is: when a table having both rpc & space quotas is dropped (with 
> hbase.quota.remove.on.table.delete set to true), the rpc quota is not set to 
> be dropped along with table-drops, and the space quota could not be removed 
> completely because of the "EMPTY" row that the rpc quota left even after 
> removal. 
> The proposed solution for that was to make sure that the rpc quota didn't 
> leave empty rows after removal of the quota, and to set up automatic removal 
> of the rpc quota with table drops. That made sure that space quotas can be 
> recreated/removed.
> But all this was under the assumption that hbase.quota.remove.on.table.delete 
> is set to true. When it is set to false, the same issue can be reproduced. 
> Also, the steps shown below can be used to reproduce the issue without 
> table-drops.
> {noformat}
> hbase(main):005:0> create 't2','cf'
> Created table t2
> Took 0.7619 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.0514 seconds
> hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0162 seconds
> hbase(main):008:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, 
> LIMIT => 10M/sec, SCOPE =>
>MACHINE
>  TABLE => t2   TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, 
> VIOLATION_POLICY => NO_WRIT
>ES
> 2 row(s)
> Took 0.0716 seconds
> hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE
> Took 0.0082 seconds
> hbase(main):010:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0254 seconds
> hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0082 seconds
> hbase(main):012:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0411 seconds
> {noformat}





[jira] [Updated] (HBASE-21225) Having RPC & Space quota on a table/Namespace doesn't allow space quota to be removed using 'NONE'

2018-12-18 Thread Sakthi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sakthi updated HBASE-21225:
---
Attachment: hbase-21225.master.002.patch

> Having RPC & Space quota on a table/Namespace doesn't allow space quota to be 
> removed using 'NONE'
> --
>
> Key: HBASE-21225
> URL: https://issues.apache.org/jira/browse/HBASE-21225
> Project: HBase
>  Issue Type: Bug
>Reporter: Sakthi
>Assignee: Sakthi
>Priority: Major
> Attachments: hbase-21225.master.001.patch, 
> hbase-21225.master.002.patch
>
>
> A part of HBASE-20705 is still unresolved. In that Jira it was assumed that 
> the problem is: when a table having both rpc & space quotas is dropped (with 
> hbase.quota.remove.on.table.delete set to true), the rpc quota is not set to 
> be dropped along with table-drops, and the space quota could not be removed 
> completely because of the "EMPTY" row that the rpc quota left even after 
> removal. 
> The proposed solution for that was to make sure that the rpc quota didn't 
> leave empty rows after removal of the quota, and to set up automatic removal 
> of the rpc quota with table drops. That made sure that space quotas can be 
> recreated/removed.
> But all this was under the assumption that hbase.quota.remove.on.table.delete 
> is set to true. When it is set to false, the same issue can be reproduced. 
> Also, the steps shown below can be used to reproduce the issue without 
> table-drops.
> {noformat}
> hbase(main):005:0> create 't2','cf'
> Created table t2
> Took 0.7619 seconds
> => Hbase::Table - t2
> hbase(main):006:0> set_quota TYPE => THROTTLE, TABLE => 't2', LIMIT => 
> '10M/sec'
> Took 0.0514 seconds
> hbase(main):007:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0162 seconds
> hbase(main):008:0> list_quotas
> OWNER  QUOTAS
>  TABLE => t2   TYPE => THROTTLE, THROTTLE_TYPE => REQUEST_SIZE, 
> LIMIT => 10M/sec, SCOPE =>
>MACHINE
>  TABLE => t2   TYPE => SPACE, TABLE => t2, LIMIT => 1073741824, 
> VIOLATION_POLICY => NO_WRIT
>ES
> 2 row(s)
> Took 0.0716 seconds
> hbase(main):009:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => NONE
> Took 0.0082 seconds
> hbase(main):010:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0254 seconds
> hbase(main):011:0> set_quota TYPE => SPACE, TABLE => 't2', LIMIT => '1G', 
> POLICY => NO_WRITES
> Took 0.0082 seconds
> hbase(main):012:0> list_quotas
> OWNER   QUOTAS
>  TABLE => t2TYPE => THROTTLE, THROTTLE_TYPE => 
> REQUEST_SIZE, LIMIT => 10M/sec, SCOPE => MACHINE
>  TABLE => t2TYPE => SPACE, TABLE => t2, REMOVE => true
> 2 row(s)
> Took 0.0411 seconds
> {noformat}





[jira] [Updated] (HBASE-21572) The "progress" object in "Compactor" is not thread-safe, this may cause the misleading progress information on the web UI.

2018-12-18 Thread lixiaobao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lixiaobao updated HBASE-21572:
--
Description: 
When the compaction thread count is set to more than 1, multiple threads on the 
region server may use the "compactor" of the same "store" to execute 
compactions. However, the "progress" object in "Compactor" is not thread-safe, 
which may cause misleading progress information on the web UI.

The problem is:
 # If the memstore flushes frequently, there may be two or more compaction 
requests on one store. However, one "store" has one "compactor" and one 
"compactor" has one "progress". When two threads execute compactions on one 
store, the code below is a problem: "progress" will be overwritten by the 
latest thread.
{code:java}
this.progress = new CompactionProgress(fd.maxKeyCount);{code}

 # The code below may also cause a thread-safety problem when two or more 
threads execute compactions on one store.
{code:java}
 ++progress.currentCompactedKVs;
 progress.totalCompactedSize += len;{code}

solutions:
 # I create a list of "CompactionProgress" in the "compactor". Every thread 
executing a compaction adds its progress to the list and removes it from the 
list when the compaction completes. 

  was:
When the compaction thread count is set to more than 1, multiple threads on the 
region server may use the "compactor" of the same "store" to execute 
compactions. However, the "progress" object in "Compactor" is not thread-safe, 
which may cause misleading progress information on the web UI.

The problem is:
 # If the memstore flushes frequently, there may be two or more compaction 
requests on one store. However, one "store" has one "compactor" and one 
"compactor" has one "progress". When two threads execute compactions on one 
store, the code below is a problem: "progress" will be overwritten by the 
latest thread.
{code:java}
this.progress = new CompactionProgress(fd.maxKeyCount);{code}

 # The code below may also cause a thread-safety problem when two or more 
threads execute compactions on one store.
{code:java}
 ++progress.currentCompactedKVs;
 progress.totalCompactedSize += len;{code}

solutions:
 # I create a list of "CompactionProgress" in the compactor of one store. Every 
thread executing a compaction adds its progress to the list and removes it 
from the list when the compaction completes.

 


>  The "progress" object in "Compactor" is not thread-safe, this may cause the 
> misleading progress information on the web UI.
> ---
>
> Key: HBASE-21572
> URL: https://issues.apache.org/jira/browse/HBASE-21572
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, UI
>Affects Versions: 1.2.0, 3.0.0, 1.3.0, 1.4.0, 2.1.0, 2.0.0
>Reporter: lixiaobao
>Assignee: lixiaobao
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21572.patch
>
>
> When the compaction thread count is set to more than 1, multiple threads on 
> the region server may use the "compactor" of the same "store" to execute 
> compactions. However, the "progress" object in "Compactor" is not 
> thread-safe, which may cause misleading progress information on the web UI.
> The problem is:
>  # If the memstore flushes frequently, there may be two or more compaction 
> requests on one store. However, one "store" has one "compactor" and one 
> "compactor" has one "progress". When two threads execute compactions on one 
> store, the code below is a problem: "progress" will be overwritten by the 
> latest thread.
> {code:java}
> this.progress = new CompactionProgress(fd.maxKeyCount);{code}
>  # The code below may also cause a thread-safety problem when two or more 
> threads execute compactions on one store.
> {code:java}
>  ++progress.currentCompactedKVs;
>  progress.totalCompactedSize += len;{code}
> solutions:
>  # I create a list of "CompactionProgress" in the "compactor". Every thread 
> executing a compaction adds its progress to the list and removes it from the 
> list when the compaction completes. 
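
The proposed approach (one progress object per running compaction, tracked in a shared collection) could look roughly like the sketch below. It is an illustration only and not the attached patch: `CompactionProgressSketch` and its nested `Progress` class are hypothetical, and the use of a concurrent set with `AtomicLong` counters is an assumption about how to make the shared state safe to read from a UI thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical compactor-side registry of in-flight compaction progress. */
final class CompactionProgressSketch {

  /** Counters for ONE compaction; atomic so a UI thread can read while a worker writes. */
  static final class Progress {
    final long totalKVs;
    final AtomicLong compactedKVs = new AtomicLong();
    final AtomicLong compactedSize = new AtomicLong();

    Progress(long totalKVs) {
      this.totalKVs = totalKVs;
    }

    /** Records one compacted key-value of the given length. */
    void record(long len) {
      compactedKVs.incrementAndGet();
      compactedSize.addAndGet(len);
    }
  }

  // Thread-safe set; Progress uses identity equality, so each compaction is distinct.
  private final Set<Progress> active = ConcurrentHashMap.newKeySet();

  /** Each compaction thread gets its own Progress instead of overwriting a shared field. */
  Progress start(long totalKVs) {
    Progress p = new Progress(totalKVs);
    active.add(p);
    return p;
  }

  /** Removes the progress entry once its compaction completes. */
  void finish(Progress p) {
    active.remove(p);
  }

  /** Aggregate view over all in-flight compactions, e.g. for display. */
  long totalCompactedKVs() {
    long sum = 0L;
    for (Progress p : active) {
      sum += p.compactedKVs.get();
    }
    return sum;
  }

  int activeCount() {
    return active.size();
  }

  public static void main(String[] args) throws InterruptedException {
    CompactionProgressSketch compactor = new CompactionProgressSketch();
    Runnable compaction = () -> {
      Progress p = compactor.start(1000);
      try {
        for (int i = 0; i < 1000; i++) {
          p.record(64);
        }
      } finally {
        compactor.finish(p);
      }
    };
    Thread t1 = new Thread(compaction);
    Thread t2 = new Thread(compaction);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("active compactions after join: " + compactor.activeCount());
  }
}
```

Because no field is reassigned per compaction, a second compaction on the same store cannot clobber the first one's counters; the trade-off is that readers must aggregate over the set rather than read a single object.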





[jira] [Updated] (HBASE-21572) The "progress" object in "Compactor" is not thread-safe, this may cause the misleading progress information on the web UI.

2018-12-18 Thread lixiaobao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lixiaobao updated HBASE-21572:
--
Component/s: UI

>  The "progress" object in "Compactor" is not thread-safe, this may cause the 
> misleading progress information on the web UI.
> ---
>
> Key: HBASE-21572
> URL: https://issues.apache.org/jira/browse/HBASE-21572
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction, UI
>Affects Versions: 1.2.0, 3.0.0, 1.3.0, 1.4.0, 2.1.0, 2.0.0
>Reporter: lixiaobao
>Assignee: lixiaobao
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: HBASE-21572.patch
>
>
> When the compaction thread count is set to more than 1, multiple threads on 
> the region server may use the "compactor" of the same "store" to execute 
> compactions. However, the "progress" object in "Compactor" is not 
> thread-safe, which may cause misleading progress information on the web UI.
> The problem is:
>  # If the memstore flushes frequently, there may be two or more compaction 
> requests on one store. However, one "store" has one "compactor" and one 
> "compactor" has one "progress". When two threads execute compactions on one 
> store, the code below is a problem: "progress" will be overwritten by the 
> latest thread.
> {code:java}
> this.progress = new CompactionProgress(fd.maxKeyCount);{code}
>  # The code below may also cause a thread-safety problem when two or more 
> threads execute compactions on one store.
> {code:java}
>  ++progress.currentCompactedKVs;
>  progress.totalCompactedSize += len;{code}
> solutions:
>  # I create a list of "CompactionProgress" in the compactor of one store. 
> Every thread executing a compaction adds its progress to the list and removes 
> it from the list when the compaction completes.
>  





[jira] [Updated] (HBASE-21565) Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server

2018-12-18 Thread Jingyun Tian (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-21565:
-
Attachment: HBASE-21565.branch-2.001.patch

> Delete dead server from dead server list too early leads to concurrent Server 
> Crash Procedures(SCP) for a same server
> -
>
> Key: HBASE-21565
> URL: https://issues.apache.org/jira/browse/HBASE-21565
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jingyun Tian
>Assignee: Jingyun Tian
>Priority: Critical
> Attachments: HBASE-21565.branch-2.001.patch, 
> HBASE-21565.master.001.patch, HBASE-21565.master.002.patch, 
> HBASE-21565.master.003.patch, HBASE-21565.master.004.patch, 
> HBASE-21565.master.005.patch, HBASE-21565.master.006.patch, 
> HBASE-21565.master.007.patch, HBASE-21565.master.008.patch, 
> HBASE-21565.master.009.patch, HBASE-21565.master.010.patch
>
>
> Two kinds of SCP can be scheduled for the same server during a cluster 
> restart: one triggered by the ZK session timeout, the other triggered when a 
> new server reports in and causes the stale one to fail over. The only barrier 
> between these 2 kinds of SCP is a check of whether the server is in the dead 
> server list.
> {code}
> if (this.deadservers.isDeadServer(serverName)) {
>   LOG.warn("Expiration called on {} but crash processing already in progress", serverName);
>   return false;
> }
> {code}
> But the problem is that when the master finishes initialization, it deletes 
> all stale servers from the dead server list. Thus when the SCP for the ZK 
> session timeout comes in, the barrier is already removed.
> Here are the logs showing how this problem occurs.
> {code}
> 2018-12-07,11:42:37,589 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=9, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> 2018-12-07,11:42:58,007 INFO 
> org.apache.hadoop.hbase.master.procedure.ServerCrashProcedure: Start pid=444, 
> state=RUNNABLE:SERVER_CRASH_START, hasLock=true; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false
> {code}
> Now we can see that two SCPs are scheduled for the same server.
> But the first procedure finishes only after the second SCP starts.
> {code}
> 2018-12-07,11:43:08,038 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=9, 
> state=SUCCESS, hasLock=false; ServerCrashProcedure 
> server=c4-hadoop-tst-st27.bj,29100,1544153846859, splitWal=true, meta=false 
> in 30.5340sec
> {code}
> This leads to the problem that regions are assigned twice.
> {code}
> 2018-12-07,12:16:33,039 WARN 
> org.apache.hadoop.hbase.master.assignment.AssignmentManager: rit=OPEN, 
> location=c4-hadoop-tst-st28.bj,29100,1544154149607, table=test_failover, 
> region=459b3130b40caf3b8f3e1421766f4089 reported OPEN on 
> server=c4-hadoop-tst-st29.bj,29100,1544154149615 but state has otherwise
> {code}
> And here we can see that the server is removed from the dead server list 
> before the second SCP starts.
> {code}
> 2018-12-07,11:42:44,938 DEBUG org.apache.hadoop.hbase.master.DeadServer: 
> Removed c4-hadoop-tst-st27.bj,29100,1544153846859 ; numProcessing=3
> {code}
> Thus we should not delete a dead server from the dead server list immediately.
> A patch to fix this problem will be uploaded later.
>  
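
The idea of not removing a server from the dead server list while its crash processing is still running can be sketched as below. This is a simplified model under stated assumptions, not the patch attached to this issue: `DeadServerBarrierSketch` and all of its methods are hypothetical, and servers are represented by plain strings.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the dead-server barrier; NOT the HBase DeadServer class. */
final class DeadServerBarrierSketch {
  private final Set<String> deadServers = new HashSet<>();
  private final Set<String> processing = new HashSet<>();

  /**
   * Returns true if the caller should schedule a crash procedure, or false if
   * the server is already on the dead list (the barrier from the description).
   */
  synchronized boolean expireServer(String serverName) {
    if (deadServers.contains(serverName)) {
      return false;
    }
    deadServers.add(serverName);
    processing.add(serverName);
    return true;
  }

  /** Called only when the crash procedure for this server has completed. */
  synchronized void finishCrashProcessing(String serverName) {
    processing.remove(serverName);
  }

  /**
   * Cleanup after master initialization: drops stale entries, but keeps every
   * server whose crash processing is still in progress so the barrier holds.
   */
  synchronized void cleanStaleServers() {
    deadServers.removeIf(server -> !processing.contains(server));
  }

  public static void main(String[] args) {
    DeadServerBarrierSketch master = new DeadServerBarrierSketch();
    String rs = "rs1,29100,1544153846859"; // made-up server name for the demo
    System.out.println("first expire schedules SCP: " + master.expireServer(rs));
    master.cleanStaleServers(); // master finishes initialization mid-SCP
    System.out.println("second expire schedules SCP: " + master.expireServer(rs));
  }
}
```

With this ordering, a second expiration that arrives after master initialization but before the first procedure finishes is still rejected, which is the case the logs above show going wrong.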




