[jira] [Updated] (HBASE-19024) provide a configurable option to hsync WAL edits to the disk for better durability

2017-10-31 Thread Harshal Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshal Jain updated HBASE-19024:
-
Attachment: master.v5.patch

> provide a configurable option to hsync WAL edits to the disk for better 
> durability
> --
>
> Key: HBASE-19024
> URL: https://issues.apache.org/jira/browse/HBASE-19024
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
> Environment: 
>Reporter: Vikas Vishwakarma
>Assignee: Harshal Jain
>Priority: Major
> Attachments: branch-1.branch-1.patch, branch-1.v1.branch-1.patch, 
> master.patch, master.v2.patch, master.v3.patch, master.v5.patch, 
> master.v5.patch
>
>
> At present we do not have an option to hsync WAL edits to the disk for better 
> durability. In our local tests we see a 10-15% latency impact from using hsync 
> instead of hflush, which is not very high.
> We should have a configurable option to hsync WAL edits instead of just 
> sync/hflush, which will call the corresponding API on the Hadoop side. 
> Currently HBase handles both SYNC_WAL and FSYNC_WAL the same way, calling 
> FSDataOutputStream sync/hflush on the Hadoop side. This can be modified to 
> let FSYNC_WAL call hsync on the Hadoop side instead of sync/hflush. We can 
> keep sync as the default, preserving the current behavior, and hsync can be 
> enabled by explicit configuration.
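A minimal sketch of the proposed switch. The config key ({{hbase.wal.hsync}} here) is a hypothetical name for illustration; hsync() and hflush() are the real FSDataOutputStream/Syncable methods on the Hadoop side, everything else is illustrative:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;

public final class WalSyncSketch {
  // Hypothetical config key, for illustration only.
  private static final String WAL_HSYNC_KEY = "hbase.wal.hsync";

  /**
   * Syncs WAL edits: hsync forces them to disk on the DataNodes, while
   * hflush (the current behavior for both SYNC_WAL and FSYNC_WAL) only
   * guarantees visibility to readers, not on-disk durability.
   */
  static void sync(FSDataOutputStream out, Configuration conf) throws IOException {
    if (conf.getBoolean(WAL_HSYNC_KEY, false)) {
      out.hsync();   // durable across machine/power failure
    } else {
      out.hflush();  // pushed to DataNode memory only
    }
  }
}
{code}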



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19138) Rare failure in TestLruBlockCache

2017-10-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233660#comment-16233660
 ] 

Hudson commented on HBASE-19138:


SUCCESS: Integrated in Jenkins build HBase-1.4 #982 (See 
[https://builds.apache.org/job/HBase-1.4/982/])
HBASE-19138 Rare failure in TestLruBlockCache (apurtell: rev 
5af2c1a3aea69a7a9a9b061f5cc6297918894aa5)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java


> Rare failure in TestLruBlockCache
> -
>
> Key: HBASE-19138
> URL: https://issues.apache.org/jira/browse/HBASE-19138
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>
> Attachments: HBASE-19138.patch
>
>
> [ERROR] 
> testCacheEvictionThreadSafe(org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)
>  Time elapsed: 0.028 s <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<1>
> at 
> org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.testCacheEvictionThreadSafe(TestLruBlockCache.java:86)
> The fix for this, I think, is waiting for the block count to drop to zero 
> after awaiting shutdown of the executor pool. There's a rare race here. Tests 
> look good using a Waiter predicate so far, but I haven't finished all 100 
> iterations yet.
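A sketch of the Waiter-predicate fix described above, assuming the test's existing pool, CONF, and cache variables; Waiter.waitFor is the HBase test utility, the surrounding code is illustrative:

{code}
// After shutting down the executor pool, wait for the rare in-flight
// eviction to finish instead of asserting the block count immediately.
pool.shutdown();
pool.awaitTermination(10, java.util.concurrent.TimeUnit.SECONDS);
Waiter.waitFor(CONF, 10000, new Waiter.Predicate<Exception>() {
  @Override
  public boolean evaluate() throws Exception {
    return cache.getBlockCount() == 0;  // passes once eviction drains
  }
});
{code}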



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-16890) Analyze the performance of AsyncWAL and fix the same

2017-10-31 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233661#comment-16233661
 ] 

Anoop Sam John commented on HBASE-16890:


Let's work on this after Alpha-4. This is a very important JIRA for 2.0.

> Analyze the performance of AsyncWAL and fix the same
> 
>
> Key: HBASE-16890
> URL: https://issues.apache.org/jira/browse/HBASE-16890
> Project: HBase
>  Issue Type: Sub-task
>  Components: wal
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
> Attachments: AsyncWAL_disruptor.patch, AsyncWAL_disruptor_1 
> (2).patch, AsyncWAL_disruptor_3.patch, AsyncWAL_disruptor_3.patch, 
> AsyncWAL_disruptor_4.patch, AsyncWAL_disruptor_6.patch, 
> HBASE-16890-rc-v2.patch, HBASE-16890-rc-v3.patch, 
> HBASE-16890-remove-contention-v1.patch, HBASE-16890-remove-contention.patch, 
> Screen Shot 2016-10-25 at 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 
> PM.png, Screen Shot 2016-10-25 at 7.39.48 PM.png, Screen Shot 2016-11-04 at 
> 5.21.27 PM.png, Screen Shot 2016-11-04 at 5.30.18 PM.png, async.svg, 
> classic.svg, contention.png, contention_defaultWAL.png
>
>
> Tests reveal that AsyncWAL under load in a single-node cluster performs more 
> slowly than the default WAL. This task is to analyze and see if we can fix it.
> See the discussion in the tail of HBASE-15536.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17204) Make L2 off heap cache default ON

2017-10-31 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233658#comment-16233658
 ] 

Anoop Sam John commented on HBASE-17204:


Working on this. Doing some more perf tests based on the Alpha-3 release and on 
a smaller data size. Will also do a long-running test soon.

> Make L2 off heap cache default ON
> -
>
> Key: HBASE-17204
> URL: https://issues.apache.org/jira/browse/HBASE-17204
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
>
> The L2 cache can be used for data blocks. By default it is off now. After the 
> HBASE-11425 work, the L2 off-heap cache performs on par with the L1 on-heap 
> cache. Under heavily loaded workloads, it can even outperform the L1 cache. 
> Please see the recently published report by Alibaba. This work was also 
> backported by Rocketfuel, with a similar performance improvement report from 
> them.
> Let us turn the L2 off-heap cache ON. As it is off heap, we can have a much 
> larger L2 block cache. What should be the default size?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17065) Perform more effective sorting for RPC Handler Tasks

2017-10-31 Thread Reid Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233633#comment-16233633
 ] 

Reid Chan commented on HBASE-17065:
---

Hi [~tedyu],
I'm working on this but haven't had a moment to test yet; please allow me some time.

> Perform more effective sorting for RPC Handler Tasks
> 
>
> Key: HBASE-17065
> URL: https://issues.apache.org/jira/browse/HBASE-17065
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>  Labels: rpc, ui
> Attachments: rpc-hdlr-tasks.png
>
>
> Under the 'Show All RPC Handler Tasks' tab, the tasks are not sorted 
> according to the duration they have been in the current state (see picture).
> In TaskMonitorTmpl.jamon :
> {code}
> long now = System.currentTimeMillis();
> Collections.reverse(tasks);
> {code}
> The underlying tasks are backed by CircularFifoBuffer, leading to ineffective 
> sorting.
> We should display tasks with the longest duration (in current state) first.
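A sketch of the duration-based ordering the description asks for, assuming MonitoredTask exposes the timestamp at which its current state was entered (getStateTime() here is an assumption); the oldest state timestamp, i.e. the longest duration, sorts first:

{code}
// Instead of Collections.reverse(tasks): sort by time spent in the
// current state, longest first. getStateTime() is assumed to return
// the millis at which the task entered its current state.
Collections.sort(tasks, new Comparator<MonitoredTask>() {
  @Override
  public int compare(MonitoredTask a, MonitoredTask b) {
    return Long.compare(a.getStateTime(), b.getStateTime());
  }
});
{code}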



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17065) Perform more effective sorting for RPC Handler Tasks

2017-10-31 Thread Reid Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233634#comment-16233634
 ] 

Reid Chan commented on HBASE-17065:
---

If you don't mind.

> Perform more effective sorting for RPC Handler Tasks
> 
>
> Key: HBASE-17065
> URL: https://issues.apache.org/jira/browse/HBASE-17065
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>  Labels: rpc, ui
> Attachments: rpc-hdlr-tasks.png
>
>
> Under the 'Show All RPC Handler Tasks' tab, the tasks are not sorted 
> according to the duration they have been in the current state (see picture).
> In TaskMonitorTmpl.jamon :
> {code}
> long now = System.currentTimeMillis();
> Collections.reverse(tasks);
> {code}
> The underlying tasks are backed by CircularFifoBuffer, leading to ineffective 
> sorting.
> We should display tasks with the longest duration (in current state) first.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-17065) Perform more effective sorting for RPC Handler Tasks

2017-10-31 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-17065:
---
Description: 
Under the 'Show All RPC Handler Tasks' tab, the tasks are not sorted according 
to the duration they have been in the current state (see picture).

In TaskMonitorTmpl.jamon :
{code}
long now = System.currentTimeMillis();
Collections.reverse(tasks);
{code}

The underlying tasks are backed by CircularFifoBuffer, leading to ineffective 
sorting.

We should display tasks with the longest duration (in current state) first.

  was:
Under the 'Show All RPC Handler Tasks' tab, the tasks are not sorted according 
to the duration they have been in the current state (see picture).

In TaskMonitorTmpl.jamon :
{code}
long now = System.currentTimeMillis();
Collections.reverse(tasks);
{code}
The underlying tasks are backed by CircularFifoBuffer, leading to ineffective 
sorting.

We should display tasks with the longest duration (in current state) first.


> Perform more effective sorting for RPC Handler Tasks
> 
>
> Key: HBASE-17065
> URL: https://issues.apache.org/jira/browse/HBASE-17065
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Minor
>  Labels: rpc, ui
> Attachments: rpc-hdlr-tasks.png
>
>
> Under the 'Show All RPC Handler Tasks' tab, the tasks are not sorted 
> according to the duration they have been in the current state (see picture).
> In TaskMonitorTmpl.jamon :
> {code}
> long now = System.currentTimeMillis();
> Collections.reverse(tasks);
> {code}
> The underlying tasks are backed by CircularFifoBuffer, leading to ineffective 
> sorting.
> We should display tasks with the longest duration (in current state) first.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HBASE-17499) Bound the total heap memory used for the rolling average of RegionLoads

2017-10-31 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-17499.

Resolution: Later

> Bound the total heap memory used for the rolling average of RegionLoads
> ---
>
> Key: HBASE-17499
> URL: https://issues.apache.org/jira/browse/HBASE-17499
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Priority: Major
>  Labels: balancer
>
> Currently "hbase.master.balancer.stochastic.numRegionLoadsToRemember" 
> controls the number of RegionLoads which are kept by StochasticLoadBalancer 
> for each region.
> The parameter doesn't take into account the number of regions in the cluster.
> Meaning, the amount of heap consumed by RegionLoads would be out of norm for 
> cluster with large number of regions.
> This issue is to see if we should bound the total heap memory used for the 
> rolling average of RegionLoads.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19127) Set State.SPLITTING, MERGING, MERGING_NEW, SPLITTING_NEW properly in RegionStatesNode

2017-10-31 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233629#comment-16233629
 ] 

Jerry He commented on HBASE-19127:
--

Will the change/shuffle of the state codes affect backward compatibility? For 
example, could old proc WALs no longer be read correctly?
If this is new code, it should be fine.
Just asking.

> Set State.SPLITTING, MERGING, MERGING_NEW, SPLITTING_NEW properly in 
> RegionStatesNode
> -
>
> Key: HBASE-19127
> URL: https://issues.apache.org/jira/browse/HBASE-19127
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Major
> Attachments: state.patch
>
>
> In the current code we never set the above states on a region node at all, 
> but we still have statements like the one below that check whether a node is 
> in those states.
> {code}
> else if (!regionNode.isInState(State.CLOSING, State.SPLITTING)) {
> 
> }
> {code}
> We need to set these states in the correct places.
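A minimal sketch of the kind of fix proposed, with illustrative names only: the idea is simply that the split/merge procedures mark the node before checks like the one above run:

{code}
// Illustrative: early in the split procedure, record the transition on
// the node so isInState(State.SPLITTING) checks actually observe it.
RegionStateNode regionNode = regionStates.getOrCreateRegionStateNode(regionInfo);
regionNode.setState(State.SPLITTING);

// Symmetrically, set MERGING on merging parents and SPLITTING_NEW /
// MERGING_NEW on the daughter / merged regions when they are created.
{code}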



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18827) Site build takes 1.5 hours; it looks like it is stuck...

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233625#comment-16233625
 ] 

stack commented on HBASE-18827:
---

Still taking forever. I need to figure out why...

> Site build takes 1.5 hours; it looks like it is stuck...
> 
>
> Key: HBASE-18827
> URL: https://issues.apache.org/jira/browse/HBASE-18827
> Project: HBase
>  Issue Type: Bug
>  Components: site
>Reporter: stack
>Priority: Minor
>
> I'm trying to make a release. I'm in a tizzy as is usual around these times*. 
> 1.5 hours seems totally over-the-top. I think [~misty]'s lovely automation 
> has hidden this fact from us, but it needs digging into why we are taking so 
> long. The cycle seems to be provoked by the hbase-archetypes module, but I got 
> 'mvn log glaze disease' as soon as I tried digging in.
> Filing this issue in case someone else wants to have a go at it before I do. 
> Also filing if only to note that the thing does eventually finish, in case I 
> forget.
> * I go to build the doc/site and the build never seems to end. First there is 
> the issue HBASE-18821 where we were NPE'ing after a bunch of time had passed. 
> My fix for HBASE-18821 then got me further, but after what seemed like hours 
> it failed with a cryptic message (thanks [~Apache9] for figuring out that I'd 
> made an incorrect fix over in HBASE-18821). Next up I'm looking at a cycle 
> that never seems to end... only it eventually does, after 90 minutes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19102) TestZooKeeperMainServer fails with KeeperException$ConnectionLossException

2017-10-31 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-19102:
--
Attachment: HBASE-19102.master.002.patch

Retry. Stochastic balancer failed.

> TestZooKeeperMainServer fails with KeeperException$ConnectionLossException
> --
>
> Key: HBASE-19102
> URL: https://issues.apache.org/jira/browse/HBASE-19102
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Major
> Attachments: HBASE-19102.master.001.patch, 
> HBASE-19102.master.002.patch, HBASE-19102.master.002.patch
>
>
> I'm trying to run the test suite on a local machine. I never get to the 
> second part because I fail on the test below with the exception below nearly 
> every time (and an ipv6 test... will do that next).
> 1 
> ---
>   2 Test set: org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServer
>   3 
> ---
>   4 Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 16.161 s 
> <<< FAILURE! - in org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServer
>   5 
> testCommandLineWorks(org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServer)
>   Time elapsed: 15.848 s  <<< ERROR!
>   6 org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss for /testCommandLineWorks
>   7   at 
> org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServer.testCommandLineWorks(TestZooKeeperMainServer.java:81)
> Looks like running the command before we are connected causes the above -- we 
> pause 15 seconds and then throw the above. If I wait until connected before 
> proceeding, stuff seems to work reliably. I don't have access to the watcher 
> on connections since we override the zk main class... so this seems the only 
> avenue available at the moment (this zk main thing is all a hack around zk 
> main because it had bugs... but I think we have to keep the hack because 
> folks use different versions of zk. My $workplace defaults to something that 
> is years old, 3.4.5 for instance).
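A sketch of the wait-until-connected approach described above, using a plain ZooKeeper client and a latch; the quorum string and timeouts are illustrative:

{code}
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Block until the client reports SyncConnected before issuing the command,
// instead of racing the connection (the 15 second pause then failure above).
String quorum = "localhost:2181";  // example quorum for illustration
final CountDownLatch connected = new CountDownLatch(1);
ZooKeeper zk = new ZooKeeper(quorum, 30000, new Watcher() {
  @Override
  public void process(WatchedEvent event) {
    if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
      connected.countDown();
    }
  }
});
if (!connected.await(30, TimeUnit.SECONDS)) {
  throw new IOException("Timed out waiting for ZooKeeper connection");
}
// ... now run the ZooKeeperMainServer command safely.
{code}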



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18624) Added support for clearing BlockCache based on table name

2017-10-31 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18624:
--
Status: In Progress  (was: Patch Available)

> Added support for clearing BlockCache based on table name
> -
>
> Key: HBASE-18624
> URL: https://issues.apache.org/jira/browse/HBASE-18624
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Ajay Jadhav
>Assignee: Zach York
>Priority: Major
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18624.branch-1.001.patch, 
> HBASE-18624.master.001.patch, HBASE-18624.master.002.patch, 
> HBASE-18624.master.003.patch, HBASE-18624.master.004.patch, 
> HBASE-18624.master.005.patch, HBASE-18624.master.006.patch, 
> HBASE-18624.master.007.patch, HBASE-18624.master.008.patch
>
>
> Bulk loading the primary HBase cluster triggers a lot of compactions, 
> resulting in the archival/creation of multiple HFiles. This process will 
> cause a lot of items to become stale in the replica's BlockCache.
> This patch will help users clear the block cache for a given table using 
> either the shell or the API.
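A sketch of the API side, assuming the patch adds an Admin-level method along the lines of clearBlockCache(TableName); the method name and signature here are assumptions based on the issue title, and conf is the client Configuration:

{code}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

try (Connection conn = ConnectionFactory.createConnection(conf);
     Admin admin = conn.getAdmin()) {
  // Evict every cached block belonging to the table on all region
  // servers, e.g. after bulk loads have made replica cache entries stale.
  admin.clearBlockCache(TableName.valueOf("my_table"));
}
{code}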



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18624) Added support for clearing BlockCache based on table name

2017-10-31 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18624:
--
Status: Patch Available  (was: In Progress)

> Added support for clearing BlockCache based on table name
> -
>
> Key: HBASE-18624
> URL: https://issues.apache.org/jira/browse/HBASE-18624
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 1.3.0, 2.0.0
>Reporter: Ajay Jadhav
>Assignee: Zach York
>Priority: Major
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18624.branch-1.001.patch, 
> HBASE-18624.master.001.patch, HBASE-18624.master.002.patch, 
> HBASE-18624.master.003.patch, HBASE-18624.master.004.patch, 
> HBASE-18624.master.005.patch, HBASE-18624.master.006.patch, 
> HBASE-18624.master.007.patch, HBASE-18624.master.008.patch
>
>
> Bulk loading the primary HBase cluster triggers a lot of compactions, 
> resulting in the archival/creation of multiple HFiles. This process will 
> cause a lot of items to become stale in the replica's BlockCache.
> This patch will help users clear the block cache for a given table using 
> either the shell or the API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18912) Update Admin methods to return Lists instead of arrays

2017-10-31 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233588#comment-16233588
 ] 

Guanghao Zhang commented on HBASE-18912:


As I said in the comment above, absent some other good reason I don't think we 
need a change that merely switches the return type from array to List. I will 
come back to this after resolving the other sub-tasks of HBASE-18805. Thanks.

> Update Admin methods to return Lists instead of arrays
> --
>
> Key: HBASE-18912
> URL: https://issues.apache.org/jira/browse/HBASE-18912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
>Priority: Major
> Fix For: 2.0.0-beta-1
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18950) Remove Optional parameters in AsyncAdmin interface

2017-10-31 Thread Guanghao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233582#comment-16233582
 ] 

Guanghao Zhang commented on HBASE-18950:


UT passed. The checkstyle complaint is that an error message line is too long. 
Do we have a formatting guide for this?
Ping [~Apache9] [~appy] for review.

> Remove Optional parameters in AsyncAdmin interface
> --
>
> Key: HBASE-18950
> URL: https://issues.apache.org/jira/browse/HBASE-18950
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Duo Zhang
>Assignee: Guanghao Zhang
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18950.master.001.patch, 
> HBASE-18950.master.002.patch, HBASE-18950.master.003.patch, 
> HBASE-18950.master.004.patch, HBASE-18950.master.005.patch, 
> HBASE-18950.master.005.patch, HBASE-18950.master.006.patch, 
> HBASE-18950.master.007.patch, HBASE-18950.master.008.patch, 
> HBASE-18950.master.008.patch, HBASE-18950.master.009.patch, 
> HBASE-18950.master.010.patch, HBASE-18950.master.011.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19054) Switch precommit docker image to one based on maven images

2017-10-31 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233568#comment-16233568
 ] 

Duo Zhang commented on HBASE-19054:
---

I think the magic is either the installation order or something in Yetus. +1 
on a separate issue to address it.

> Switch precommit docker image to one based on maven images
> --
>
> Key: HBASE-19054
> URL: https://issues.apache.org/jira/browse/HBASE-19054
> Project: HBase
>  Issue Type: Bug
>  Components: build, community
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.0.0-alpha-4
>
> Attachments: HBASE-19054.patch, HBASE-19054.v2.patch
>
>
> In HBASE-19042 we discuss moving to a maven-based image for docker instead of 
> going through gymnastics ourselves. We got a short term fix in there, but 
> let's do the bulk of the cleanup here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233562#comment-16233562
 ] 

Duo Zhang commented on HBASE-19114:
---

{quote}
It has all zk stuff, from both client and server. So even if client's stuff is 
gone, server's requirements will remain there.
{quote}
In general I would like hbase-client to depend only on the stuff used by the 
client, not the server, so we may have two zk modules, and after the purge the 
client-side module will only have one or two classes.

An hbase-common conflict is easier to address, as the user can just declare a 
3.0 dependency in the pom to force a 3.0 hbase-common onto the classpath. But 
for hbase-zk you need to find out who introduced it and exclude it there, and 
maybe there are lots of them... Anyway, there is plenty of maven magic, so the 
problem can be addressed, no doubt. But I think we'd better not add more magic.

The zk stuff in hbase-client is mainly used by the sync client; the old plan is 
to replace the sync client implementation with the async client in 3.0. Now, if 
you really want the zk split in 2.0, then I think we could first purge the zk 
stuff from the sync client in 2.0.

Thanks.

> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
>Priority: Major
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19054) Switch precommit docker image to one based on maven images

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233561#comment-16233561
 ] 

Mike Drob commented on HBASE-19054:
---

[~Apache9] - Oh, I thought there was also some magic to make sure that we use 
the correct JDK even when both are installed. I'll make a separate issue for 
the branch-1 changes and let nightly test them, the same as I did here.

> Switch precommit docker image to one based on maven images
> --
>
> Key: HBASE-19054
> URL: https://issues.apache.org/jira/browse/HBASE-19054
> Project: HBase
>  Issue Type: Bug
>  Components: build, community
>Reporter: Mike Drob
>Assignee: Mike Drob
>Priority: Major
> Fix For: 3.0.0, 2.0.0-alpha-4
>
> Attachments: HBASE-19054.patch, HBASE-19054.v2.patch
>
>
> In HBASE-19042 we discuss moving to a maven-based image for docker instead of 
> going through gymnastics ourselves. We got a short term fix in there, but 
> let's do the bulk of the cleanup here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19145) Look into hbase-2 client going to hbase-1 server

2017-10-31 Thread Jerry He (JIRA)
Jerry He created HBASE-19145:


 Summary: Look into hbase-2 client going to hbase-1 server
 Key: HBASE-19145
 URL: https://issues.apache.org/jira/browse/HBASE-19145
 Project: HBase
  Issue Type: Task
Affects Versions: 2.0.0-beta-1
Reporter: Jerry He
Priority: Major


From the "[DISCUSS] hbase-2.0.0 compatibility expectations" thread.

Do we support an hbase-2 client going against an hbase-1 server?
We seem to be fine mixing and matching clients and servers within the
hbase-1 releases, and IIRC an hbase-1 client is OK against a 0.98 server.
Suppose I have a product that depends on and bundles the HBase client. I
want to upgrade the dependency to hbase-2 so that it can take
advantage of and claim support for hbase-2.
But does that mean I will need to drop the claim that the new version
of the product supports any hbase-1 backend?

It has not been an objective. It might work for basic Client API calls on a
later branch-1 but will fail for Admin functions (and for figuring out whether
a Table is online). If it is a small thing to make it work, let's get it in.

Let's look into it to see what works and what does not, and have a statement at 
least.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233546#comment-16233546
 ] 

Appy commented on HBASE-19114:
--

But I'd still like to understand what I was missing!

bq. ...in the future hbase-client will only depend on a small set of zk-related 
code, so it makes no sense to have a module that only has one class, right?
It has all the zk stuff, from both client and server. So even if the client's 
stuff is gone, the server's requirements will remain there.


bq. But if in 2.0 you make hbase-client depend on an external hbase-zookeeper 
module, then it is not a good idea to remove that dependency in 3.0. It is very 
easy for a user to have a 3.0 hbase-client on the classpath and also a 2.0 
hbase-zookeeper on the classpath.
Why won't it be easy to remove the dep? How's that worse than having 
hbase-client 3.0 and hbase-common 2.0?

> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
>Priority: Major
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19103) Add BigDecimalComparator for filter

2017-10-31 Thread Qilin Cao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233542#comment-16233542
 ] 

Qilin Cao commented on HBASE-19103:
---

[~tedyu][~Jan Hentschel] Thanks for your suggestion! I should optimize it.

> Add BigDecimalComparator for filter
> ---
>
> Key: HBASE-19103
> URL: https://issues.apache.org/jira/browse/HBASE-19103
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Reporter: Qilin Cao
>Assignee: Qilin Cao
>Priority: Minor
> Fix For: 1.2.0, 3.0.0
>
> Attachments: HBASE-19103-1.2.0-v1.patch, HBASE-19103-1.2.0-v2.patch, 
> HBASE-19103-trunk-v1.patch, HBASE-19103-trunk-v2.patch
>
>
> Should I add a BigDecimalComparator for filters? Scenarios that need exact 
> decimal comparisons on the data may use it.
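A sketch of how such a comparator could be used with a filter, assuming the patch lands it as org.apache.hadoop.hbase.filter.BigDecimalComparator with a BigDecimal constructor (the class location and signature are assumptions):

{code}
import java.math.BigDecimal;
import org.apache.hadoop.hbase.CompareOperator;
import org.apache.hadoop.hbase.filter.BigDecimalComparator;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Match rows whose cell value equals 12.34 with exact decimal semantics,
// avoiding the rounding pitfalls of comparing serialized doubles.
SingleColumnValueFilter filter = new SingleColumnValueFilter(
    Bytes.toBytes("cf"), Bytes.toBytes("amount"),
    CompareOperator.EQUAL,
    new BigDecimalComparator(new BigDecimal("12.34")));
{code}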



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233530#comment-16233530
 ] 

Appy commented on HBASE-19114:
--

I don't completely understand it, but your last comment suggests that this 
change can hurt your future effort, so let me do as you say - not have zk stuff 
required by hbase-client in this new module. Is that correct sir?


> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
>Priority: Major
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19118) Use SaslUtil to set Sasl.QOP in 'Thrift'

2017-10-31 Thread Reid Chan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16232725#comment-16232725
 ] 

Reid Chan commented on HBASE-19118:
---

Any suggestions, [~elserj]?

> Use SaslUtil to set Sasl.QOP in 'Thrift'
> 
>
> Key: HBASE-19118
> URL: https://issues.apache.org/jira/browse/HBASE-19118
> Project: HBase
>  Issue Type: Bug
>  Components: Thrift
>Affects Versions: 1.2.6
>Reporter: Reid Chan
>Assignee: Reid Chan
>Priority: Major
> Attachments: HBASE-19118.master.001.patch, 
> HBASE-19118.master.002.patch, HBASE-19118.master.003.patch
>
>
> In [Configure the Thrift 
> Gateway|http://hbase.apache.org/book.html#security.client.thrift], it says 
> "set the property {{hbase.thrift.security.qop}} to one of the following three 
> values: {{privacy}}, {{integrity}}, {{authentication}}", which leads to a 
> failure when starting up a thrift server.
> In fact, the value of {{hbase.thrift.security.qop}} should be {{auth}}, 
> {{auth-int}}, or {{auth-conf}}, according to the documentation of {{Sasl.QOP}}.
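For reference, Sasl.QOP is the standard javax.security.sasl property key; a minimal illustration of the valid tokens (the property map itself is just an example):

{code}
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

Map<String, String> saslProps = new HashMap<>();
// Valid tokens per javax.security.sasl.Sasl#QOP:
//   "auth"      - authentication only   (book says "authentication")
//   "auth-int"  - auth + integrity      (book says "integrity")
//   "auth-conf" - auth + confidentiality (book says "privacy")
saslProps.put(Sasl.QOP, "auth-conf");
{code}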



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16232593#comment-16232593
 ] 

Duo Zhang commented on HBASE-19114:
---

Then if someone depends on netty-all while others depend on the individual 
netty modules, it will be a total mess. That's one of the reasons we want to 
shade netty: to prevent breaking downstream users, and also to prevent us from 
being broken by hadoop.

Can we delay this for a while? Is it necessary for the 2.0 release? As I said 
above, in the future hbase-client will only depend on a small set of zk-related 
code, so it makes no sense to have a module that only has one class, right? But 
if in 2.0 you make hbase-client depend on an external hbase-zookeeper module, 
then it is not a good idea to remove that dependency in 3.0. It is very easy 
for a user to have a 3.0 hbase-client on the classpath and also a 2.0 
hbase-zookeeper on the classpath. It is annoying.

For me, I would like this:
1. Move ZNodePaths to hbase-common, along with a very small ZKUtil that has 
only a handful of helper methods, such as removing the prefix. hbase-common 
will not depend on zk.
2. hbase-client will implement the zk logic on its own with Curator, with no 
external dependency.
3. Introduce a hbase-zookeeper module that implements all the zk-related logic 
for hbase-server, and make hbase-server depend on it to make hbase-server less 
complex.

Thanks.

> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
>Priority: Major
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18624) Added support for clearing BlockCache based on table name

2017-10-31 Thread Zach York (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-18624:
--
Attachment: HBASE-18624.master.008.patch

> Added support for clearing BlockCache based on table name
> -
>
> Key: HBASE-18624
> URL: https://issues.apache.org/jira/browse/HBASE-18624
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Ajay Jadhav
>Assignee: Zach York
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-18624.branch-1.001.patch, 
> HBASE-18624.master.001.patch, HBASE-18624.master.002.patch, 
> HBASE-18624.master.003.patch, HBASE-18624.master.004.patch, 
> HBASE-18624.master.005.patch, HBASE-18624.master.006.patch, 
> HBASE-18624.master.007.patch, HBASE-18624.master.008.patch
>
>
> Bulk loading the primary HBase cluster triggers a lot of compactions, 
> resulting in the archival/creation of multiple HFiles. This process will 
> cause a lot of items to become stale in the replica's BlockCache.
> This patch will help users clear the block cache for a given table using 
> either the shell or the API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227804#comment-16227804
 ] 

Appy commented on HBASE-19114:
--

bq. I do not think people will like multi modules; for example, we always 
depend on netty-all instead of the individual netty modules, right? The issue 
will also affect hbase-client, so I'm a little nervous. I haven't stopped the 
http module separation since it is server only.

Indeed! That is a very valid point [~Apache9]!
It made me think: does netty really build everything as one module!? That would 
be one crazy codebase to maintain.
I looked for io.netty on the central mvn repo 
(https://mvnrepository.com/artifact/io.netty) and found ~40 artifacts 
associated with it. So netty-all is probably made by merging all of them 
together for the convenience of users. It would have been a nightmare if we had 
to pick individual jars ourselves; netty-all does make life much simpler!

I am thinking we could do something similar - publish an hbase-all. That would 
prevent breakages in downstream projects like the recent ones in pig, hive, 
crunch, kite, etc. when we split out hbase-mapreduce.
Admittedly, it'll be one big jar, but if it makes life easier for downstream 
projects, won't they really like it?

> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12091:
--
Attachment: 12091-v2-branch-1.txt

Added a test.

Note that the test currently fails.
Also note that I added two constructors to RetriesExhaustedWithDetailsException 
so that it can be instantiated from a RemoteException.

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 1.4.0
>
> Attachments: 12091-v2-branch-1.txt, 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still had outstanding edits that it now could not 
> get rid of. All replication is now backed up behind these unreplicable edits.
> We should have an option to ignore edits for tables dropped at the source.
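A minimal sketch of the proposed option, under stated assumptions: the config key is hypothetical and ship() stands in for the normal shipping path; the point is only that, when the flag is set, the source skips edits whose table no longer exists:

{code}
// Hypothetical config key, for illustration only.
boolean ignoreDropped = conf.getBoolean(
    "hbase.replication.drop.on.deleted.table", false);

for (WAL.Entry entry : entries) {
  TableName table = entry.getKey().getTablename();
  if (ignoreDropped && !admin.tableExists(table)) {
    // The table was dropped at the source: skip the edit rather than
    // retrying forever and backing up the whole replication queue.
    continue;
  }
  ship(entry);  // stand-in for the normal replication path
}
{code}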



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19141:
---
Status: Patch Available  (was: Open)

>  [compat 1-2] getClusterStatus always return empty ClusterStatus
> 
>
> Key: HBASE-19141
> URL: https://issues.apache.org/jira/browse/HBASE-19141
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19141.v0.patch, HBASE-19141.v1.patch
>
>
> In 2.0 we are able to limit the scope of a request to get only part of 
> {{ClusterStatus}}. However, the request sent by a 1.x client carries no scope 
> info with which to retrieve any information from {{ClusterStatus}}.
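A sketch of the likely server-side fix, with the RPC-layer names (request, toOptions) treated as illustrative stand-ins: when the incoming request carries no scope options, default to all of them rather than returning an empty ClusterStatus:

{code}
// 'request' and 'toOptions' are illustrative stand-ins for the RPC layer.
EnumSet<ClusterStatus.Option> options = request.getOptionsCount() == 0
    ? EnumSet.allOf(ClusterStatus.Option.class)  // legacy 1.x client: everything
    : toOptions(request.getOptionsList());       // 2.x client: as requested
ClusterStatus status = master.getClusterStatus(options);
{code}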



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19141:
---
Attachment: HBASE-19141.v1.patch

>  [compat 1-2] getClusterStatus always return empty ClusterStatus
> 
>
> Key: HBASE-19141
> URL: https://issues.apache.org/jira/browse/HBASE-19141
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19141.v0.patch, HBASE-19141.v1.patch
>
>
> In 2.0 we are able to limit the scope of a request to get only part of 
> {{ClusterStatus}}. However, the request sent by a 1.x client carries no scope 
> info with which to retrieve any information from {{ClusterStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19141:
---
Status: Open  (was: Patch Available)

>  [compat 1-2] getClusterStatus always return empty ClusterStatus
> 
>
> Key: HBASE-19141
> URL: https://issues.apache.org/jira/browse/HBASE-19141
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19141.v0.patch
>
>
> In 2.0 we are able to limit the scope of a request to get only part of 
> {{ClusterStatus}}. However, the request sent by a 1.x client carries no scope 
> info with which to retrieve any information from {{ClusterStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-17706) TableSkewCostFunction improperly computes max skew

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-17706:
---
  Resolution: Duplicate
Hadoop Flags:   (was: Reviewed)
  Status: Resolved  (was: Patch Available)

> TableSkewCostFunction improperly computes max skew
> --
>
> Key: HBASE-17706
> URL: https://issues.apache.org/jira/browse/HBASE-17706
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HBASE-17706-01.patch, HBASE-17706-02.patch, 
> HBASE-17706-03.patch, HBASE-17706-04.patch, HBASE-17706-05.patch, 
> HBASE-17706-06.patch, HBASE-17706-07.patch, HBASE-17706.patch
>
>
> We noticed while running unit tests that the TableSkewCostFunction computed 
> cost did not change as the balancer ran and simulated moves across the 
> cluster. After investigating, we found that this happened in particular when 
> the cluster started out with at least one table very strongly skewed.
> We noticed that the TableSkewCostFunction depends on a field of the 
> BaseLoadBalancer.Cluster class called numMaxRegionsPerTable, but this field 
> is not properly maintained as regionMoves are simulated for the cluster. The 
> field only ever increases as the maximum number of regions per table 
> increases, but it does not decrease as the maximum number per table goes down.
> This patch corrects that behavior so that the field is accurately maintained, 
> and thus the TableSkewCostFunction produces a more correct value as the 
> balancer runs.
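A sketch of the maintenance the description calls for, using the field name it mentions plus the balancer's per-server/per-table counts (numRegionsPerServerPerTable here is an assumption): recompute the table's max after each simulated move so it can decrease as well as increase:

{code}
// Recompute the max regions-per-server for one table after a simulated
// move, instead of only ever ratcheting numMaxRegionsPerTable upward.
void recomputeMaxRegionsPerTable(int tableIndex,
    int[][] numRegionsPerServerPerTable, int[] numMaxRegionsPerTable) {
  int max = 0;
  for (int[] perTable : numRegionsPerServerPerTable) {
    max = Math.max(max, perTable[tableIndex]);
  }
  numMaxRegionsPerTable[tableIndex] = max;  // may be lower than before
}
{code}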



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17706) TableSkewCostFunction improperly computes max skew

2017-10-31 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227787#comment-16227787
 ] 

Chia-Ping Tsai commented on HBASE-17706:


I had already fixed this in HBASE-18480. Will close this JIRA as a duplicate.

> TableSkewCostFunction improperly computes max skew
> --
>
> Key: HBASE-17706
> URL: https://issues.apache.org/jira/browse/HBASE-17706
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HBASE-17706-01.patch, HBASE-17706-02.patch, 
> HBASE-17706-03.patch, HBASE-17706-04.patch, HBASE-17706-05.patch, 
> HBASE-17706-06.patch, HBASE-17706-07.patch, HBASE-17706.patch
>
>
> We noticed while running unit tests that the TableSkewCostFunction computed 
> cost did not change as the balancer ran and simulated moves across the 
> cluster. After investigating, we found that this happened in particular when 
> the cluster started out with at least one table very strongly skewed.
> We noticed that the TableSkewCostFunction depends on a field of the 
> BaseLoadBalancer.Cluster class called numMaxRegionsPerTable, but this field 
> is not properly maintained as regionMoves are simulated for the cluster. The 
> field only ever increases as the maximum number of regions per table 
> increases, but it does not decrease as the maximum number per table goes down.
> This patch corrects that behavior so that the field is accurately maintained, 
> and thus the TableSkewCostFunction produces a more correct value as the 
> balancer runs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17706) TableSkewCostFunction improperly computes max skew

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227783#comment-16227783
 ] 

stack commented on HBASE-17706:
---

Should we up the priority on this [~chia7712]? Pull it into beta-1?

> TableSkewCostFunction improperly computes max skew
> --
>
> Key: HBASE-17706
> URL: https://issues.apache.org/jira/browse/HBASE-17706
> Project: HBase
>  Issue Type: Bug
>  Components: Balancer
>Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43 
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
>Reporter: Kahlil Oppenheimer
>Assignee: Kahlil Oppenheimer
>  Labels: patch
> Fix For: 2.0.0
>
> Attachments: HBASE-17706-01.patch, HBASE-17706-02.patch, 
> HBASE-17706-03.patch, HBASE-17706-04.patch, HBASE-17706-05.patch, 
> HBASE-17706-06.patch, HBASE-17706-07.patch, HBASE-17706.patch
>
>
> We noticed while running unit tests that the TableSkewCostFunction computed 
> cost did not change as the balancer ran and simulated moves across the 
> cluster. After investigating, we found that this happened in particular when 
> the cluster started out with at least one table very strongly skewed.
> We noticed that the TableSkewCostFunction depends on a field of the 
> BaseLoadBalancer.Cluster class called numMaxRegionsPerTable, but this field 
> is not properly maintained as regionMoves are simulated for the cluster. The 
> field only ever increases as the maximum number of regions per table 
> increases, but it does not decrease as the maximum number per table goes down.
> This patch corrects that behavior so that the field is accurately maintained, 
> and thus the TableSkewCostFunction produces a more correct value as the 
> balancer runs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19138) Rare failure in TestLruBlockCache

2017-10-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227778#comment-16227778
 ] 

Hudson commented on HBASE-19138:


SUCCESS: Integrated in Jenkins build HBase-1.5 #125 (See 
[https://builds.apache.org/job/HBase-1.5/125/])
HBASE-19138 Rare failure in TestLruBlockCache (apurtell: rev 
0ac5b337474a96a2e7a73fd2ab22bd25e852df12)
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestLruBlockCache.java


> Rare failure in TestLruBlockCache
> -
>
> Key: HBASE-19138
> URL: https://issues.apache.org/jira/browse/HBASE-19138
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>
> Attachments: HBASE-19138.patch
>
>
> [ERROR] 
> testCacheEvictionThreadSafe(org.apache.hadoop.hbase.io.hfile.TestLruBlockCache)
>  Time elapsed: 0.028 s <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<1>
> at 
> org.apache.hadoop.hbase.io.hfile.TestLruBlockCache.testCacheEvictionThreadSafe(TestLruBlockCache.java:86)
> Fix for this I think is waiting for the block count to drop to zero after 
> awaiting shutdown of the executor pool. There's a rare race here. Tests look 
> good using a Waiter predicate so far, but I haven't finished all 100 
> iterations yet. 
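
For reference, the Waiter-based wait described above might look like the following sketch; the {{cache}} handle, {{conf}}, and the 10s bound are assumptions, not the committed test code:

{code}
// Sketch: wait for evictions to settle using o.a.h.h.Waiter before asserting.
// Waiter.waitFor(Configuration, long, Predicate) is existing test API; the
// cache variable and timeout value are illustrative.
Waiter.waitFor(conf, 10000, new Waiter.Predicate<Exception>() {
  @Override
  public boolean evaluate() throws Exception {
    return cache.getBlockCount() == 0;
  }
});
assertEquals(0, cache.getBlockCount());
{code}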



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18693) adding an option to restore_snapshot to move mob files from archive dir to working dir

2017-10-31 Thread huaxiang sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huaxiang sun updated HBASE-18693:
-
Attachment: HBASE-18693.master.003.patch

Uploaded patch v3 to address the TestShell failure. I ran the other two failed 
test cases locally and they passed.

> adding an option to restore_snapshot to move mob files from archive dir to 
> working dir
> --
>
> Key: HBASE-18693
> URL: https://issues.apache.org/jira/browse/HBASE-18693
> Project: HBase
>  Issue Type: Improvement
>  Components: mob
>Affects Versions: 2.0.0-alpha-2
>Reporter: huaxiang sun
>Assignee: huaxiang sun
> Attachments: HBASE-18693.master.001.patch, 
> HBASE-18693.master.002.patch, HBASE-18693.master.003.patch
>
>
> Today, there is a single mob region where mob files for all user regions are 
> saved. There could be many files (one million) in a single mob directory. 
> When one mob table is restored or cloned from snapshot, links are created for 
> these mob files. This creates a scaling issue for mob compaction. In mob 
> compaction's select() logic, for each hFileLink, it needs to call NN's 
> getFileStatus() to get the size of the linked hfile. Assume that one such 
> call takes 20ms: 20ms * 1 million files = 20,000 seconds, roughly 6 hours. 
> To avoid this overhead, we want to add an option so that restore_snapshot can 
> move mob files from archive dir to working dir. clone_snapshot is more 
> complicated as it can clone a snapshot to a different table, so moving the mob 
> files could destroy the snapshot. No option will be added for clone_snapshot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19131) Add the ClusterStatus hook and cleanup other hooks which can be replaced by ClusterStatus hook

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19131:
---
Status: Patch Available  (was: Open)

> Add the ClusterStatus hook and cleanup other hooks which can be replaced by 
> ClusterStatus hook
> --
>
> Key: HBASE-19131
> URL: https://issues.apache.org/jira/browse/HBASE-19131
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19131.v0.patch
>
>
> This issue will add the hook for {{ClusterStatus}} and cleanup other hooks 
> which can be replaced by {{ClusterStatus}} hook.
> # {{ListDeadServers}} -  introduced by HBASE-18131



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19131) Add the ClusterStatus hook and cleanup other hooks which can be replaced by ClusterStatus hook

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19131:
---
Attachment: HBASE-19131.v0.patch

> Add the ClusterStatus hook and cleanup other hooks which can be replaced by 
> ClusterStatus hook
> --
>
> Key: HBASE-19131
> URL: https://issues.apache.org/jira/browse/HBASE-19131
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19131.v0.patch
>
>
> This issue will add the hook for {{ClusterStatus}} and cleanup other hooks 
> which can be replaced by {{ClusterStatus}} hook.
> # {{ListDeadServers}} -  introduced by HBASE-18131



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HBASE-19135) TestWeakObjectPool time out

2017-10-31 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-19135.
---
   Resolution: Fixed
 Assignee: stack
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.0-beta-1

Pushed to master and branch-2. Thanks for reviews. (Tried to backport to 
branch-1 but the patch failed to apply.)

> TestWeakObjectPool time out
> ---
>
> Key: HBASE-19135
> URL: https://issues.apache.org/jira/browse/HBASE-19135
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19135.master.001.patch
>
>
> On last branch-2 build, this small test timed out. The timeout is 1 second, 
> which given how hostile apache infra has been of late is way too short. Change 
> the test to use a category-based timeout.
> Here is the report:
> {code}
> Regression
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion
> Failing for the past 1 build (Since Failed#778 )
> Took 1.1 sec.
> Error Message
> test timed out after 1000 milliseconds
> Stacktrace
> org.junit.runners.model.TestTimedOutException: test timed out after 1000 
> milliseconds
>   at 
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion(TestWeakObjectPool.java:128)
> Standard Output
> {code}
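
A category-based timeout is typically wired in with a JUnit rule; a sketch following the builder pattern HBase tests used at the time (assumed, not the exact patch):

{code}
// Sketch: replace a hard-coded @Test(timeout=1000) with a timeout derived from
// the test's category (small/medium/large).
@Rule
public final TestRule timeout = CategoryBasedTimeout.builder()
    .withTimeout(this.getClass())
    .withLookingForStuckThread(true)
    .build();
{code}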



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19135) TestWeakObjectPool time out

2017-10-31 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227771#comment-16227771
 ] 

Umesh Agashe commented on HBASE-19135:
--

[~stack], sounds good. +1

> TestWeakObjectPool time out
> ---
>
> Key: HBASE-19135
> URL: https://issues.apache.org/jira/browse/HBASE-19135
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Attachments: HBASE-19135.master.001.patch
>
>
> On last branch-2 build, this small test timed out. The timeout is 1 second, 
> which given how hostile apache infra has been of late is way too short. Change 
> the test to use a category-based timeout.
> Here is the report:
> {code}
> Regression
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion
> Failing for the past 1 build (Since Failed#778 )
> Took 1.1 sec.
> Error Message
> test timed out after 1000 milliseconds
> Stacktrace
> org.junit.runners.model.TestTimedOutException: test timed out after 1000 
> milliseconds
>   at 
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion(TestWeakObjectPool.java:128)
> Standard Output
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19135) TestWeakObjectPool time out

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227769#comment-16227769
 ] 

stack commented on HBASE-19135:
---

[~uagashe] Ok I commit this as is and we can look at doing something more 
generic in another issue? Thanks.

> TestWeakObjectPool time out
> ---
>
> Key: HBASE-19135
> URL: https://issues.apache.org/jira/browse/HBASE-19135
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Attachments: HBASE-19135.master.001.patch
>
>
> On last branch-2 build, this small test timed out. The timeout is 1 second, 
> which given how hostile apache infra has been of late is way too short. Change 
> the test to use a category-based timeout.
> Here is the report:
> {code}
> Regression
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion
> Failing for the past 1 build (Since Failed#778 )
> Took 1.1 sec.
> Error Message
> test timed out after 1000 milliseconds
> Stacktrace
> org.junit.runners.model.TestTimedOutException: test timed out after 1000 
> milliseconds
>   at 
> org.apache.hadoop.hbase.util.TestWeakObjectPool.testCongestion(TestWeakObjectPool.java:128)
> Standard Output
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19131) Add the ClusterStatus hook and cleanup other hooks which can be replaced by ClusterStatus hook

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19131:
---
Release Note: 
1) Add preGetClusterStatus() and postGetClusterStatus() hooks
2) add preGetClusterStatus() to access control check - an admin action
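
For illustration, a coprocessor picking up the new hook might look like this sketch; the signature is assumed to mirror the other MasterObserver pre-hooks:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.coprocessor.MasterCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.MasterObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;

// Sketch of a MasterObserver using the new hook; the exact signature is an
// assumption based on the other pre-hooks (ObserverContext argument,
// IOException to veto the admin action).
public class ClusterStatusAuditObserver implements MasterObserver {
  @Override
  public void preGetClusterStatus(ObserverContext<MasterCoprocessorEnvironment> ctx)
      throws IOException {
    // audit or reject here; throwing IOException vetoes the request
  }
}
{code}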

> Add the ClusterStatus hook and cleanup other hooks which can be replaced by 
> ClusterStatus hook
> --
>
> Key: HBASE-19131
> URL: https://issues.apache.org/jira/browse/HBASE-19131
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
>
> This issue will add the hook for {{ClusterStatus}} and cleanup other hooks 
> which can be replaced by {{ClusterStatus}} hook.
> # {{ListDeadServers}} -  introduced by HBASE-18131



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-13428) Migration to hbase-2.0.0

2017-10-31 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13428:
--
Release Note: * Make sure Distributed Log Replay is not enabled on your 
hbase1 cluster; it was problematic in hbase1 and does not work at all in hbase2.

> Migration to hbase-2.0.0
> 
>
> Key: HBASE-13428
> URL: https://issues.apache.org/jira/browse/HBASE-13428
> Project: HBase
>  Issue Type: Umbrella
>  Components: migration
>Reporter: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
>
> Opening a 2.0 umbrella migration issue. Lets hang off this one any tools and 
> expectations migrating from 1.0 (or earlier) to 2.0. So far there are none 
> that I know of though there is an expectation in HBASE-13373 that hfiles are 
> at least major version 2 and minor version 3.  Lets list all such 
> expectations, etc., here.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19128) Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken.

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227764#comment-16227764
 ] 

stack commented on HBASE-19128:
---

There is nothing to disable when it comes to DLR (it never worked). I do like 
your suggestion that we flag DLR in the release notes for hbase-2.0. Do it here? 
I added a note to the migration doc.

> Purge Distributed Log Replay from codebase, configurations, text; mark the 
> feature as unsupported, broken.
> --
>
> Key: HBASE-19128
> URL: https://issues.apache.org/jira/browse/HBASE-19128
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: stack
>  Labels: incompatible
> Fix For: 2.0.0
>
>
> Kill it. It keeps coming up and over again. Needs proper burial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19121) HBCK for AMv2

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227762#comment-16227762
 ] 

stack commented on HBASE-19121:
---

HBASE-14620 is a bit wrong-headed in my opinion but it had some good stuff in 
it. HBASE-18792 is a subset of this.

HBCK in AMv2 world needs to ask Master to do any edits. If no Master, needs to 
start up one or a subset of Master to run the operations on its behalf.

> HBCK for AMv2
> -
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Sub-task
>  Components: hbck
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18925) Need updated mockito for using java optional

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227759#comment-16227759
 ] 

stack commented on HBASE-18925:
---

I +1'd this already.

> Need updated mockito for using java optional
> 
>
> Key: HBASE-18925
> URL: https://issues.apache.org/jira/browse/HBASE-18925
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18925.master.001.patch, 
> HBASE-18925.master.002.patch, HBASE-18925.master.002.patch, 
> HBASE-18925.master.003.patch, HBASE-18925.master.004.patch, 
> HBASE-18925.master.005.patch, HBASE-18925.master.006.patch, 
> HBASE-18925.master.007.patch, HBASE-18925.master.008.patch, 
> HBASE-18925.master.009.patch
>
>
> Came up when i was trying to test HBASE-18878.
> It kept failing because mock of RpcCall returned null where return type was 
> Optional.
> Instead, we want it to return Optional.empty(). 
> New mockito versions support this (and other java8 things) - 
> https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2
> We use mockito-all which was last released in Dec2014. However, mockito-core 
> has had more than 50 releases after that 
> (https://mvnrepository.com/artifact/org.mockito/mockito-core). 
> We need to change our deps from mockito-all to mockito-core.
> However that comes with fair breakages, so this is not a simple task of 
> changing pom files.
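
Under mockito-core 2.x the desired default looks like this sketch; RpcCall#getRequestUser is used for illustration:

{code}
import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import java.util.Optional;
import org.junit.Test;

// Sketch: with Mockito 2, an unstubbed method whose return type is Optional
// yields Optional.empty() rather than null. The RpcCall accessor is
// illustrative of the HBASE-18878 test situation described above.
public class MockitoOptionalExample {
  @Test
  public void mockReturnsEmptyOptional() {
    RpcCall call = mock(RpcCall.class);                     // unstubbed mock
    assertEquals(Optional.empty(), call.getRequestUser());  // Mockito 2 default
  }
}
{code}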



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18961) doMiniBatchMutate() is big, split it into smaller methods

2017-10-31 Thread Umesh Agashe (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227749#comment-16227749
 ] 

Umesh Agashe commented on HBASE-18961:
--

Feedback from review session with [~stack]:
# Comment that batchMutate() and batchReplay() share most of the code and are 
related. Same for the classes ReplayBatchOperation and MutationBatchOperation.
# Release note the intent: trying to unify the batchMutate() and mutateRows() 
calls (2 write paths).
# Existing bug: add a TODO for the inconsistency of validating with a timestamp 
before locking but updating with a new timestamp after the lock.
# Add a comment that once a row is locked, the op is considered in-progress, for 
method lockRowsAndBuildMiniBatch().
# Add Javadoc noting: a mini-batch is a window over the batch, and it is 
sequential.
# Rename updateMiniBatchOperations() to prepareMiniBatchOperations().
# The comment on buildWALEdit() in the superclass has replay handling in it even 
though replay has a separate subclass; move it and add more comments.
# Move all replay/not-replay checks out of doMiniBatchMutate().
# [~stack] asked about adding more unit tests, but that can be done in 
HBASE-18962 as it will have new functionality.


> doMiniBatchMutate() is big, split it into smaller methods
> -
>
> Key: HBASE-18961
> URL: https://issues.apache.org/jira/browse/HBASE-18961
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-3
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
> Fix For: 2.0.0-beta-1
>
> Attachments: hbase-18961.master.001.patch, 
> hbase-18961.master.002.patch
>
>
> Split doMiniBatchMutate() and improve readability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227748#comment-16227748
 ] 

Duo Zhang commented on HBASE-19114:
---

I do not think people will like multi modules, for example, we always depend on 
netty-all instead of the individual netty modules right? The issue will also 
affect hbase-client so I’m a little nervous. I haven’t stopped the http module 
separation since it is server only :)

For the long pre-commit, I would say that we have already hit the problem 
several times where an hbase-client change breaks hbase-server, or an 
hbase-server change breaks hbase-spark, so I suggest that yetus should analyze 
the dependencies between modules and run all the UTs for the modules on the 
dependency tree.

> Split out o.a.h.h.zookeeper from hbase-server and hbase-client
> --
>
> Key: HBASE-19114
> URL: https://issues.apache.org/jira/browse/HBASE-19114
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Appy
>Assignee: Appy
> Attachments: HBASE-19114.master.001.patch, 
> HBASE-19114.master.002.patch, HBASE-19114.master.003.patch, 
> HBASE-19114.master.004.patch, HBASE-19114.master.005.patch
>
>
> Changes so far:
> - Moved DrainingServerTracker and RegionServerTracker to 
> hbase-server:o.a.h.h.master.
> - Move Abortable to hbase-common. Since it's IA.Private and independent of 
> anything, moving it to hbase-common which is at bottom of the dependency tree 
> is better.
> - Moved RecoveringRegionWatcher to hbase-server:o.a.h.h.regionserver
> - Moved SplitOrMergeTracker to oahh.master (because it depends on a PB)
> - Moving hbase-client:oahh.zookeeper.*  to hbase-zookeeper module. We want to 
> keep hbase-zookeeper very independent and hence at lowest levels in our 
> dependency tree.
> - ZKUtil is a huge tangle since it's linked to almost everything in 
> \[hbase-client/]oahh.zookeeper. And pulling it down requires some basic proto 
> functions (mergeFrom, PBmagic, etc). So what i did was:
>   ** Pulled down common and basic protobuf functions (which only depend on 
> com.google.protobuf.\*) to hbase-common so other code depending on them can 
> be pulled down if possible/wanted in future. This will help future dependency 
> untangling too. These are ProtobufMagic and ProtobufHelpers.
>   ** Didn't move any hbase-specific PB stuff to hbase-common. We can't pull 
> things into hbase-common which add dependency between it and 
> hbase-protobuf/hbase-shaded-protobuf since we very recently untangled them.
> - DEFAULT_REPLICA_ID is used in many places in ZK. Declared a new constant in 
> HConstants (since it's in hbase-common) and using it in hbase-zookeeper. 
> RegionInfo.DEFAULT_REPLICA_ID also takes its value from it to avoid case 
> where two values can become different.
> - Renamed some classes to use a consistent naming for classes - ZK instead of 
> mix of ZK, Zk , ZooKeeper. Couldn't rename following public classes: 
> MiniZooKeeperCluster, ZooKeeperConnectionException.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18961) doMiniBatchMutate() is big, split it into smaller methods

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227738#comment-16227738
 ] 

stack edited comment on HBASE-18961 at 10/31/17 10:52 PM:
--

[~uagashe] and I just did a review session on the patch up in RB. Helped me 
understand on what is going on. We came up w/ a few things to fix in this 
version but should be good to go then.


was (Author: stack):
Anoop and I just did a review session on the patch up in RB. Helped me 
understand on what is going on. We came up w/ a few things to fix in this 
version but should be good to go then.

> doMiniBatchMutate() is big, split it into smaller methods
> -
>
> Key: HBASE-18961
> URL: https://issues.apache.org/jira/browse/HBASE-18961
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-3
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
> Fix For: 2.0.0-beta-1
>
> Attachments: hbase-18961.master.001.patch, 
> hbase-18961.master.002.patch
>
>
> Split doMiniBatchMutate() and improve readability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227740#comment-16227740
 ] 

Andrew Purtell commented on HBASE-12091:


bq. It seems the entire replication logic needs a review in 1.4 (and probably 
before in 1.3 as well).
We can come up with a test plan for that.

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 1.4.0
>
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.
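
A sketch of the option's effect at the source; the config key, the filtering site, and the {{conf}}/{{entries}}/{{admin}} handles are assumptions for illustration, not the committed patch:

{code}
import java.util.Iterator;
import org.apache.hadoop.hbase.wal.WAL;

// Sketch only: before shipping a batch, skip WAL entries whose table no longer
// exists at the source. Dropping those edits unblocks the replication queue.
if (conf.getBoolean("hbase.replication.drop.on.deleted.table", false)) {
  Iterator<WAL.Entry> it = entries.iterator();
  while (it.hasNext()) {
    WAL.Entry entry = it.next();
    if (!admin.tableExists(entry.getKey().getTablename())) {
      it.remove(); // table was dropped; its edits can never replicate
    }
  }
}
{code}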



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18961) doMiniBatchMutate() is big, split it into smaller methods

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227738#comment-16227738
 ] 

stack commented on HBASE-18961:
---

Anoop and I just did a review session on the patch up in RB. Helped me 
understand on what is going on. We came up w/ a few things to fix in this 
version but should be good to go then.

> doMiniBatchMutate() is big, split it into smaller methods
> -
>
> Key: HBASE-18961
> URL: https://issues.apache.org/jira/browse/HBASE-18961
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-3
>Reporter: Umesh Agashe
>Assignee: Umesh Agashe
> Fix For: 2.0.0-beta-1
>
> Attachments: hbase-18961.master.001.patch, 
> hbase-18961.master.002.patch
>
>
> Split doMiniBatchMutate() and improve readability.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227729#comment-16227729
 ] 

Andrew Purtell edited comment on HBASE-19144 at 10/31/17 10:46 PM:
---

[~stack] reported the same thing with branch-2 on HBASE-17580.
 
Seems the decision we made to make FAILED_OPEN state something an operator has 
to resolve by hand, or by hbck (someday), or by master failover, is a nasty 
operational wrinkle when using RSgroups, because the possibility of all servers 
in a RSGroup going down is higher than for the whole cluster.

I wonder if we can more generally consider kicking FAILED_OPEN state 
assignments whenever a new server shows up, with hysteresis such that if a few 
servers show up in a short period of time we don't pile assignments all on the 
first one to show up. 


was (Author: apurtell):
[~stack] reported the same thing with branch-2 on HBASE-17580.
 
Seems the decision we made to make FAILED_OPEN state something an operator has 
to resolve by hand, or by hbck (someday), or by master failover, is a nasty 
operational wrinkle when using RSgroups, because the possibility of all servers 
in a RSGroup going down is higher than for the whole cluster.

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: 
> Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
> node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>  
> Any region in FAILED_OPEN state has to be manually reassigned, or the master 
> can be restarted and this will also cause reattempt of assignment of any 
> regions in FAILED_OPEN state. This is not unexpected but is an operational 
> headache. It would be better if the RSGroupInfoManager could automatically 
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the 
> cluster. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17580) Tooling to move table out of FAILED_OPEN state

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227730#comment-16227730
 ] 

Andrew Purtell commented on HBASE-17580:


Hey [~stack] I'm looking at handling this automatically on HBASE-19144.

And maybe more generally, we can take a look at kicking regions in FAILED_OPEN 
state whenever a server joins the cluster (with hysteresis so if a few join in 
a short while we don't pile all regions on the first one to show up)? 

> Tooling to move table out of FAILED_OPEN state
> --
>
> Key: HBASE-17580
> URL: https://issues.apache.org/jira/browse/HBASE-17580
> Project: HBase
>  Issue Type: Sub-task
>  Components: Region Assignment
>Affects Versions: 2.0.0
>Reporter: stack
>Priority: Critical
> Fix For: 2.0.0
>
>
> Playing with rsgroup feature, I was able to manufacture a state where 
> assignment landed regions into a FAILED_OPEN state; i.e. assignment could not 
> complete (in this case because no server for the region to land on inside its 
> rsgroup). Once in this FAILED_OPEN state, the only way currently (seemingly) 
> to move on from this state is to manually assign the region. If the table is 
> large, this is burdensome. FAILED_OPEN seems a legitimate endpoint for the AM to 
> dump regions into if it can't do anything with them. We need tooling for a 
> 'fix'. One thought is that assign could take a tableName.
> Disable/Enable does not work.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227729#comment-16227729
 ] 

Andrew Purtell commented on HBASE-19144:


[~stack] reported the same thing with branch-2 on HBASE-17580.
 
Seems the decision we made to make FAILED_OPEN state something an operator has 
to resolve by hand, or by hbck (someday), or by master failover, is a nasty 
operational wrinkle when using RSgroups, because the possibility of all servers 
in a RSGroup going down is higher than for the whole cluster.

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: 
> Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
> node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>  
> Any region in FAILED_OPEN state has to be manually reassigned, or the master 
> can be restarted and this will also cause reattempt of assignment of any 
> regions in FAILED_OPEN state. This is not unexpected but is an operational 
> headache. It would be better if the RSGroupInfoManager could automatically 
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the 
> cluster. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227713#comment-16227713
 ] 

Andrew Purtell edited comment on HBASE-19144 at 10/31/17 10:40 PM:
---

bq. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

I threw together a hack (imagine this done with another registered 
ServerListener) which addresses the problem as reported, in that regions in 
FAILED_OPEN state due to constraint failures when a whole RSGroup is down are 
all reassigned, but am not sure this is the best way:

{code}
diff --git a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
index 80eaefb036..d6a7ec120d 100644
--- a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
+++ b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
@@ -531,6 +535,19 @@ public class RSGroupInfoManagerImpl implements RSGroupInfoManager, ServerListener
       prevDefaultServers = servers;
       LOG.info("Updated with servers: "+servers.size());
     }
+
+    // Kick assignments that may be in FAILED_OPEN state
+    List<HRegionInfo> failedAssignments = Lists.newArrayList();
+    for (RegionState state:
+        mgr.master.getAssignmentManager().getRegionStates().getRegionsInTransition()) {
+      if (state.isFailedOpen()) {
+        failedAssignments.add(state.getRegion());
+      }
+    }
+    for (HRegionInfo region: failedAssignments) {
+      mgr.master.getAssignmentManager().unassign(region);
+    }
+
     try {
       synchronized (this) {
         if(!hasChanged) {
{code}

Testing was with branch-1.4 / branch-1. I also need to check how branch-2 
behaves. 

If doing this for real we should wait a bit in case more servers join, and do 
the work in batch.
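
One way to get that "wait a bit" behavior is a simple debounce around the listener. In the sketch below, everything except AssignmentManager#unassign (used in the hack above) is illustrative: the executor, the 30s window, and the method names are assumptions.

{code}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Sketch of hysteresis: restart a short window on every server join, and only
// kick FAILED_OPEN regions once the window expires, so a burst of joins
// results in one batched reassignment rather than piling onto the first server.
private final ScheduledExecutorService scheduler =
    Executors.newSingleThreadScheduledExecutor();
private ScheduledFuture<?> pendingKick;

synchronized void serverJoined() {
  if (pendingKick != null) {
    pendingKick.cancel(false); // more servers arriving: restart the window
  }
  pendingKick = scheduler.schedule(this::kickFailedOpenRegions, 30, TimeUnit.SECONDS);
}

void kickFailedOpenRegions() {
  // same loop as the diff above: collect FAILED_OPEN regions, then unassign
}
{code}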


was (Author: apurtell):
bq. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

I threw together a hack (imagine this done with another registered 
ServerListener) which addresses the problem as reported, in that regions in 
FAILED_STATE due to constraint failures when a whole RSGroup is down are all 
reassigned, but am not sure this is the best way:

{code}
diff --git a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
index 80eaefb036..d6a7ec120d 100644
--- a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
+++ b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
@@ -531,6 +535,19 @@ public class RSGroupInfoManagerImpl implements RSGroupInfoManager, ServerListener
       prevDefaultServers = servers;
       LOG.info("Updated with servers: "+servers.size());
     }
+
+    // Kick assignments that may be in FAILED_OPEN state
+    List<HRegionInfo> failedAssignments = Lists.newArrayList();
+    for (RegionState state:
+        mgr.master.getAssignmentManager().getRegionStates().getRegionsInTransition()) {
+      if (state.isFailedOpen()) {
+        failedAssignments.add(state.getRegion());
+      }
+    }
+    for (HRegionInfo region: failedAssignments) {
+      mgr.master.getAssignmentManager().unassign(region);
+    }
+
     try {
       synchronized (this) {
         if(!hasChanged) {
{code}

Testing was with branch-1.4 / branch-1. I also need to check how branch-2 
behaves. 

If doing this for real we should wait a bit in case more servers join, and do 
the work in batch.

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> 

[jira] [Commented] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227722#comment-16227722
 ] 

Andrew Purtell commented on HBASE-19144:


Thanks [~abhishek.chouhan] for identifying the problem. 

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: 
> Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
> node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>  
> Any region in FAILED_OPEN state has to be manually reassigned, or the master 
> can be restarted and this will also cause reattempt of assignment of any 
> regions in FAILED_OPEN state. This is not unexpected but is an operational 
> headache. It would be better if the RSGroupInfoManager could automatically 
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the 
> cluster. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227713#comment-16227713
 ] 

Andrew Purtell edited comment on HBASE-19144 at 10/31/17 10:36 PM:
---

bq. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

I threw together a hack (imagine this done with another registered 
ServerListener) which addresses the problem as reported, in that regions in 
FAILED_STATE due to constraint failures when a whole RSGroup is down are all 
reassigned, but am not sure this is the best way:

{code}
diff --git a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
index 80eaefb036..d6a7ec120d 100644
--- a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
+++ b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
@@ -531,6 +535,19 @@ public class RSGroupInfoManagerImpl implements RSGroupInfoManager, ServerListener
       prevDefaultServers = servers;
       LOG.info("Updated with servers: "+servers.size());
     }
+
+    // Kick assignments that may be in FAILED_OPEN state
+    List<HRegionInfo> failedAssignments = Lists.newArrayList();
+    for (RegionState state:
+        mgr.master.getAssignmentManager().getRegionStates().getRegionsInTransition()) {
+      if (state.isFailedOpen()) {
+        failedAssignments.add(state.getRegion());
+      }
+    }
+    for (HRegionInfo region: failedAssignments) {
+      mgr.master.getAssignmentManager().unassign(region);
+    }
+
     try {
       synchronized (this) {
         if(!hasChanged) {
{code}

Testing was with branch-1.4 / branch-1. I also need to check how branch-2 
behaves. 

If doing this for real we should wait a bit in case more servers join, and do 
the work in batch.


was (Author: apurtell):
bq. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

I threw together a hack (imagine this done with another registered 
ServerListener) which addresses the problem as reported, in that regions in 
FAILED_STATE due to constraint failures when a whole RSGroup is down are all 
reassigned, but am not sure this is the best way:

{code}
diff --git a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
index 80eaefb036..d6a7ec120d 100644
--- a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
+++ b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
@@ -531,6 +535,19 @@ public class RSGroupInfoManagerImpl implements RSGroupInfoManager, ServerListener
       prevDefaultServers = servers;
       LOG.info("Updated with servers: "+servers.size());
     }
+
+    // Kick assignments that may be in FAILED_OPEN state
+    List<HRegionInfo> failedAssignments = Lists.newArrayList();
+    for (RegionState state:
+        mgr.master.getAssignmentManager().getRegionStates().getRegionsInTransition()) {
+      if (state.isFailedOpen()) {
+        failedAssignments.add(state.getRegion());
+      }
+    }
+    for (HRegionInfo region: failedAssignments) {
+      mgr.master.getAssignmentManager().unassign(region);
+    }
+
     try {
       synchronized (this) {
         if(!hasChanged) {
{code}


Testing was with branch-1.4 / branch-1. I also need to check how branch-2 
behaves. 

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] 

[jira] [Commented] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227713#comment-16227713
 ] 

Andrew Purtell commented on HBASE-19144:


bq. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

I threw together a hack (imagine this done with another registered 
ServerListener) which addresses the problem as reported, in that regions in 
FAILED_STATE due to constraint failures when a whole RSGroup is down are all 
reassigned, but am not sure this is the best way:

{code}
diff --git a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
index 80eaefb036..d6a7ec120d 100644
--- a/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
+++ b/hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java
@@ -531,6 +535,19 @@ public class RSGroupInfoManagerImpl implements RSGroupInfoManager, ServerListener
       prevDefaultServers = servers;
       LOG.info("Updated with servers: "+servers.size());
     }
+
+    // Kick assignments that may be in FAILED_OPEN state
+    List<HRegionInfo> failedAssignments = Lists.newArrayList();
+    for (RegionState state:
+        mgr.master.getAssignmentManager().getRegionStates().getRegionsInTransition()) {
+      if (state.isFailedOpen()) {
+        failedAssignments.add(state.getRegion());
+      }
+    }
+    for (HRegionInfo region: failedAssignments) {
+      mgr.master.getAssignmentManager().unassign(region);
+    }
+
     try {
       synchronized (this) {
         if(!hasChanged) {
{code}


Testing was with branch-1.4 / branch-1. I also need to check how branch-2 
behaves. 

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: 
> Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
> node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>  
> Any region in FAILED_OPEN state has to be manually reassigned, or the master 
> can be restarted and this will also cause reattempt of assignment of any 
> regions in FAILED_OPEN state. This is not unexpected but is an operational 
> headache. It would be better if the RSGroupInfoManager could automatically 
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the 
> cluster. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227711#comment-16227711
 ] 

Lars Hofhansl commented on HBASE-12091:
---

The problem seems to be that ProtobufUtil.replicateWALEntry already unwraps any 
remote exception... But in HBaseInterClusterReplicationEndpoint.replicate we 
want to distinguish between remote and local exceptions.
So now we lost the ability to filter for specific remote exceptions. Different 
jira perhaps.

For this one it's annoying that the sink does not even throw a 
TableNotFoundException anymore. It seems the entire replication logic needs a 
review in 1.4 (and probably before in 1.3 as well).
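
For reference, the kind of remote-vs-local distinction being lost; RemoteException#unwrapRemoteException is existing Hadoop IPC API, while the surrounding catch site is illustrative:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.TableNotFoundException;
import org.apache.hadoop.ipc.RemoteException;

// Sketch: a TableNotFoundException thrown on the sink arrives wrapped in a
// Hadoop RemoteException; if a lower layer already unwrapped it, the caller
// can no longer tell remote from local. The try body is illustrative.
try {
  replicateEntries(entries);
} catch (RemoteException re) {
  IOException unwrapped = re.unwrapRemoteException(TableNotFoundException.class);
  if (unwrapped instanceof TableNotFoundException) {
    // table dropped on the sink: safe to skip these edits
  }
} catch (IOException ioe) {
  // genuinely local or network error
}
{code}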

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 1.4.0
>
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-19144:
---
Description: 
After all servers in the RSgroup are down the regions cannot be opened anywhere 
and transition rapidly into FAILED_OPEN state.
 
2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
server=node-5.cluster,16020,1509482700768} to {c6c8150c9f4b8df25ba32073f25a5143 
state=FAILED_OPEN, ts=1509483985449, server=node-5.cluster,16020,1509482700768}
2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: Failed 
to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
node-5.cluster,16020,1509482700768, set to FAILED_OPEN
 
Any region in FAILED_OPEN state has to be manually reassigned, or the master 
can be restarted and this will also cause reattempt of assignment of any 
regions in FAILED_OPEN state. This is not unexpected but is an operational 
headache. It would be better if the RSGroupInfoManager could automatically kick 
reassignments of regions in FAILED_OPEN state when servers rejoin the cluster. 

> [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all 
> servers in the group are unavailable
> --
>
> Key: HBASE-19144
> URL: https://issues.apache.org/jira/browse/HBASE-19144
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
>Reporter: Andrew Purtell
>
> After all servers in the RSgroup are down the regions cannot be opened 
> anywhere and transition rapidly into FAILED_OPEN state.
>  
> 2017-10-31 21:06:25,449 INFO [ProcedureExecutor-13] master.RegionStates: 
> Transition {c6c8150c9f4b8df25ba32073f25a5143 state=OFFLINE, ts=1509483985448, 
> server=node-5.cluster,16020,1509482700768} to 
> {c6c8150c9f4b8df25ba32073f25a5143 state=FAILED_OPEN, ts=1509483985449, 
> server=node-5.cluster,16020,1509482700768}
> 2017-10-31 21:06:25,449 WARN [ProcedureExecutor-13] master.RegionStates: 
> Failed to open/close d4e2f173e31ffad6aac140f4bd7b02bc on 
> node-5.cluster,16020,1509482700768, set to FAILED_OPEN
>  
> Any region in FAILED_OPEN state has to be manually reassigned, or the master 
> can be restarted and this will also cause reattempt of assignment of any 
> regions in FAILED_OPEN state. This is not unexpected but is an operational 
> headache. It would be better if the RSGroupInfoManager could automatically 
> kick reassignments of regions in FAILED_OPEN state when servers rejoin the 
> cluster. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19144) [RSgroups] Regions assigned to a RSGroup all go to FAILED_OPEN state when all servers in the group are unavailable

2017-10-31 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-19144:
--

 Summary: [RSgroups] Regions assigned to a RSGroup all go to 
FAILED_OPEN state when all servers in the group are unavailable
 Key: HBASE-19144
 URL: https://issues.apache.org/jira/browse/HBASE-19144
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0, 3.0.0, 1.4.0, 1.5.0
Reporter: Andrew Purtell






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227706#comment-16227706
 ] 

Chia-Ping Tsai commented on HBASE-19141:


bq. Please check TestClientClusterStatus
My bad :(

>  [compat 1-2] getClusterStatus always return empty ClusterStatus
> 
>
> Key: HBASE-19141
> URL: https://issues.apache.org/jira/browse/HBASE-19141
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19141.v0.patch
>
>
> We are able to limit the scope of a request to get only part of 
> {{ClusterStatus}} in 2.0. However, the request sent by a 1.x client carries no 
> scope info, so the server retrieves nothing from {{ClusterStatus}}.
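
Server-side, the fix presumably amounts to a fallback when no scope arrives. A sketch, where ClusterStatus.Option is 2.0's enum and the PB accessors and toOptions() helper are assumed names:

{code}
import java.util.EnumSet;
import org.apache.hadoop.hbase.ClusterStatus;

// Sketch of the compatibility fallback: a GetClusterStatusRequest carrying no
// options (as a 1.x client would send) is treated as a request for the full
// status. getOptionsCount()/getOptionsList()/toOptions() are assumed names.
EnumSet<ClusterStatus.Option> options = request.getOptionsCount() == 0
    ? EnumSet.allOf(ClusterStatus.Option.class)  // 1.x client: return everything
    : toOptions(request.getOptionsList());       // 2.x client: honor the scope
{code}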



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227495#comment-16227495
 ] 

Ted Yu edited comment on HBASE-19141 at 10/31/17 10:21 PM:
---

Please check TestClientClusterStatus


was (Author: yuzhih...@gmail.com):
lgtm

>  [compat 1-2] getClusterStatus always return empty ClusterStatus
> 
>
> Key: HBASE-19141
> URL: https://issues.apache.org/jira/browse/HBASE-19141
> Project: HBase
>  Issue Type: Sub-task
>  Components: API
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-19141.v0.patch
>
>
> We are able to limit the scope of a request to get only part of 
> {{ClusterStatus}} in 2.0. However, the request sent by a 1.x client carries no 
> scope info, so the server retrieves nothing from {{ClusterStatus}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12091:
--
Fix Version/s: 1.4.0

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 1.4.0
>
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19141) [compat 1-2] getClusterStatus always return empty ClusterStatus

2017-10-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227699#comment-16227699
 ] 

Hadoop QA commented on HBASE-19141:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
46s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
5s{color} | {color:red} hbase-server: The patch generated 1 new + 182 unchanged 
- 1 fixed = 183 total (was 183) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
41s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
48m  8s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 43s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
10s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 14s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestClientClusterStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19141 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12895067/HBASE-19141.v0.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux d6e26c7a2b66 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 9382f391c8 |
| Default Java | 1.8.0_141 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9544/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9544/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9544/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9544/console |
| 

[jira] [Commented] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-10-31 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227696#comment-16227696
 ] 

Ted Yu commented on HBASE-12125:


Can you put the patch on review board ?

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
> Attachments: HBASE-12125.v1.master.patch
>
>
> The replication source will discard the WAL file in many cases when it 
> encounters an exception reading it. This can cause data loss,
> and the underlying reason for the failed read remains hidden. Only in certain 
> scenarios should the replication source dump the current WAL and move to the 
> next one. 
> This JIRA aims to add an hbck option to check the WAL files of replication 
> queues for any inconsistencies and also provide an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from the 
> memory of the replication source manager and the replication sources. 
> A region server endpoint call from the hbck client to the region server can be 
> used to achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL 
> currently read by the replication source) from the replication queue. If there 
> is a position associated, it also seeks to that position and reads an entry 
> from there.
> -hardCheckReplicationWAL: Checks all WAL paths from replication queues by 
> reading them completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes the WALs from replication queues which are 
> not present on hdfs.
> -fixCorruptedReplicationWAL: Removes the WALs from replication queues which 
> are corrupted (based on the findings from softCheck/hardCheck). The 
> WALs are also moved to a quarantine dir.
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is 
> first rolled over and then handled in the same way as by the 
> -fixCorruptedReplicationWAL option.
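
For illustration, a minimal sketch of what the -fixMissingReplicationWAL check could look like; the class and method names below are hypothetical, only the hdfs-existence check itself is the point:

{code}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationWalCheckSketch {
  /**
   * Hypothetical sketch: report WALs still referenced by a replication queue
   * but no longer present on hdfs. The caller would feed in the paths read
   * from the replication znodes.
   */
  static void checkMissing(Configuration conf, List<Path> queuedWals) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    for (Path wal : queuedWals) {
      if (!fs.exists(wal)) {
        // A -fixMissingReplicationWAL pass would then remove this entry from
        // the queue in zk and from the replication source manager's memory.
        System.out.println("Missing from hdfs: " + wal);
      }
    }
  }
}
{code}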



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227694#comment-16227694
 ] 

Andrew Purtell commented on HBASE-12091:


bq. Might be something to check for 1.4. Specifically it seems I cannot produce 
a test that shows that edits are ordered correctly in the face of dropped 
tables.

You'll have to help with that by providing more detail. 

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.
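
A minimal sketch of the proposed behavior, assuming a boolean option gates the skip; the types below are illustrative stand-ins, not HBase's actual replication classes:

{code}
import java.util.List;
import java.util.Set;

public class SkipDroppedTablesSketch {
  // Illustrative stand-in for a queued WAL edit.
  static class Edit {
    final String table;
    Edit(String table) { this.table = table; }
  }

  /**
   * When the option is enabled, edits for tables that no longer exist are
   * skipped instead of backing up the whole queue behind unreplicatable edits.
   */
  static void ship(List<Edit> edits, Set<String> existingTables, boolean ignoreDroppedTables) {
    for (Edit edit : edits) {
      if (ignoreDroppedTables && !existingTables.contains(edit.table)) {
        continue; // table was dropped at the source, so drop its edits too
      }
      // replicate(edit); // actual shipping elided in this sketch
    }
  }
}
{code}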



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227463#comment-16227463
 ] 

Lars Hofhansl edited comment on HBASE-12091 at 10/31/17 10:13 PM:
--

Hmm... When trying a test I see the following, which incorrectly marks this as 
a local error.
{code}
2017-10-31 12:56:46,876 WARN  
[RS_OPEN_REGION-localhost:37372-0.replicationSource.localhost%2C37372%2C1509479790408,2]
 regionserver.HBaseInterClusterRe
plicationEndpoint(349): Can't replicate because of a local or network error: 
org.apache.hadoop.hbase.DoNotRetryIOException: Unable to instantiate exception 
received from server:org.apache.hadoop.hbase.client.RetriesExhaustedWith
DetailsException.(java.lang.String)
at 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException.unwrapRemoteException(RemoteWithExtrasException.java:87)
at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.makeIOExceptionOfException(ProtobufUtil.java:353)
at 
org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:330)
at 
org.apache.hadoop.hbase.protobuf.ReplicationProtbufUtil.replicateWALEntry(ReplicationProtbufUtil.java:74)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.replicateEntries(HBaseInterClusterReplicati
onEndpoint.java:426)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.j
ava:445)
at 
org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint$Replicator.call(HBaseInterClusterReplicationEndpoint.j
ava:403)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
at java.lang.Thread.run(Thread.java:748)
Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException):
 org.apache.hadoo
p.hbase.client.RetriesExhaustedWithDetailsException: Failed 1 action: Table 
'test_dropped' was not found, got: test.: 1 time, servers with issues: null
at 
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:295)
at 
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$2300(AsyncProcess.java:271)
at 
org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.getErrors(AsyncProcess.java:1779)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:925)
at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:939)
at 
org.apache.hadoop.hbase.replication.regionserver.ReplicationSink.batch(ReplicationSink.java:376)
{code}
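
The "Unable to instantiate exception received from server: ...RetriesExhaustedWithDetailsException.(java.lang.String)" line suggests the client tries to rebuild the remote exception reflectively through a (String) constructor and, when that constructor does not exist, wraps the failure in a DoNotRetryIOException that then gets classified as local. A rough sketch of that pattern (illustrative, not the exact code in RemoteWithExtrasException#unwrapRemoteException):

{code}
import java.io.IOException;

public class UnwrapSketch {
  // Illustrative sketch of the reflective unwrap pattern implied by the trace.
  static IOException instantiateRemote(String className, String message) {
    try {
      Class<?> clazz = Class.forName(className);
      // RetriesExhaustedWithDetailsException has no (String) constructor,
      // so this lookup throws and the error is misreported as local.
      return (IOException) clazz.getConstructor(String.class).newInstance(message);
    } catch (ReflectiveOperationException e) {
      return new IOException(
          "Unable to instantiate exception received from server: " + className, e);
    }
  }
}
{code}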


[jira] [Commented] (HBASE-19114) Split out o.a.h.h.zookeeper from hbase-server and hbase-client

2017-10-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227693#comment-16227693
 ] 

Hadoop QA commented on HBASE-19114:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 91 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  7m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
47s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  6m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
27s{color} | {color:red} hbase-common: The patch generated 10 new + 4 unchanged 
- 0 fixed = 14 total (was 4) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hbase-zookeeper: The patch generated 204 new + 0 
unchanged - 0 fixed = 204 total (was 0) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  2m 
42s{color} | {color:red} root: The patch generated 378 new + 2518 unchanged - 
286 fixed = 2896 total (was 2804) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-client-project passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-shaded-client-project passed 
checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch hbase-assembly passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
33s{color} | {color:red} hbase-client: The patch generated 15 new + 758 
unchanged - 173 fixed = 773 total (was 931) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} hbase-endpoint: The patch generated 0 new + 2 
unchanged - 1 fixed = 2 total (was 3) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} The patch hbase-examples passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} The patch hbase-it passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} The patch hbase-mapreduce passed checkstyle {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} hbase-replication: The patch generated 12 new + 30 
unchanged - 5 fixed = 42 total (was 35) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} hbase-rsgroup: The patch generated 4 new + 8 unchanged 

[jira] [Comment Edited] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227684#comment-16227684
 ] 

Lars Hofhansl edited comment on HBASE-12091 at 10/31/17 10:06 PM:
--

[~apurtell], FYI. Might be something to check for 1.4. Specifically it seems I 
cannot produce a test that shows that edits are ordered correctly in the face 
of dropped tables.


was (Author: lhofhansl):
[~apurtell], FYI. Might be something to check to 1.4. Specifically it seems I 
cannot produce a test that shows that edits are ordered correctly in the face 
of dropped tables.

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-12091) Optionally ignore edits for dropped tables for replication.

2017-10-31 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227684#comment-16227684
 ] 

Lars Hofhansl commented on HBASE-12091:
---

[~apurtell], FYI. Might be something to check to 1.4. Specifically it seems I 
cannot produce a test that shows that edits are ordered correctly in the face 
of dropped tables.

> Optionally ignore edits for dropped tables for replication.
> ---
>
> Key: HBASE-12091
> URL: https://issues.apache.org/jira/browse/HBASE-12091
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 12091.txt
>
>
> We just ran into a scenario where we dropped a table from both the source and 
> the sink, but the source still has outstanding edits that now it could not 
> get rid of. Now all replication is backed up behind these unreplicatable 
> edits.
> We should have an option to ignore edits for tables dropped at the source.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18925) Need updated mockito for using java optional

2017-10-31 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227683#comment-16227683
 ] 

Appy commented on HBASE-18925:
--

Ready for review [~mdrob] and [~stack].
Uploaded a new patch fixing the checkstyle issues.

> Need updated mockito for using java optional
> 
>
> Key: HBASE-18925
> URL: https://issues.apache.org/jira/browse/HBASE-18925
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18925.master.001.patch, 
> HBASE-18925.master.002.patch, HBASE-18925.master.002.patch, 
> HBASE-18925.master.003.patch, HBASE-18925.master.004.patch, 
> HBASE-18925.master.005.patch, HBASE-18925.master.006.patch, 
> HBASE-18925.master.007.patch, HBASE-18925.master.008.patch, 
> HBASE-18925.master.009.patch
>
>
> Came up when I was trying to test HBASE-18878.
> It kept failing because the mock of RpcCall returned null where the return 
> type was Optional.
> Instead, we want it to return Optional.empty(). 
> Newer mockito versions support this (and other java8 things) - 
> https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2
> We use mockito-all, which was last released in Dec 2014. However, mockito-core 
> has had more than 50 releases after that 
> (https://mvnrepository.com/artifact/org.mockito/mockito-core). 
> We need to change our deps from mockito-all to mockito-core.
> However, that comes with a fair amount of breakage, so this is not a simple 
> task of just changing pom files.
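
A minimal sketch of the difference, assuming mockito-core 2.x on the classpath; the RpcCall shape here is a simplified stand-in:

{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.util.Optional;

public class OptionalMockSketch {
  interface RpcCall {
    Optional<String> getRequestUser();
  }

  public static void main(String[] args) {
    RpcCall call = mock(RpcCall.class);
    // mockito-all 1.x: an unstubbed call returns null and callers NPE.
    // mockito-core 2.x: an unstubbed call returns Optional.empty() by default.
    System.out.println(call.getRequestUser());
    // Explicit stubbing works the same either way:
    when(call.getRequestUser()).thenReturn(Optional.of("alice"));
    System.out.println(call.getRequestUser().get());
  }
}
{code}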



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19138) Rare failure in TestLruBlockCache

2017-10-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227680#comment-16227680
 ] 

Hadoop QA commented on HBASE-19138:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
16s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
55s{color} | {color:red} hbase-server: The patch generated 1 new + 3 unchanged 
- 1 fixed = 4 total (was 4) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
19s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
44m 44s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 97m 
24s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
14s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}162m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:eee3b01 |
| JIRA Issue | HBASE-19138 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12895037/HBASE-19138.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 6d2fad69056f 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 63ad16af0c |
| Default Java | 1.8.0_141 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9539/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9539/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/9539/console |
| Powered by | Apache Yetus 0.5.0   http://yetus.apache.org |


This message was automatically generated.



> Rare failure in TestLruBlockCache
> -
>
> Key: HBASE-19138

[jira] [Updated] (HBASE-18925) Need updated mockito for using java optional

2017-10-31 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-18925:
-
Attachment: HBASE-18925.master.009.patch

> Need updated mockito for using java optional
> 
>
> Key: HBASE-18925
> URL: https://issues.apache.org/jira/browse/HBASE-18925
> Project: HBase
>  Issue Type: Improvement
>Reporter: Appy
>Assignee: Appy
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18925.master.001.patch, 
> HBASE-18925.master.002.patch, HBASE-18925.master.002.patch, 
> HBASE-18925.master.003.patch, HBASE-18925.master.004.patch, 
> HBASE-18925.master.005.patch, HBASE-18925.master.006.patch, 
> HBASE-18925.master.007.patch, HBASE-18925.master.008.patch, 
> HBASE-18925.master.009.patch
>
>
> Came up when I was trying to test HBASE-18878.
> It kept failing because the mock of RpcCall returned null where the return 
> type was Optional.
> Instead, we want it to return Optional.empty(). 
> Newer mockito versions support this (and other java8 things) - 
> https://github.com/mockito/mockito/wiki/What%27s-new-in-Mockito-2
> We use mockito-all, which was last released in Dec 2014. However, mockito-core 
> has had more than 50 releases after that 
> (https://mvnrepository.com/artifact/org.mockito/mockito-core). 
> We need to change our deps from mockito-all to mockito-core.
> However, that comes with a fair amount of breakage, so this is not a simple 
> task of just changing pom files.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19121) HBCK for AMv2

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227656#comment-16227656
 ] 

Mike Drob commented on HBASE-19121:
---

or of HBASE-14620

> HBCK for AMv2
> -
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Sub-task
>  Components: hbck
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19121) HBCK for AMv2

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227654#comment-16227654
 ] 

Mike Drob commented on HBASE-19121:
---

dup of HBASE-18792?

> HBCK for AMv2
> -
>
> Key: HBASE-19121
> URL: https://issues.apache.org/jira/browse/HBASE-19121
> Project: HBase
>  Issue Type: Sub-task
>  Components: hbck
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
>
> We don't have an hbck for the new AM. Old hbck may actually do damage going 
> against AMv2.
> Fix.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-10-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated HBASE-12125:
-
Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
> Attachments: HBASE-12125.v1.master.patch
>
>
> The replication source will discard the WAL file in many cases when it 
> encounters an exception reading it. This can cause data loss,
> and the underlying reason for the failed read remains hidden. Only in certain 
> scenarios should the replication source dump the current WAL and move to the 
> next one. 
> This JIRA aims to add an hbck option to check the WAL files of replication 
> queues for any inconsistencies and also provide an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from the 
> memory of the replication source manager and the replication sources. 
> A region server endpoint call from the hbck client to the region server can be 
> used to achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL 
> currently read by the replication source) from the replication queue. If there 
> is a position associated, it also seeks to that position and reads an entry 
> from there.
> -hardCheckReplicationWAL: Checks all WAL paths from replication queues by 
> reading them completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes the WALs from replication queues which are 
> not present on hdfs.
> -fixCorruptedReplicationWAL: Removes the WALs from replication queues which 
> are corrupted (based on the findings from softCheck/hardCheck). The 
> WALs are also moved to a quarantine dir.
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is 
> first rolled over and then handled in the same way as by the 
> -fixCorruptedReplicationWAL option.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-10-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated HBASE-12125:
-
Attachment: HBASE-12125.v1.master.patch

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 3.0.0
>Reporter: Virag Kothari
>Assignee: Vincent Poon
> Attachments: HBASE-12125.v1.master.patch
>
>
> The replication source will discard the WAL file in many cases when it 
> encounters an exception reading it. This can cause data loss,
> and the underlying reason for the failed read remains hidden. Only in certain 
> scenarios should the replication source dump the current WAL and move to the 
> next one. 
> This JIRA aims to add an hbck option to check the WAL files of replication 
> queues for any inconsistencies and also provide an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from the 
> memory of the replication source manager and the replication sources. 
> A region server endpoint call from the hbck client to the region server can be 
> used to achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL 
> currently read by the replication source) from the replication queue. If there 
> is a position associated, it also seeks to that position and reads an entry 
> from there.
> -hardCheckReplicationWAL: Checks all WAL paths from replication queues by 
> reading them completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes the WALs from replication queues which are 
> not present on hdfs.
> -fixCorruptedReplicationWAL: Removes the WALs from replication queues which 
> are corrupted (based on the findings from softCheck/hardCheck). The 
> WALs are also moved to a quarantine dir.
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is 
> first rolled over and then handled in the same way as by the 
> -fixCorruptedReplicationWAL option.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19128) Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken.

2017-10-31 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227643#comment-16227643
 ] 

Appy commented on HBASE-19128:
--

If we are deleting the code, that means we'll have to call out the following 
prerequisites for cluster upgrade, right?
- Disable distributed log replay before upgrading
- Make sure that all recoveries which started while DLR was turned on are 
complete. (Else the cluster might end up in a bad state, or even lose data!)

Does it make sense [~busbey], [~stack]?

> Purge Distributed Log Replay from codebase, configurations, text; mark the 
> feature as unsupported, broken.
> --
>
> Key: HBASE-19128
> URL: https://issues.apache.org/jira/browse/HBASE-19128
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: stack
>  Labels: incompatible
> Fix For: 2.0.0
>
>
> Kill it. It keeps coming up and over again. Needs proper burial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19143) Add has(Option) to ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai updated HBASE-19143:
---
Fix Version/s: 2.0.0-beta-1

> Add has(Option) to ClusterStatus
> 
>
> Key: HBASE-19143
> URL: https://issues.apache.org/jira/browse/HBASE-19143
> Project: HBase
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
>
> It helps users distinguish between "there is nothing" and "you did not ask".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-12125) Add Hbck option to check and fix WAL's from replication queue

2017-10-31 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon reassigned HBASE-12125:


Assignee: Vincent Poon  (was: Virag Kothari)

> Add Hbck option to check and fix WAL's from replication queue
> -
>
> Key: HBASE-12125
> URL: https://issues.apache.org/jira/browse/HBASE-12125
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Virag Kothari
>Assignee: Vincent Poon
>
> The replication source will discard the WAL file in many cases when it 
> encounters an exception reading it. This can cause data loss,
> and the underlying reason for the failed read remains hidden. Only in certain 
> scenarios should the replication source dump the current WAL and move to the 
> next one. 
> This JIRA aims to add an hbck option to check the WAL files of replication 
> queues for any inconsistencies and also provide an option to fix them.
> The fix can be to remove the file from the replication queue in zk and from the 
> memory of the replication source manager and the replication sources. 
> A region server endpoint call from the hbck client to the region server can be 
> used to achieve this.
> Hbck can be configured with the following options:
> -softCheckReplicationWAL: Tries to open only the oldest WAL (the WAL 
> currently read by the replication source) from the replication queue. If there 
> is a position associated, it also seeks to that position and reads an entry 
> from there.
> -hardCheckReplicationWAL: Checks all WAL paths from replication queues by 
> reading them completely to make sure they are ok.
> -fixMissingReplicationWAL: Removes the WALs from replication queues which are 
> not present on hdfs.
> -fixCorruptedReplicationWAL: Removes the WALs from replication queues which 
> are corrupted (based on the findings from softCheck/hardCheck). The 
> WALs are also moved to a quarantine dir.
> -rollAndFixCorruptedReplicationWAL: If the current WAL is corrupted, it is 
> first rolled over and then handled in the same way as by the 
> -fixCorruptedReplicationWAL option.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19143) Add has(Option) to ClusterStatus

2017-10-31 Thread Chia-Ping Tsai (JIRA)
Chia-Ping Tsai created HBASE-19143:
--

 Summary: Add has(Option) to ClusterStatus
 Key: HBASE-19143
 URL: https://issues.apache.org/jira/browse/HBASE-19143
 Project: HBase
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


It helps users distinguish between "there is nothing" and "you did not ask".
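
For illustration, a sketch of the semantics has(Option) could give, assuming ClusterStatus keeps per-Option values in a map; all names below are simplified stand-ins, not the actual HBase classes:

{code}
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class HasOptionSketch {
  enum Option { LIVE_SERVERS, DEAD_SERVERS, BALANCER_ON }

  // Simplified stand-in for ClusterStatus.
  static class ClusterStatus {
    private final Map<Option, Object> values = new EnumMap<>(Option.class);

    void set(Option opt, Object value) { values.put(opt, value); }

    /** True only if this Option was actually requested and populated. */
    boolean has(Option opt) { return values.containsKey(opt); }

    @SuppressWarnings("unchecked")
    List<String> getDeadServerNames() {
      return (List<String>) values.getOrDefault(Option.DEAD_SERVERS, Collections.emptyList());
    }
  }
}
{code}

An empty dead-server list then means "no dead servers" when has(DEAD_SERVERS) is true, and "you did not ask" when it is false.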



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19126) [AMv2] RegionStates/RegionStateNode needs cleanup

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227636#comment-16227636
 ] 

Mike Drob commented on HBASE-19126:
---

[~easyliangjob] - do you want to work on this one?

> [AMv2] RegionStates/RegionStateNode needs cleanup
> -
>
> Key: HBASE-19126
> URL: https://issues.apache.org/jira/browse/HBASE-19126
> Project: HBase
>  Issue Type: Improvement
>Reporter: Yi Liang
> Fix For: 2.0.0-beta-1
>
>
>   // Mutable/Immutable? Changes have to be synchronized or not?
>   // Data members are volatile which seems to say multi-threaded access is 
> fine.
>   // In the below we do check and set but the check state could change before
>   // we do the set because no synchronization... which seems dodgy. Clear up
>   // understanding here... how many threads accessing? Do locks make it so one
>   // thread at a time working on a single Region's RegionStateNode? Lets 
> presume
>   // so for now. Odd is that elsewhere in this RegionStates, we synchronize on
>   // the RegionStateNode instance
> Copied from TODO in RegionState.java
> Open this jira to track some cleanups for RegionStates/RegionStateNode
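
The race the TODO worries about is the classic check-then-act on volatile state; a minimal sketch with simplified types:

{code}
public class RegionStateNodeSketch {
  enum State { OFFLINE, OPENING, OPEN }

  static class Node {
    volatile State state = State.OFFLINE; // volatile gives visibility, not atomicity across check+set
  }

  // Racy: another thread can change state between the check and the set.
  static void racyTransition(Node node) {
    if (node.state == State.OFFLINE) {
      node.state = State.OPENING;
    }
  }

  // What the rest of RegionStates does elsewhere: serialize on the node instance.
  static void safeTransition(Node node) {
    synchronized (node) {
      if (node.state == State.OFFLINE) {
        node.state = State.OPENING;
      }
    }
  }
}
{code}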



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-19122) preCompact and preFlush can bypass by returning null scanner; shut it down

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-19122:
-

Assignee: stack

> preCompact and preFlush can bypass by returning null scanner; shut it down
> --
>
> Key: HBASE-19122
> URL: https://issues.apache.org/jira/browse/HBASE-19122
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors, Scanners
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
>
> Noticed by [~anoop.hbase] during review of HBASE-18770, preCompact and 
> preFlush can bypass normal processing by returning null. They are not 
> bypassable by the ordained route. We should shut down this avenue.
> The preCompact at least may be new coming in with:
> {code}
> tree dbf13093842f85a713f023d7219caccf8f4eb05f
> parent a4dcf51415616772e462091ce93622f070ea8810
> author zhangduo  Sat Apr 9 16:18:08 2016 +0800
> committer zhangduo  Sun Apr 10 09:26:28 2016 +0800
> HBASE-15527 Refactor Compactor related classes
> {code}
> Would have to dig in more to figure for sure.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-19116) Currently the tail of hfiles with CellComparator* classname makes it so hbase1 can't open hbase2 written hfiles; fix

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-19116:
-

Assignee: stack

> Currently the tail of hfiles with CellComparator* classname makes it so 
> hbase1 can't open hbase2 written hfiles; fix
> 
>
> Key: HBASE-19116
> URL: https://issues.apache.org/jira/browse/HBASE-19116
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, migration
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
>
> See tail of HBASE-19052 for discussion which concludes we should try and make 
> it so operators do not have to go to latest hbase version before they 
> upgrade, at least if we can avoid it.
> The necessary change of our default comparator from KV to Cell naming has 
> hfiles with tails that have the classname CellComparator in them in place of 
> KeyValueComparator. If an hbase1 tries to open them, it will fail not having 
> a CellComparator in its classpath (We have name of comparator in tail because 
> different files require different comparators... perhaps we write an alias 
> instead of a class one day... TODO). HBASE-16189 and HBASE-19052 are about 
> trying to carry knowledge of hbase2 back to hbase1, a brittle approach making 
> it so operators will have to upgrade to the latest branch-1 before they can 
> go to hbase2.
> This issue is about undoing our writing of an incompatible (to hbase1) tail, 
> not unless we really have to (and it sounds like we could do without writing 
> an incompatible tail) to see if we can avoid requiring operators to go to the 
> latest branch-1 (we may end up needing this but let's have a really good 
> reason for it if we do).
> Oh, let this filing be an answer to our [~anoop.hbase]'s old high-level 
> question over in HBASE-16189:
> bq. ...means when a rolling upgrade is done to 2.0, users first have to upgrade 
> to some 1.x version which has this fix and then to 2.0. What do you guys think? 
> Should we avoid this kind of indirection? cc Enis Soztutar, 
> Stack, Ted Yu, Matteo Bertozzi
> Yeah, let's try to avoid this if we can...
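
For illustration, the hbase1-side failure mode is essentially a Class.forName on the comparator name stored in the hfile tail; a sketch under that assumption (not the actual FixedFileTrailer code):

{code}
public class ComparatorTailSketch {
  // Illustrative: hbase2 writes e.g. "org.apache.hadoop.hbase.CellComparator"
  // into the hfile tail; hbase1 has no such class on its classpath.
  static Object loadComparator(String comparatorClassName) throws Exception {
    try {
      return Class.forName(comparatorClassName).getDeclaredConstructor().newInstance();
    } catch (ClassNotFoundException e) {
      // This is why hbase1 cannot open hbase2-written hfiles; writing a
      // version-neutral alias instead of a class name would avoid it.
      throw new Exception("Unknown comparator in hfile tail: " + comparatorClassName, e);
    }
  }
}
{code}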



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-18970) The version of jruby we use now can't get interactive input from prompt

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-18970:
-

Assignee: Mike Drob

> The version of jruby we use now can't get interactive input from prompt
> ---
>
> Key: HBASE-18970
> URL: https://issues.apache.org/jira/browse/HBASE-18970
> Project: HBase
>  Issue Type: Task
>  Components: shell
>Reporter: Chia-Ping Tsai
>Assignee: Mike Drob
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
>
> case 1: press enter
> {code}
> hbase(main):002:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^M
> {code}
> case 2: press ctrl-j
> {code}
> hbase(main):001:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^J1 tables successfully disabled
> Took 5.0059 seconds  
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19128) Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken.

2017-10-31 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19128:
-
Release Note: Disable Distributed Log Replay before upgrading since it has 
been removed in 2.0

> Purge Distributed Log Replay from codebase, configurations, text; mark the 
> feature as unsupported, broken.
> --
>
> Key: HBASE-19128
> URL: https://issues.apache.org/jira/browse/HBASE-19128
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: stack
>  Labels: incompatible
> Fix For: 2.0.0
>
>
> Kill it. It keeps coming up and over again. Needs proper burial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17177) Compaction can break the region/row level atomic when scan even if we pass mvcc to client

2017-10-31 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227634#comment-16227634
 ] 

stack commented on HBASE-17177:
---

What you reckon for this and hbase2, [~Apache9]?

> Compaction can break the region/row level atomic when scan even if we pass 
> mvcc to client
> -
>
> Key: HBASE-17177
> URL: https://issues.apache.org/jira/browse/HBASE-17177
> Project: HBase
>  Issue Type: Sub-task
>  Components: scan
>Reporter: Duo Zhang
>Priority: Critical
> Fix For: 1.5.0, 2.0.0-beta-1
>
>
> We know that major compaction will actually delete the cells which are 
> deleted by a delete marker. In order to give a consistent view for a scan, we 
> need to use a map to track the read points for all scanners for a region, and 
> the smallest one will be used for a compaction. For all delete markers whose 
> mvcc is greater than this value, we will not use them to delete other cells.
> And the problem for a scan restart after a region move is that the new RS does 
> not have the information about the scanners opened at the old RS before the 
> client sends scan requests to the new RS, which means the read points map is 
> incomplete and the smallest read point may be greater than the correct value. 
> So if a major compaction happens at that time, it may delete some cells which 
> should be kept.
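
A minimal sketch of the smallest-read-point rule the description relies on (map and names simplified):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SmallestReadPointSketch {
  // scanner id -> read point (mvcc) of every scanner open on the region
  static final Map<Long, Long> readPoints = new ConcurrentHashMap<>();

  static long smallestReadPoint() {
    // If the map is incomplete (e.g. right after a region move, before the old
    // scanners re-register), this value can be too high and a major compaction
    // may wrongly apply newer delete markers.
    return readPoints.values().stream()
        .mapToLong(Long::longValue).min().orElse(Long.MAX_VALUE);
  }

  static boolean deleteMarkerUsable(long markerMvcc) {
    // Only delete markers at or below the smallest read point may be used to
    // physically drop cells during a major compaction.
    return markerMvcc <= smallestReadPoint();
  }
}
{code}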



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19128) Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken.

2017-10-31 Thread Appy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Appy updated HBASE-19128:
-
Labels: incompatible  (was: )

> Purge Distributed Log Replay from codebase, configurations, text; mark the 
> feature as unsupported, broken.
> --
>
> Key: HBASE-19128
> URL: https://issues.apache.org/jira/browse/HBASE-19128
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: stack
>  Labels: incompatible
> Fix For: 2.0.0
>
>
> Kill it. It keeps coming up and over again. Needs proper burial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19112) Suspect methods on Cell to be deprecated

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227631#comment-16227631
 ] 

Mike Drob commented on HBASE-19112:
---

[~elserj] - do you want to work on this?

> Suspect methods on Cell to be deprecated
> 
>
> Key: HBASE-19112
> URL: https://issues.apache.org/jira/browse/HBASE-19112
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Reporter: Josh Elser
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
>
> [~chia7712] suggested on the [mailing 
> list|https://lists.apache.org/thread.html/e6de9af26d9b888a358ba48bf74655ccd893573087c032c0fcf01585@%3Cdev.hbase.apache.org%3E]
>  that we have some methods on Cell which should be deprecated for removal:
> * {{#getType()}}
> * {{#getTimestamp()}}
> * {{#getTag()}}
> * {{#getSequenceId()}}
> Let's make a pass over these (and maybe the rest) to make sure that there 
> aren't others which are either implementation details or methods returning 
> now-private-marked classes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-18110) [AMv2] Reenable tests temporarily disabled

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob reassigned HBASE-18110:
-

Assignee: stack

> [AMv2] Reenable tests temporarily disabled
> --
>
> Key: HBASE-18110
> URL: https://issues.apache.org/jira/browse/HBASE-18110
> Project: HBase
>  Issue Type: Sub-task
>  Components: Region Assignment
>Affects Versions: 2.0.0
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 2.0.0-beta-1
>
>
> We disabled tests that didn't make sense or relied on behavior not supported 
> by AMv2. Revisit and reenable after AMv2 gets committed. Here is the set 
> (from 
> https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.rsj53tx4vlwj)
> testAllFavoredNodesDead and testAllFavoredNodesDeadMasterRestarted and 
> testMisplacedRegions in TestFavoredStochasticLoadBalancer … not sure what 
> this is about.
> testRegionNormalizationMergeOnCluster in TestSimpleRegionNormalizerOnCluster 
> disabled for now till we fix up Merge.
> testMergeWithReplicas in TestRegionMergeTransactionOnCluster because don't 
> know how it is supposed to work.
> Admin#close does not update Master. Causes 
> testHBaseFsckWithFewerMetaReplicaZnodes in TestMetaWithReplicas to fail 
> (Master gets report about server closing when it didn’t run the close -- gets 
> freaked out).
> Disabled/Ignore TestRSGroupsOfflineMode#testOffline; need to dig in on what 
> offline is.
> Disabled/Ignore TestRSGroups.
> All tests that have to do w/ fsck:TestHBaseFsckTwoRS, 
> TestOfflineMetaRebuildBase TestHBaseFsckReplicas, 
> TestOfflineMetaRebuildOverlap, testChangingReplicaCount in 
> TestMetaWithReplicas (internally it is doing fscks which are killing RS)...
> FSCK test testHBaseFsckWithExcessMetaReplicas in TestMetaWithReplicas.
> So is testHBaseFsckWithFewerMetaReplicas in same class.
> TestHBaseFsckOneRS is fsck. Disabled.
> TestOfflineMetaRebuildHole is about rebuilding hole with fsck.
> Master carries meta:
> TestRegionRebalancing is disabled because doesn't consider the fact that 
> Master carries system tables only (fix of average in RegionStates brought out 
> the issue).
> Disabled testMetaAddressChange in TestMetaWithReplicas because it presumes 
> meta can be moved... it can't.
> TestAsyncTableGetMultiThreaded wants to move hbase:meta...Balancer does NPEs. 
> AMv2 won't let you move hbase:meta off Master.
> Disabled parts of...testCreateTableWithMultipleReplicas in 
> TestMasterOperationsForRegionReplicas There is an issue w/ assigning more 
> replicas if number of replicas is changed on us. See '/* DISABLED! FOR 
> NOW'.
> Disabled TestCorruptedRegionStoreFile. Depends on a half-implemented reopen 
> of a region when a store file goes missing; TODO.
> testRetainAssignmentOnRestart in TestRestartCluster does not work. AMv2 does 
> the retain semantic differently. Fix. TODO.
> TestMasterFailover needs to be rewritten for AMv2. It uses tricks not 
> ordained when up on AMv2. The test is also hobbled by the fact that we 
> religiously enforce that only the master can carry meta, something we were 
> loose about in the old AM.
> Fix Ignores in TestServerCrashProcedure. Master is different now.
> Offlining is done differently now: Because of this disabled testOfflineRegion 
> in TestAsyncRegionAdminApi
> Skipping delete of table after test in TestAccessController3 because of 
> access issues w/ AMv2. AMv1 seems to crash servers on exit too for same lack 
> of auth perms but AMv2 gets hung up. TODO. See cleanUp method.
> TestHCM#testMulti and TestHCM
> Fix TestMasterMetrics. Stuff is different now around startup which messes up 
> this test. Disabled two of three tests.
> I tried to fix TestMasterBalanceThrottling but it looks like 
> SimpleLoadBalancer is borked whether AMv2 or not.
> Disabled testPickers in TestFavoredStochasticBalancerPickers. It hangs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19050) Improve javadoc of Scan#setMvccReadPoint

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19050:
--
Fix Version/s: (was: 2.0.0-beta-1)
   2.0.0

> Improve javadoc of Scan#setMvccReadPoint
> 
>
> Key: HBASE-19050
> URL: https://issues.apache.org/jira/browse/HBASE-19050
> Project: HBase
>  Issue Type: Task
>  Components: scan
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 2.0.0
>
>
> This is to improve the javadoc of Scan#setMvccReadPoint. Yes, it was added as 
> an internal thing so that the scan is able to use the mvcc read point. 
> But the javadoc is not explicit and does not convey the importance of using 
> it, and I think internally the code just ignores it. I can confirm where 
> exactly that happens, but the javadoc should highlight it, I believe. 
> At least we should mention that any value set by the user will be ignored.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-17591) Update documentation to say distributed log replay defaults to 'false'

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-17591:
--
Fix Version/s: (was: 2.0.0-beta-1)
   2.0.0

> Update documentation to say distributed log replay defaults to 'false'
> --
>
> Key: HBASE-17591
> URL: https://issues.apache.org/jira/browse/HBASE-17591
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Aki Ariga
>Priority: Trivial
> Fix For: 2.0.0
>
>
> As a consequence of HBASE-12577, `hbase.master.distributed.log.replay` no 
> longer defaults to true. But in the documentation, it is still noted as 
> defaulting to true, as follows:
> {quote}
> To enable distributed log replay, set hbase.master.distributed.log.replay to 
> true. This will be the default for HBase 0.99 (HBASE-10888).
> {quote}
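
For reference, a hedged sketch of reading the flag programmatically; the property name is from the issue, the rest is illustrative:

{code}
import org.apache.hadoop.conf.Configuration;

public class DlrConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Defaults to false since HBASE-12577; only releases that still ship the
    // feature honor an explicit true here.
    boolean dlr = conf.getBoolean("hbase.master.distributed.log.replay", false);
    System.out.println("distributed log replay enabled: " + dlr);
  }
}
{code}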



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-17591) Update documentation to say distributed log replay defaults to 'false'

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227601#comment-16227601
 ] 

Mike Drob commented on HBASE-17591:
---

retargeting doc updates for 2.0GA

> Update documentation to say distributed log replay defaults to 'false'
> --
>
> Key: HBASE-17591
> URL: https://issues.apache.org/jira/browse/HBASE-17591
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Aki Ariga
>Priority: Trivial
> Fix For: 2.0.0
>
>
> As a consequence of HBASE-12577, `hbase.master.distributed.log.replay` no 
> longer defaults to true. But in the documentation, it is still noted as 
> defaulting to true, as follows:
> {quote}
> To enable distributed log replay, set hbase.master.distributed.log.replay to 
> true. This will be the default for HBase 0.99 (HBASE-10888).
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-19050) Improve javadoc of Scan#setMvccReadPoint

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-19050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227599#comment-16227599
 ] 

Mike Drob commented on HBASE-19050:
---

retargeting doc updates for 2.0GA

> Improve javadoc of Scan#setMvccReadPoint
> 
>
> Key: HBASE-19050
> URL: https://issues.apache.org/jira/browse/HBASE-19050
> Project: HBase
>  Issue Type: Task
>  Components: scan
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Fix For: 2.0.0
>
>
> This is to improve the javadoc of Scan#setMvccReadPoint. Yes, it was added as 
> an internal thing so that the scan is able to use the mvcc read point. 
> But the javadoc is not explicit and does not convey the importance of using 
> it, and I think internally the code just ignores it. I can confirm where 
> exactly that happens, but the javadoc should highlight it, I believe. 
> At least we should mention that any value set by the user will be ignored.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-19128) Purge Distributed Log Replay from codebase, configurations, text; mark the feature as unsupported, broken.

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-19128:
--
Fix Version/s: (was: 2.0.0-beta-1)
   2.0.0

> Purge Distributed Log Replay from codebase, configurations, text; mark the 
> feature as unsupported, broken.
> --
>
> Key: HBASE-19128
> URL: https://issues.apache.org/jira/browse/HBASE-19128
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: stack
> Fix For: 2.0.0
>
>
> Kill it. It keeps coming up and over again. Needs proper burial.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18970) The version of jruby we use now can't get interactive input from prompt

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18970:
--
Summary: The version of jruby we use now can't get interactive input from 
prompt  (was: The version of jruby we use now can't handle the ctrl-m)

> The version of jruby we use now can't get interactive input from prompt
> ---
>
> Key: HBASE-18970
> URL: https://issues.apache.org/jira/browse/HBASE-18970
> Project: HBase
>  Issue Type: Task
>  Components: shell
>Reporter: Chia-Ping Tsai
> Fix For: 2.0.0-beta-1
>
>
> case 1: press enter
> {code}
> hbase(main):002:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^M
> {code}
> case 2: press ctrl-j
> {code}
> hbase(main):001:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^J1 tables successfully disabled
> Took 5.0059 seconds  
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18970) The version of jruby we use now can't get interactive input from prompt

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18970:
--
Priority: Critical  (was: Major)

> The version of jruby we use now can't get interactive input from prompt
> ---
>
> Key: HBASE-18970
> URL: https://issues.apache.org/jira/browse/HBASE-18970
> Project: HBase
>  Issue Type: Task
>  Components: shell
>Reporter: Chia-Ping Tsai
>Priority: Critical
> Fix For: 2.0.0-beta-1
>
>
> case 1: press enter
> {code}
> hbase(main):002:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^M
> {code}
> case 2: press ctrl-j
> {code}
> hbase(main):001:0> disable_all 'chia(.*)'
> chia_1
>   
>
> Disable the above 1 tables (y/n)?
> y^J1 tables successfully disabled
> Took 5.0059 seconds  
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18912) Update Admin methods to return Lists instead of arrays

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227576#comment-16227576
 ] 

Mike Drob commented on HBASE-18912:
---

[~zghaobac], [~psomogyi] - what's going on here? Is this doable for 2.0 or does 
it need to get pushed out?

> Update Admin methods to return Lists instead of arrays
> --
>
> Key: HBASE-18912
> URL: https://issues.apache.org/jira/browse/HBASE-18912
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Fix For: 2.0.0-beta-1
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18800) [Propaganda] Push shaded hbase-client as gateway to an hbase cluster going forward

2017-10-31 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16227567#comment-16227567
 ] 

Mike Drob commented on HBASE-18800:
---

[~stack] did that dev list discussion ever materialize? I can't find it...

Is the deliverable here a doc update?

> [Propaganda] Push shaded hbase-client as gateway to an hbase cluster going 
> forward
> --
>
> Key: HBASE-18800
> URL: https://issues.apache.org/jira/browse/HBASE-18800
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Fix For: 2.0.0-beta-1
>
>
> We've bandied this about for a while now that folks should consume hbase via 
> the shaded hbase-client; it should work if their needs are minimal (and if it 
> doesn't work, would be good to hear why). This issue is about evangelizing 
> the shaded hbase-client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18800) [Propaganda] Push shaded hbase-client as gateway to an hbase cluster going forward

2017-10-31 Thread Mike Drob (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mike Drob updated HBASE-18800:
--
Priority: Minor  (was: Major)

> [Propaganda] Push shaded hbase-client as gateway to an hbase cluster going 
> forward
> --
>
> Key: HBASE-18800
> URL: https://issues.apache.org/jira/browse/HBASE-18800
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Minor
> Fix For: 2.0.0-beta-1
>
>
> We've bandied this about for a while now that folks should consume hbase via 
> the shaded hbase-client; it should work if their needs are minimal (and if it 
> doesn't work, would be good to hear why). This issue is about evangelizing 
> the shaded hbase-client.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

