[jira] [Created] (HBASE-26029) It is not reliable to use nodeDeleted event to track region server's death

2021-06-23 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-26029:
-

 Summary: It is not reliable to use nodeDeleted event to track 
region server's death
 Key: HBASE-26029
 URL: https://issues.apache.org/jira/browse/HBASE-26029
 Project: HBase
  Issue Type: Bug
Reporter: Duo Zhang
Assignee: Duo Zhang


When implementing HBASE-26011, [~sunxin] pointed out an interesting scenario: if a 
region server comes up and goes down between two sync requests, we cannot detect 
the death of that region server.

This is a valid point, and while thinking of a solution I noticed that the current 
zk implementation has the same problem. Notice that a watcher on zk can only be 
triggered once, so after zk fires the watcher and before you set a new one, it is 
possible for a region server to come up and go down, and you will miss the 
nodeDeleted event for that region server.

I think the general approach here, which could work for both the master based and 
the zk based replication tracker, is that we should not rely on the tracker to tell 
us which region server is dead. Instead, we just provide the list of live region 
servers, and the upper layer should compare this list with the expected list (for 
replication, obtained by listing the replicators) to detect the dead region servers.
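
As a rough illustration of that comparison (method and variable names are 
illustrative, not an actual HBase API):

{code:java}
// Minimal sketch: the tracker only reports live region servers; the caller
// diffs that against the expected list (for replication, the replicators).
Set<ServerName> detectDeadServers(Set<ServerName> liveFromTracker,
    Set<ServerName> expectedReplicators) {
  Set<ServerName> dead = new HashSet<>(expectedReplicators);
  dead.removeAll(liveFromTracker); // anything expected but not live is dead
  return dead;
}
{code}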



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26020) Split TestWALEntryStream.testDifferentCounts out

2021-06-23 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26020.
---
Fix Version/s: 2.4.5
   2.5.0
   3.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2.4+.

Thanks [~haxiaolin] for reviewing.

The code on branch-2.3 is different, so it is not easy to cherry-pick; giving up for now.

> Split TestWALEntryStream.testDifferentCounts out
> 
>
> Key: HBASE-26020
> URL: https://issues.apache.org/jira/browse/HBASE-26020
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.5
>
>
> It consumes too much time and may cause the whole UT to time out.
> In fact, it should be implemented as a parameterized test.
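
For reference, a hedged sketch of the parameterized shape suggested above (class 
name, parameter names, and values are illustrative, not the actual patch):

{code:java}
import java.util.Arrays;
import java.util.List;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;

@RunWith(Parameterized.class)
public class TestWALEntryStreamDifferentCounts {

  // Each combination becomes its own test case, so no single run takes too long.
  @Parameterized.Parameters(name = "nbRows={0}, walEditKVs={1}")
  public static List<Object[]> params() {
    return Arrays.asList(new Object[][] { { 10, 1 }, { 100, 10 }, { 1000, 100 } });
  }

  @Parameterized.Parameter(0)
  public int nbRows;

  @Parameterized.Parameter(1)
  public int walEditKVs;

  @Test
  public void testDifferentCounts() {
    // run the scenario that used to be one long loop, now once per combination
  }
}
{code}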



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26028) The viewJson page show exception when using TinyLfuBlockCache

2021-06-23 Thread Zheng Wang (Jira)
Zheng Wang created HBASE-26028:
--

 Summary: The viewJson page show exception when using 
TinyLfuBlockCache
 Key: HBASE-26028
 URL: https://issues.apache.org/jira/browse/HBASE-26028
 Project: HBase
  Issue Type: Bug
  Components: UI
Reporter: Zheng Wang
Assignee: Zheng Wang


Some variables in TinyLfuBlockCache should be marked as transient.
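
A hedged illustration of the general idea, not the actual TinyLfuBlockCache patch: 
fields the JSON dump cannot (or should not) serialize, for example executors or the 
backing cache, can be declared transient so the serializer skips them.

{code:java}
public class ExampleCache {
  // Skipped by the JSON bean dump because it is transient.
  private final transient java.util.concurrent.Executor evictionExecutor =
      java.util.concurrent.ForkJoinPool.commonPool();

  // Plain data field, still reported on the viewJson page.
  private long maxSize = 1024 * 1024;
}
{code}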



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26027) The calling of HTable.batch blocked at AsyncRequestFutureImpl.waitUntilDone caused by ArrayStoreException

2021-06-23 Thread Zheng Wang (Jira)
Zheng Wang created HBASE-26027:
--

 Summary: The calling of HTable.batch blocked at 
AsyncRequestFutureImpl.waitUntilDone caused by ArrayStoreException
 Key: HBASE-26027
 URL: https://issues.apache.org/jira/browse/HBASE-26027
 Project: HBase
  Issue Type: Bug
  Components: Client
Reporter: Zheng Wang
Assignee: Zheng Wang


The batch API of HTable takes a parameter named results for storing the per-operation 
results or exceptions; its type is Object[].

If the user passes an array of a narrower type, e.g. 
org.apache.hadoop.hbase.client.Result[], then an ArrayStoreException occurs in 
AsyncRequestFutureImpl.updateResult, AsyncRequestFutureImpl.decActionCounter is 
skipped, and AsyncRequestFutureImpl.waitUntilDone gets stuck checking 
actionsInProgress over and over, never returning.

It would be better to add a cutoff calculated from operationTimeout, instead of 
depending only on the value of actionsInProgress.
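
A standalone illustration of the underlying Java behaviour (not HBase code): storing 
an exception into a Result[] that is only referenced as Object[] throws 
ArrayStoreException, which is exactly what updateResult runs into.

{code:java}
// The caller supplied a Result[], but the API only sees it as Object[].
Object[] results = new org.apache.hadoop.hbase.client.Result[10];
// Storing an exception element throws java.lang.ArrayStoreException.
results[0] = new org.apache.hadoop.hbase.DoNotRetryIOException("boom");
{code}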


{code:java}
[ERROR] [2021/06/22 23:23:00,676] hconnection-0x6b927fb-shared-pool3-t1 - id=1 error for test processing localhost,16020,1624343786295
java.lang.ArrayStoreException: org.apache.hadoop.hbase.DoNotRetryIOException
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.updateResult(AsyncRequestFutureImpl.java:1242)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.trySetResultSimple(AsyncRequestFutureImpl.java:1087)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.setError(AsyncRequestFutureImpl.java:1021)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.manageError(AsyncRequestFutureImpl.java:683)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.receiveGlobalFailure(AsyncRequestFutureImpl.java:716)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl.access$1500(AsyncRequestFutureImpl.java:69)
at org.apache.hadoop.hbase.client.AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncRequestFutureImpl.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
at java.util.concurrent.FutureTask.run(FutureTask.java)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[INFO ] [2021/06/22 23:23:10,375] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:23:20,378] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:23:30,384] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:23:40,387] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:23:50,397] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:24:00,400] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:24:10,408] main - #1, waiting for 10  actions to finish on table: test
[INFO ] [2021/06/22 23:24:20,413] main - #1, waiting for 10  actions to finish on table: test
{code}







--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26026) HBase Write may be stuck forever when using CompactingMemStore

2021-06-23 Thread chenglei (Jira)
chenglei created HBASE-26026:


 Summary: HBase Write may be stuck forever when using 
CompactingMemStore
 Key: HBASE-26026
 URL: https://issues.apache.org/jira/browse/HBASE-26026
 Project: HBase
  Issue Type: Bug
  Components: in-memory-compaction
Affects Versions: 2.4.0, 2.3.0
Reporter: chenglei


Sometimes I observed that HBase writes might get stuck in my HBase cluster, which 
has {{CompactingMemStore}} enabled. I have simulated the problem with a unit test in 
my PR. The problem is caused by {{CompactingMemStore.checkAndAddToActiveSize}}:
{code:java}
425   private boolean checkAndAddToActiveSize(MutableSegment currActive, Cell cellToAdd,
426       MemStoreSizing memstoreSizing) {
427     if (shouldFlushInMemory(currActive, cellToAdd, memstoreSizing)) {
428       if (currActive.setInMemoryFlushed()) {
429         flushInMemory(currActive);
430         if (setInMemoryCompactionFlag()) {
431           // The thread is dispatched to do in-memory compaction in the background
              ..
            }
{code}
In line 427, if the sum of {{currActive.getDataSize}} and the size of {{cellToAdd}} 
exceeds {{CompactingMemStore.inmemoryFlushSize}}, then {{currActive}} should be 
flushed in memory, and {{MutableSegment.setInMemoryFlushed()}} is invoked in line 428:
{code:java}
  public boolean setInMemoryFlushed() {
    return flushed.compareAndSet(false, true);
  }
{code}
By line 429, {{currActive.flushed}} is already true, and 
{{CompactingMemStore.flushInMemory}} then invokes 
{{CompactingMemStore.pushActiveToPipeline}}:
{code:java}
  protected void pushActiveToPipeline(MutableSegment currActive) {
    if (!currActive.isEmpty()) {
      pipeline.pushHead(currActive);
      resetActive();
    }
  }
{code}
In {{CompactingMemStore.pushActiveToPipeline}} above, if {{currActive.cellSet}} is 
empty, nothing is done. But writes are concurrent, and a write first adds the cell 
size to {{currActive.getDataSize}} and only then adds the cell to 
{{currActive.cellSet}}, so it is possible that {{currActive.getDataSize}} can no 
longer accommodate any more cells while {{currActive.cellSet}} is still empty, 
because the pending writes have not yet added their cells to {{currActive.cellSet}}.
So now {{currActive.flushed}} is true, new writes still target {{currActive}}, but 
{{currActive}} can never enter {{flushInMemory}} again, no new active segment can be 
created, and in the end all writes are stuck.

In my opinion, once {{currActive.flushed}} is set to true, it should not be used as 
the active segment again, and because of the concurrent pending writes, only after 
{{currActive.updatesLock.writeLock()}} is acquired in 
{{CompactingMemStore.inMemoryCompaction}} can we safely check whether {{currActive}} 
is empty or not.
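
A minimal sketch (not the actual patch) of that check: emptiness of the flushed 
segment is only decided after taking its updatesLock write lock, so no pending write 
can still be adding cells to it. The lock and method names follow the description 
above; the surrounding code is illustrative.

{code:java}
private void inMemoryCompaction(MutableSegment flushedActive) {
  // Wait out any write that already reserved size but has not added its cell yet.
  flushedActive.updatesLock.writeLock().lock();
  try {
    if (flushedActive.isEmpty()) {
      // Truly empty: nothing to compact, but a fresh active segment must still replace it.
    } else {
      // Non-empty: proceed with the normal in-memory compaction of this segment.
    }
  } finally {
    flushedActive.updatesLock.writeLock().unlock();
  }
}
{code}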









--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26019) Remove reflections used in HBaseConfiguration.getPassword()

2021-06-23 Thread Peter Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi resolved HBASE-26019.
---
Resolution: Fixed

> Remove reflections used in HBaseConfiguration.getPassword()
> ---
>
> Key: HBASE-26019
> URL: https://issues.apache.org/jira/browse/HBASE-26019
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> HBaseConfiguration.getPassword() uses the Hadoop API Configuration.getPassword(). 
> The API was added in Hadoop 2.6.0, and reflection was used to access it. It's time 
> to remove the reflection and invoke the API directly (in HBase 3.0 as well as 2.x).
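
For context, a hedged before/after sketch of what removing the reflection amounts to 
(the key name and surrounding variables are illustrative):

{code:java}
Configuration conf = HBaseConfiguration.create();

// Before: reach Configuration.getPassword via reflection, to tolerate Hadoop
// versions that predate the API.
Method getPassword = Configuration.class.getMethod("getPassword", String.class);
char[] viaReflection = (char[]) getPassword.invoke(conf, "my.secret.key");

// After: every supported Hadoop release (2.6.0+) has the API, so call it directly.
char[] direct = conf.getPassword("my.secret.key");
{code}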



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HBASE-26019) Remove reflections used in HBaseConfiguration.getPassword()

2021-06-23 Thread Peter Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi reopened HBASE-26019:
---

Reopening to revert and reapply the commit. The merged commit does not contain 
the JIRA ID.

> Remove reflections used in HBaseConfiguration.getPassword()
> ---
>
> Key: HBASE-26019
> URL: https://issues.apache.org/jira/browse/HBASE-26019
> Project: HBase
>  Issue Type: Improvement
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
>
> HBaseConfiguration.getPassword() uses the Hadoop API Configuration.getPassword(). 
> The API was added in Hadoop 2.6.0, and reflection was used to access it. It's time 
> to remove the reflection and invoke the API directly (in HBase 3.0 as well as 2.x).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-25934) Add username for RegionScannerHolder

2021-06-23 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-25934.
--
Hadoop Flags: Reviewed
  Resolution: Fixed

> Add username for RegionScannerHolder
> 
>
> Key: HBASE-25934
> URL: https://issues.apache.org/jira/browse/HBASE-25934
> Project: HBase
>  Issue Type: Wish
>Reporter: tomscut
>Assignee: tomscut
>Priority: Minor
> Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.5
>
>
> [HBASE-25542|https://issues.apache.org/jira/browse/HBASE-25542] already added part 
> of the client information; we can also add the username to RegionScannerHolder.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26010) Backport HBASE-25703 and HBASE-26002 to branch-2.3

2021-06-23 Thread Toshihiro Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki resolved HBASE-26010.
--
Resolution: Won't Fix

> Backport HBASE-25703 and HBASE-26002 to branch-2.3
> --
>
> Key: HBASE-26010
> URL: https://issues.apache.org/jira/browse/HBASE-26010
> Project: HBase
>  Issue Type: Improvement
>  Components: backport
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.3.6
>
>
> Backport HBASE-25703 "Support conditional update in MultiRowMutationEndpoint" 
> and HBASE-26002 "MultiRowMutationEndpoint should return the result of the 
> conditional update" to branch-2.3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-26009) Backport HBASE-25766 "Introduce RegionSplitRestriction that restricts the pattern of the split point" to branch-2.3

2021-06-23 Thread Toshihiro Suzuki (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki resolved HBASE-26009.
--
Resolution: Won't Fix

> Backport HBASE-25766 "Introduce RegionSplitRestriction that restricts the 
> pattern of the split point" to branch-2.3
> ---
>
> Key: HBASE-26009
> URL: https://issues.apache.org/jira/browse/HBASE-26009
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Fix For: 2.3.6
>
>
> Backport the parent issue to branch-2.3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26025) Add a flag to mark if the IOError can be solved by retry in thrift IOError

2021-06-23 Thread Yutong Xiao (Jira)
Yutong Xiao created HBASE-26025:
---

 Summary: Add a flag to mark if the IOError can be solved by retry 
in thrift IOError
 Key: HBASE-26025
 URL: https://issues.apache.org/jira/browse/HBASE-26025
 Project: HBase
  Issue Type: Improvement
Reporter: Yutong Xiao
Assignee: Yutong Xiao


Currently, if an HBaseIOException occurs, the thrift client only gets the error 
message. This is inconvenient for a client that wants to build a retry mechanism 
around the exception. So I added a canRetry flag to IOError to make client-side 
exception handling smarter.
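
As a sketch of how a client might consume such a flag (the accessor name 
isCanRetry() follows typical Thrift-generated naming and is an assumption; the retry 
policy itself is illustrative):

{code:java}
// Retry a Thrift call only when the server marks the IOError as retriable.
private <T> T callWithRetry(java.util.concurrent.Callable<T> thriftCall, int maxAttempts)
    throws Exception {
  for (int attempt = 1; ; attempt++) {
    try {
      return thriftCall.call();
    } catch (IOError e) {
      if (!e.isCanRetry() || attempt >= maxAttempts) {
        throw e;                      // not retriable, or retries exhausted
      }
      Thread.sleep(100L * attempt);   // simple linear backoff between attempts
    }
  }
}
{code}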



--
This message was sent by Atlassian Jira
(v8.3.4#803005)