[jira] [Resolved] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28595.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~csringhofer] for contributing and all others for helping and reviewing!

> Losing exception from scan RPC can lead to partial results
> --
>
> Key: HBASE-28595
> URL: https://issues.apache.org/jira/browse/HBASE-28595
> Project: HBase
>  Issue Type: Bug
>  Components: Client, regionserver, Scanners
>Reporter: Csaba Ringhofer
>Assignee: Csaba Ringhofer
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> This was discovered in Apache Impala using an HBase 2.2 based branch for 
> both the hbase client and server. It is not clear yet whether other 
> branches are also affected.
> The issue happens when the server side of the scan throws an exception and 
> closes the scanner, but at the same time the client gets an RPC 
> connection-closed error and never processes the exception sent by the 
> server. The client then thinks it hit a network error, which leads to 
> retrying the RPC instead of opening a new scanner. When the retry reaches 
> the server, the server returns an empty ScanResponse instead of an error, 
> so the scanner is closed on the client side without surfacing any error.
> A few pointers to critical parts:
> region server:
> 1st call throws exception leading to closing (but not deleting) scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539]
> 2nd call (retry of 1st) returns empty results:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403]
> client:
> some exceptions are handled as non-retriable at RPC level and are only 
> handled through opening a new scanner:
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214]
> [https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367]
> This mechanism in the client only works if it gets the exception from the 
> server. If there are connection issues during the RPC then the client won't 
> really know the state of the server.
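A toy simulation may make the sequence concrete. Everything below is a 
hypothetical stand-in, not actual HBase client/server classes: the server 
closes the scanner when the first call throws, the client treats the lost 
exception as a network error and retries the same RPC, and the empty 
response from the closed scanner is read as a normal end-of-scan, yielding 
silently partial results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy simulation of the failure sequence; all names here are hypothetical
// stand-ins, not the actual HBase client/server classes.
public class ScanRetrySim {

    // Server-side scanner state: remaining rows, plus a closed flag.
    static final class Scanner {
        final List<String> rows =
            new ArrayList<>(Arrays.asList("r1", "r2", "r3", "r4"));
        boolean closed = false;
    }

    static final Map<Long, Scanner> scanners = new HashMap<>();

    // First call: the server throws, closing (but not deleting) the
    // scanner, and the exception never reaches the client.
    static List<String> scanCallThatFails(long id) {
        scanners.get(id).closed = true;
        throw new RuntimeException("connection closed; server exception lost");
    }

    // Retried call: the closed-but-registered scanner yields an empty
    // ScanResponse instead of an error.
    static List<String> scanCallRetry(long id) {
        Scanner s = scanners.get(id);
        if (s.closed) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(s.rows);
        s.rows.clear();
        return batch;
    }

    // Client: a connection error is treated as retriable at the RPC level,
    // and an empty response is treated as end-of-scan.
    public static List<String> clientScan() {
        long id = 1L;
        scanners.put(id, new Scanner());
        List<String> results = new ArrayList<>();
        results.add("r0"); // an earlier, successful batch
        try {
            results.addAll(scanCallThatFails(id));
        } catch (RuntimeException networkLookalike) {
            List<String> batch = scanCallRetry(id);
            if (batch.isEmpty()) {
                return results; // silently partial: r1..r4 are lost
            }
            results.addAll(batch);
        }
        return results;
    }
}
```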



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28536) Fix `Disable Stripe Compaction` run error in document

2024-05-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28536.
---
Fix Version/s: 4.0.0-alpha-1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Merged to master.

Thanks [~mrzhao] for contributing!

> Fix `Disable Stripe Compaction` run error in document
> --
>
> Key: HBASE-28536
> URL: https://issues.apache.org/jira/browse/HBASE-28536
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 2.5.6
>Reporter: Moran
>Assignee: Moran
>Priority: Trivial
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>
> *Disable Stripe Compaction* in the document is
> {code:java}
> alter 'orders_table', CONFIGURATION =>
> {'hbase.hstore.engine.class' => 
> 'rg.apache.hadoop.hbase.regionserver.DefaultStoreEngine'}{code}
> This should be 'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine'. 
> The typo causes all regions to be stuck in the opening state. Finally, I 
> disabled the table, corrected the configuration, and enabled it again.
>  
>  
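For reference, a corrected form of the command from the fixed documentation 
would be (the table name is the example's own):

```
alter 'orders_table', CONFIGURATION =>
  {'hbase.hstore.engine.class' =>
    'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine'}
```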





[jira] [Resolved] (HBASE-28604) Fix the error message in ReservoirSample's constructor

2024-05-18 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28604.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2.5+.

Thanks [~ndimiduk] for reviewing!

> Fix the error message in ReservoirSample's constructor
> --
>
> Key: HBASE-28604
> URL: https://issues.apache.org/jira/browse/HBASE-28604
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>






[jira] [Created] (HBASE-28606) [hbase-connectors] Support for build on mac M1

2024-05-18 Thread Nikita Pande (Jira)
Nikita Pande created HBASE-28606:


 Summary: [hbase-connectors] Support for build on mac M1
 Key: HBASE-28606
 URL: https://issues.apache.org/jira/browse/HBASE-28606
 Project: HBase
  Issue Type: Improvement
Reporter: Nikita Pande


[INFO] --- protobuf:0.6.1:compile (compile-protoc) @ hbase-spark-protocol ---
[ERROR] PROTOC FAILED: 





[jira] [Resolved] (HBASE-25972) Dual File Compaction

2024-05-17 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell resolved HBASE-25972.
-
Fix Version/s: 2.6.1
   2.5.9
 Hadoop Flags: Reviewed
 Release Note: The default compactor in HBase compacts HFiles into one 
file. This change introduces a new store file writer, DualFileWriter, which 
writes the cells retained by compaction into two files. One of these files 
includes the live cells and is called a live-version file. The other file 
includes the rest of the cells, that is, historical versions, and is called 
a historical-version file. DualFileWriter works with the default compactor. 
Historical files are not read by scans that request only the latest row 
versions. This avoids scanning unnecessary cell versions in compacted files 
and is thus expected to improve the performance of these scans.
   Resolution: Fixed

> Dual File Compaction
> 
>
> Key: HBASE-25972
> URL: https://issues.apache.org/jira/browse/HBASE-25972
> Project: HBase
>  Issue Type: Improvement
>Reporter: Kadir Ozdemir
>Assignee: Kadir Ozdemir
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> HBase stores tables row by row in its files, HFiles. An HFile is composed of 
> blocks. The number of rows stored in a block depends on the row sizes. The 
> number of rows per block gets lower when rows get larger on disk due to 
> multiple row versions since HBase stores all row versions sequentially in the 
> same HFile after compaction. However, applications (e.g., Phoenix) mostly 
> query the most recent row versions.
> The default compactor in HBase compacts HFiles into one file. This Jira 
> introduces a new store file writer, DualFileWriter, which writes the cells 
> retained by compaction into two files. One of these files includes the 
> live cells and is called a live-version file. The other file includes the 
> rest of the cells, that is, historical versions, and is called a 
> historical-version file. DualFileWriter works with the default compactor.
> Historical files are not read by scans that request only the latest row 
> versions. This avoids scanning unnecessary cell versions in compacted 
> files and is thus expected to improve the performance of these scans.
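As a rough illustration of the split (a sketch only: it ignores deletes, 
TTL and max-versions handling, and is not the actual DualFileWriter API), 
the compaction output can be partitioned by keeping the newest version of 
each row/qualifier in the live-version file:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the dual-file split; Cell and the partitioning
// are simplified stand-ins for the real DualFileWriter, and deletes, TTL
// and max-versions handling are ignored.
public class DualFileSketch {
    public static final class Cell {
        final String row, qualifier, value;
        final long timestamp;
        public Cell(String row, String qualifier, long ts, String value) {
            this.row = row; this.qualifier = qualifier;
            this.timestamp = ts; this.value = value;
        }
    }

    // Split the compaction output: the newest version of each row/qualifier
    // goes to the live-version file, everything else to the historical file.
    public static Map<String, List<Cell>> partition(List<Cell> compacted) {
        Map<String, Cell> newest = new LinkedHashMap<>();
        for (Cell c : compacted) {
            String key = c.row + "/" + c.qualifier;
            Cell cur = newest.get(key);
            if (cur == null || c.timestamp > cur.timestamp) {
                newest.put(key, c);
            }
        }
        List<Cell> live = new ArrayList<>();
        List<Cell> historical = new ArrayList<>();
        for (Cell c : compacted) {
            if (newest.get(c.row + "/" + c.qualifier) == c) {
                live.add(c);
            } else {
                historical.add(c);
            }
        }
        Map<String, List<Cell>> out = new LinkedHashMap<>();
        out.put("live", live);
        out.put("historical", historical);
        return out;
    }
}
```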





[jira] [Resolved] (HBASE-26048) [JDK17] Replace the usage of deprecated API ThreadGroup.destroy()

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26048.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~sunxin] for reviewing!

> [JDK17] Replace the usage of deprecated API ThreadGroup.destroy()
> -
>
> Key: HBASE-26048
> URL: https://issues.apache.org/jira/browse/HBASE-26048
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Reporter: Wei-Chiu Chuang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> According to the JDK17 doc, ThreadGroup.destroy() is deprecated because
> {quote}Deprecated, for removal: This API element is subject to removal in a 
> future version.
> {quote}
> The API and mechanism for destroying a ThreadGroup is inherently flawed. The 
> ability to explicitly or automatically destroy a thread group will be removed 
> in a future release.
> [https://download.java.net/java/early_access/jdk17/docs/api/java.base/java/lang/ThreadGroup.html#destroy(])
> We don't necessarily need to remove this usage now, but the warning sounds 
> bad enough.





[jira] [Created] (HBASE-28605) Add ErrorProne ban on Hadoop shaded thirdparty jars

2024-05-17 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28605:


 Summary: Add ErrorProne ban on Hadoop shaded thirdparty jars
 Key: HBASE-28605
 URL: https://issues.apache.org/jira/browse/HBASE-28605
 Project: HBase
  Issue Type: Task
  Components: build
Reporter: Nick Dimiduk


Over on HBASE-28568 we got tripped up because we pulled in the shaded Guava 
provided by Hadoop. This wasn't noticed until the backport to branch-2, which 
builds against hadoop-2. We should make this a compile time failure.





[jira] [Resolved] (HBASE-28598) NPE for writer object access in AsyncFSWAL#closeWriter

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28598.
---
Fix Version/s: 2.4.18
   2.7.0
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all branch-2.x.

Thanks [~vineet.4008]!

> NPE for writer object access in AsyncFSWAL#closeWriter
> --
>
> Key: HBASE-28598
> URL: https://issues.apache.org/jira/browse/HBASE-28598
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 2.6.0, 2.4.18, 2.5.9
>Reporter: Vineet Kumar Maheshwari
>Assignee: Vineet Kumar Maheshwari
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 2.6.1, 2.5.9
>
>
> Observed NPE during execution of some of the UT cases.
> Exception is happening in AbstractFSWAL#closeWriter when trying to put a 
> null writer object in the inflightWALClosures map.
> Need to add a null check for the writer object in AsyncFSWAL#doShutdown 
> before its usage.
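A minimal sketch of the proposed fix, with stand-in types for AsyncFSWAL's 
writer and the inflightWALClosures map (not the real classes): 
ConcurrentHashMap rejects null values, which is why the missing null check 
surfaces as an NPE.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Minimal sketch of the proposed fix; Writer and the map are stand-ins
// for AsyncFSWAL's writer and inflightWALClosures, not the real types.
public class CloseWriterSketch {
    interface Writer { void close(); }

    static final ConcurrentMap<String, Writer> inflightWALClosures =
        new ConcurrentHashMap<>();

    // ConcurrentHashMap.put(key, null) throws NullPointerException, so a
    // null writer (e.g. a WAL that was never initialized before shutdown)
    // must be checked before tracking the close.
    public static boolean closeWriter(Writer writer, String walName) {
        if (writer == null) {
            return false;                  // nothing to close or track
        }
        inflightWALClosures.put(walName, writer);
        try {
            writer.close();
        } finally {
            inflightWALClosures.remove(walName);
        }
        return true;
    }
}
```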





[jira] [Created] (HBASE-28604) Fix the error message in ReservoirSample's constructor

2024-05-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28604:
-

 Summary: Fix the error message in ReservoirSample's constructor
 Key: HBASE-28604
 URL: https://issues.apache.org/jira/browse/HBASE-28604
 Project: HBase
  Issue Type: Bug
  Components: util
Reporter: Duo Zhang
Assignee: Duo Zhang








[jira] [Resolved] (HBASE-28579) Hide HFileScanner related methods in StoreFileReader

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28579.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
 Release Note: 
Change the below method in StoreFileReader from public to protected

HFileScanner getScanner(boolean, boolean)

And change to use StoreFileReader instead of HFileScanner in test code as much 
as possible.
   Resolution: Fixed

> Hide HFileScanner related methods in StoreFileReader
> 
>
> Key: HBASE-28579
> URL: https://issues.apache.org/jira/browse/HBASE-28579
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, Scanners
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Resolved] (HBASE-28578) Remove deprecated methods in HFileScanner

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28578.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
 Release Note: Remove getKeyString and getValueString methods in 
HFileScanner.
   Resolution: Fixed

> Remove deprecated methods in HFileScanner
> -
>
> Key: HBASE-28578
> URL: https://issues.apache.org/jira/browse/HBASE-28578
> Project: HBase
>  Issue Type: Sub-task
>  Components: HFile, Scanners
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Resolved] (HBASE-28572) Remove deprecated methods in thrift module

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28572.
---
Hadoop Flags: Incompatible change,Reviewed
Release Note: 
Remove isTableAvailableWithSplit in thrift API.
Regenerate all the thrift java/python code in hbase-thrift and hbase-examples 
modules.
  Resolution: Fixed

> Remove deprecated methods in thrift module
> --
>
> Key: HBASE-28572
> URL: https://issues.apache.org/jira/browse/HBASE-28572
> Project: HBase
>  Issue Type: Sub-task
>  Components: Thrift
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Created] (HBASE-28603) Finish 2.6.0 release

2024-05-17 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-28603:
-

 Summary: Finish 2.6.0 release
 Key: HBASE-28603
 URL: https://issues.apache.org/jira/browse/HBASE-28603
 Project: HBase
  Issue Type: Sub-task
Reporter: Bryan Beaudreault


# Release the artifacts on repository.apache.org
 # Move the binaries from dist-dev to dist-release
 # Add xml to download page
 # Push tag 2.6.0RC4 as tag rel/2.6.0
 # Release 2.6.0 on JIRA 
[https://issues.apache.org/jira/projects/HBASE/versions/12353291]
 # Add release data on [https://reporter.apache.org/addrelease.html?hbase]
 # Send announcement email





[jira] [Resolved] (HBASE-28237) Set version to 2.6.1-SNAPSHOT for branch-2.6

2024-05-17 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28237.
---
Resolution: Done

This is handled by automation so probably didn't need to be a jira

> Set version to 2.6.1-SNAPSHOT for branch-2.6
> 
>
> Key: HBASE-28237
> URL: https://issues.apache.org/jira/browse/HBASE-28237
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>






[jira] [Resolved] (HBASE-28233) Run ITBLL for branch-2.6

2024-05-17 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28233.
---
Resolution: Done

> Run ITBLL for branch-2.6
> 
>
> Key: HBASE-28233
> URL: https://issues.apache.org/jira/browse/HBASE-28233
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>






[jira] [Resolved] (HBASE-28235) Put up 2.6.0RC0

2024-05-17 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28235.
---
Resolution: Done

Ended up going to RC4, which has now passed

> Put up 2.6.0RC0
> ---
>
> Key: HBASE-28235
> URL: https://issues.apache.org/jira/browse/HBASE-28235
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>






[jira] [Resolved] (HBASE-28234) Set version as 2.6.0 in branch-2.6 in prep for first RC

2024-05-17 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28234.
---
Resolution: Done

> Set version as 2.6.0 in branch-2.6 in prep for first RC
> ---
>
> Key: HBASE-28234
> URL: https://issues.apache.org/jira/browse/HBASE-28234
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>






[jira] [Created] (HBASE-28602) Incremental backup fails when WALs move

2024-05-17 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28602:


 Summary: Incremental backup fails when WALs move
 Key: HBASE-28602
 URL: https://issues.apache.org/jira/browse/HBASE-28602
 Project: HBase
  Issue Type: Bug
  Components: backuprestore
Affects Versions: 3.0.0-beta-1, 2.6.0, 4.0.0-alpha-1, 2.7.0
Reporter: Nick Dimiduk


The incremental backup process appears to collect a set of WAL files to 
operate over and then proceeds to do so. In between, a file moves. This 
causes the backup to fail. This is reproducible as a flaky unit test, as we 
see in TestIncrementalBackup.TestIncBackupRestore:

{noformat}
java.io.IOException: java.io.FileNotFoundException: File 
hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674
 does not exist.
at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:289)
at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:595)
at 
org.apache.hadoop.hbase.backup.TestIncrementalBackup.TestIncBackupRestore(TestIncrementalBackup.java:169)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.io.FileNotFoundException: File 
hdfs://localhost:39577/user/jenkins/test-data/f51646e4-e3e0-ef30-df2b-aa2a22ed41c3/WALs/94f4fe62ee7a,40249,1715620734331/94f4fe62ee7a%2C40249%2C1715620734331.94f4fe62ee7a%2C40249%2C1715620734331.regiongroup-0.1715620773674
 does not exist.
at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1282)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.(DistributedFileSystem.java:1256)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1201)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1197)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:1215

[jira] [Resolved] (HBASE-26047) [JDK17] Track JDK17 unit test failures

2024-05-17 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26047.
---
Resolution: Fixed

Now we have JDK17 tests in pre commit for master branch, so I think we can 
resolve this issue now.

Feel free to reopen if you think we need to keep this open.

> [JDK17] Track JDK17 unit test failures
> --
>
> Key: HBASE-26047
> URL: https://issues.apache.org/jira/browse/HBASE-26047
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Wei-Chiu Chuang
>Assignee: Yutong Xiao
>Priority: Major
>
> As of now, there are still two failed unit tests after exporting JDK internal 
> modules and the modifier access hack.
> {noformat}
> [ERROR] Tests run: 7, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 0.217 
> s <<< FAILURE! - in org.apache.hadoop.hbase.io.TestHeapSize
> [ERROR] org.apache.hadoop.hbase.io.TestHeapSize.testSizes  Time elapsed: 
> 0.041 s  <<< FAILURE!
> java.lang.AssertionError: expected:<160> but was:<152>
> at 
> org.apache.hadoop.hbase.io.TestHeapSize.testSizes(TestHeapSize.java:335)
> [ERROR] org.apache.hadoop.hbase.io.TestHeapSize.testNativeSizes  Time 
> elapsed: 0.01 s  <<< FAILURE!
> java.lang.AssertionError: expected:<72> but was:<64>
> at 
> org.apache.hadoop.hbase.io.TestHeapSize.testNativeSizes(TestHeapSize.java:134)
> [INFO] Running org.apache.hadoop.hbase.io.Tes
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.697 
> s <<< FAILURE! - in org.apache.hadoop.hbase.ipc.TestBufferChain
> [ERROR] org.apache.hadoop.hbase.ipc.TestBufferChain.testWithSpy  Time 
> elapsed: 0.537 s  <<< ERROR!
> java.lang.NullPointerException: Cannot enter synchronized block because 
> "this.closeLock" is null
> at 
> org.apache.hadoop.hbase.ipc.TestBufferChain.testWithSpy(TestBufferChain.java:119)
> {noformat}
> It appears that JDK17 makes the heap size estimate different than before. Not 
> sure why.
> TestBufferChain.testWithSpy  failure might be because of yet another 
> unexported module.





[jira] [Created] (HBASE-28601) Enable setting memcache on-heap sizes in bytes

2024-05-17 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28601:


 Summary: Enable setting memcache on-heap sizes in bytes
 Key: HBASE-28601
 URL: https://issues.apache.org/jira/browse/HBASE-28601
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Nick Dimiduk


Specifying blockcache and memcache sizes as a percentage of heap is not always 
ideal. Sometimes it's easier to specify exact values rather than backing into a 
percentage. Let's introduce new configuration settings (perhaps named similarly 
to {{hbase.bucketcache.size}}) that accept byte values. Even nicer would be if 
these settings accepted human-friendly byte values like {{512m}} or {{10g}}. 
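As a sketch of the human-friendly parsing this would need (the suffix set 
and semantics here are assumptions for illustration, not an existing HBase 
API):

```java
// A sketch of parsing human-friendly byte sizes like "512m" or "10g";
// the recognized suffixes (k/m/g, binary multiples) are assumptions,
// not an existing HBase API.
public class SizeParser {
    public static long parseBytes(String s) {
        String v = s.trim().toLowerCase();
        long multiplier;
        char last = v.charAt(v.length() - 1);
        switch (last) {
            case 'k': multiplier = 1L << 10; break;
            case 'm': multiplier = 1L << 20; break;
            case 'g': multiplier = 1L << 30; break;
            default:
                return Long.parseLong(v); // plain byte count, no suffix
        }
        return Long.parseLong(v.substring(0, v.length() - 1)) * multiplier;
    }
}
```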





[jira] [Created] (HBASE-28600) Enable setting blockcache on-heap sizes in bytes

2024-05-17 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28600:


 Summary: Enable setting blockcache on-heap sizes in bytes
 Key: HBASE-28600
 URL: https://issues.apache.org/jira/browse/HBASE-28600
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Nick Dimiduk


Specifying blockcache and memcache sizes as a percentage of heap is not always 
ideal. Sometimes it's easier to specify exact values rather than backing into a 
percentage. Let's introduce new configuration settings (perhaps named similarly 
to {{hbase.bucketcache.size}}) that accept byte values. Even nicer would be if 
these settings accepted human-friendly byte values like {{512m}} or {{10g}}. 





[jira] [Resolved] (HBASE-26625) ExportSnapshot tool failed to copy data files for tables with merge region

2024-05-16 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-26625.
---
Resolution: Fixed

I've merged the backport to branch-2.5 and added the next unreleased 2.5.x 
version to fixVersions

> ExportSnapshot tool failed to copy data files for tables with merge region
> --
>
> Key: HBASE-26625
> URL: https://issues.apache.org/jira/browse/HBASE-26625
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Mei
>Assignee: Yi Mei
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.5.9, 2.4.10, 3.0.0-alpha-3
>
>
> When exporting a snapshot of a table with merge regions, we found the 
> following exceptions:
> {code:java}
> 2021-12-24 17:14:41,563 INFO  [main] snapshot.ExportSnapshot: Finalize the 
> Snapshot Export
> 2021-12-24 17:14:41,589 INFO  [main] snapshot.ExportSnapshot: Verify snapshot 
> integrity
> 2021-12-24 17:14:41,683 ERROR [main] snapshot.ExportSnapshot: Snapshot export 
> failed
> org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Missing parent 
> hfile for: 043a9fe8aa7c469d8324956a57849db5.8e935527eb39a2cf9bf0f596754b5853 
> path=A/a=t42=8e935527eb39a2cf9bf0f596754b5853-043a9fe8aa7c469d8324956a57849db5
>     at 
> org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.concurrentVisitReferencedFiles(SnapshotReferenceUtil.java:232)
>     at 
> org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.concurrentVisitReferencedFiles(SnapshotReferenceUtil.java:195)
>     at 
> org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.verifySnapshot(SnapshotReferenceUtil.java:172)
>     at 
> org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.verifySnapshot(SnapshotReferenceUtil.java:156)
>     at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.verifySnapshot(ExportSnapshot.java:851)
>     at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.doWork(ExportSnapshot.java:1096)
>     at 
> org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:154)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at 
> org.apache.hadoop.hbase.util.AbstractHBaseTool.doStaticMain(AbstractHBaseTool.java:280)
>     at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:1144)
>  {code}





[jira] [Created] (HBASE-28599) RowTooBigException is thrown when duplicate increment RPC call is attempted

2024-05-16 Thread Robin Infant A (Jira)
Robin Infant A created HBASE-28599:
--

 Summary: RowTooBigException is thrown when duplicate increment RPC 
call is attempted
 Key: HBASE-28599
 URL: https://issues.apache.org/jira/browse/HBASE-28599
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.5.8, 2.5.7, 2.5.6, 2.5.5
Reporter: Robin Infant A
 Attachments: RowTooBig_trace.txt

*Issue:*
`RowTooBigException` is thrown when a duplicate increment RPC call is attempted.

*Expected Behavior:*
1. The initial RPC increment call should time out for some reason.
2. The duplicate RPC call should be converted to a GET request and fetch the 
result that I am trying to increment.
3. The result should contain only the qualifier that I am attempting to 
increment.

*Actual Behavior:*
1. The initial RPC increment call timed out, which is expected.
2. The duplicate RPC call is converted to a GET request but fails to clone the 
qualifier into the GET request.
3. Hence, the GET request attempts to retrieve all qualifiers for the given 
row and column family, resulting in a `RowTooBigException`.

*Steps to Reproduce:*
1. Ensure a row with a total value size exceeding `hbase.table.max.rowsize` 
(default = 1073741824) exists.
2. The nonce property `hbase.client.nonces.enabled` should be enabled; it 
defaults to true.
3. Attempt to increment a qualifier against the same row.
4. In my case, I am using a postIncrement co-processor which may cause a delay 
(longer than the RPC timeout property).
5. A duplicate increment call should be triggered, which tries to get the value 
rather than increment it.
6. The GET request actually tries to retrieve all the qualifiers for the row, 
resulting in a `RowTooBigException`.

*Insights:*
Upon further debugging, I found that qualifiers are not cloned into the GET 
instance due to incorrect usage of 
[CellScanner.advance|https://github.com/apache/hbase/blob/7ebd4381261fefd78fc2acf258a95184f4147cee/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L3833]

*Fix Suggestion:*
Removing the `!` operation from `while (!CellScanner.advance)` may resolve the 
issue.
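The effect of the inverted condition can be shown with a plain Iterator 
standing in for CellScanner (a hypothetical sketch, not the HRegion code): 
with `while (!advance())`, the loop body never runs for a non-empty 
scanner, so no qualifiers are cloned into the Get.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustration of the inverted-condition bug; a plain Iterator stands in
// for HBase's CellScanner, whose advance() returns true while cells remain.
public class AdvanceLoopSketch {
    // Buggy form: `while (!advance())` exits immediately when a cell is
    // available, so nothing is collected.
    public static List<String> collectBuggy(Iterator<String> cells) {
        List<String> out = new ArrayList<>();
        while (!cells.hasNext()) {   // inverted condition: body never runs
            out.add(cells.next());
        }
        return out;
    }

    // Fixed form: iterate while another cell is available.
    public static List<String> collectFixed(Iterator<String> cells) {
        List<String> out = new ArrayList<>();
        while (cells.hasNext()) {
            out.add(cells.next());
        }
        return out;
    }
}
```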

Attached Exception Stack Trace for reference.





[jira] [Resolved] (HBASE-28553) SSLContext not used for Kerberos auth negotiation in rest client

2024-05-16 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved HBASE-28553.
-
Resolution: Duplicate

Fix included in HBASE-28501

> SSLContext not used for Kerberos auth negotiation in rest client
> 
>
> Key: HBASE-28553
> URL: https://issues.apache.org/jira/browse/HBASE-28553
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>
> The included REST client now supports specifying a Trust store for SSL 
> connections.
> However, the configured SSL library is not used when the Kerberos 
> negotiation is performed by the Hadoop library, which uses its own client.
> We need to set up the Hadoop auth process to use the same SSLContext.





[jira] [Created] (HBASE-28598) NPE for writer object access in AsyncFSWAL#closeWriter

2024-05-16 Thread Vineet Kumar Maheshwari (Jira)
Vineet Kumar Maheshwari created HBASE-28598:
---

 Summary: NPE for writer object access in AsyncFSWAL#closeWriter
 Key: HBASE-28598
 URL: https://issues.apache.org/jira/browse/HBASE-28598
 Project: HBase
  Issue Type: Bug
  Components: wal
Affects Versions: 2.6.0, 2.4.18, 2.5.9
Reporter: Vineet Kumar Maheshwari


Observed an NPE during execution of some of the UT cases.


The exception happens in AbstractFSWAL#closeWriter when trying to put a null 
writer object into the inflightWALClosures map.

We need to add a null check for the writer object in AsyncFSWAL#doShutdown 
before it is used.
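The failure mode is consistent with ConcurrentHashMap rejecting null values. A minimal sketch of the suggested guard, using a plain map and illustrative names rather than the real WAL classes:

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullWriterGuardDemo {

    // Stand-in for the inflightWALClosures map; ConcurrentHashMap throws
    // NullPointerException on null values, which matches the reported NPE.
    static final ConcurrentHashMap<String, Object> inflightClosures =
        new ConcurrentHashMap<>();

    /** Guarded shutdown step (illustrative names): skip the close bookkeeping
     *  entirely when the writer was never created, e.g. because shutdown raced
     *  with writer initialization. Returns true when a close was registered. */
    static boolean closeWriter(String walName, Object writer) {
        if (writer == null) {
            return false; // the null check the report asks for
        }
        inflightClosures.put(walName, writer);
        return true;
    }
}
```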





[jira] [Created] (HBASE-28597) Support native Cell format for protobuf in REST server and client

2024-05-16 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28597:
---

 Summary: Support native Cell format for protobuf in REST server 
and client
 Key: HBASE-28597
 URL: https://issues.apache.org/jira/browse/HBASE-28597
 Project: HBase
  Issue Type: Wish
  Components: REST
Reporter: Istvan Toth


REST currently uses its own (outdated) CellSetModel format for transferring 
cells.

This is fine for XML and JSON, which are slow anyway and even slower at 
handling byte arrays, and which are expected to be used where simple client 
code with no dependency on the HBase Java libraries matters more than raw 
performance.

However, we perform the same marshalling and unmarshalling when using 
protobuf, which adds no real value but eats up resources.

We could add a new encoding for Results which uses the native cell format in 
protobuf, by simply dumping the binary cell bytestreams into the REST response 
body.

This should save a lot of resources on the server side, and would be faster, 
or at worst the same speed, on the client.

As an additional advantage, the resulting Cells would be of native HBase Cell 
type instead of the REST Cell type.
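The "dump the binary cell bytestreams into the response body" idea can be sketched with simple length-prefixed framing. This is an illustration of the concept only, not HBase's actual cell wire format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class RawCellStreamDemo {

    /** Server side: write each serialized cell as a length-prefixed byte
     *  block; the REST response body would simply be this concatenation, with
     *  no re-marshalling into CellSetModel. */
    static byte[] encode(List<byte[]> cells) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            for (byte[] cell : cells) {
                out.writeInt(cell.length); // length prefix
                out.write(cell);           // raw cell bytes
            }
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Client side: slice the body back into individual cell byte arrays. */
    static List<byte[]> decode(byte[] body) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(body));
            List<byte[]> cells = new ArrayList<>();
            while (in.available() > 0) {
                byte[] cell = new byte[in.readInt()];
                in.readFully(cell);
                cells.add(cell);
            }
            return cells;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The client-side decode yields the raw cell bytes directly, which is where the "native HBase Cell type instead of the REST Cell type" advantage would come from.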







[jira] [Resolved] (HBASE-27938) Enable PE to load any custom implementation of tests at runtime

2024-05-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HBASE-27938.
--
Fix Version/s: 2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Enable PE to load any custom implementation of tests at runtime
> ---
>
> Key: HBASE-27938
> URL: https://issues.apache.org/jira/browse/HBASE-27938
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Reporter: Prathyusha
>Assignee: Prathyusha
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> Right now, adding any custom PE.Test implementation requires a compile-time 
> dependency on those new test classes in PE. This change enables PE to load 
> any custom implementation of tests at runtime and utilise the PE framework 
> for custom implementations.





[jira] [Created] (HBASE-28596) Optimise BucketCache usage upon regions splits/merges.

2024-05-15 Thread Wellington Chevreuil (Jira)
Wellington Chevreuil created HBASE-28596:


 Summary: Optimise BucketCache usage upon regions splits/merges.
 Key: HBASE-28596
 URL: https://issues.apache.org/jira/browse/HBASE-28596
 Project: HBase
  Issue Type: Improvement
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil


This proposal aims to give users more flexibility to decide whether or not 
blocks from a parent region should be evicted, and also to optimise cache 
usage by resolving reference files to the referred file's blocks in the cache.

Some extra context:

1) Originally, the default behaviour on splits was to rely on the 
"hbase.rs.evictblocksonclose" value to decide whether the cached blocks from 
the split parent should be evicted. The resulting split daughters are then 
opened with references to the parent file. If hbase.rs.prefetchblocksonopen is 
set, these openings trigger a prefetch of the blocks from the split parent, 
now with cache keys derived from the reference path. That means that if 
"hbase.rs.evictblocksonclose" is false and “hbase.rs.prefetchblocksonopen” is 
true, we will be duplicating blocks in the cache. In scenarios where the cache 
is at capacity and the added latency of reading from the file system is high 
(for example, reading from cloud storage), this can have a severe impact, as 
the prefetch for the references triggers evictions. Also, the references tend 
to be short lived, as compaction is triggered on the split daughters soon 
after they are opened.

2) HBASE-27474 changed the original behaviour described above to always evict 
blocks from the split parent once the split completes, and to skip prefetch 
for references (since references are short lived). The side effect is that the 
daughters' blocks are only cached once compaction completes, and compaction 
itself runs slower since it needs to read the blocks from the file system. On 
regions as large as 20GB, the performance degradation reported by users has 
been severe.

This change proposes a new “hbase.rs.evictblocksonsplit” configuration 
property that makes the eviction on split configurable. Depending on the use 
case, the impact of mass evictions due to cache capacity may be higher, in 
which case users might prefer to keep evicting split parent blocks. 
Additionally, it modifies the way we handle references when caching. The 
HBASE-27474 behaviour was to skip caching references to avoid duplicate data 
in the cache as long as compaction was enabled, relying on the fact that 
references from splits are usually short lived. Here, we propose modifying the 
search for block cache keys, so that we always resolve the referenced file 
first and look for that file's block in the cache. That way we avoid 
duplicates in the cache and also expedite scan performance on the split 
daughters, as reads now resolve the referenced file and are served from the 
cache.
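The key-resolution step can be sketched as a pure function over store file names. The "<referred-hfile-name>.<parent-region>" naming convention for split reference files is an assumption of this sketch, not a statement about the exact HBase implementation:

```java
public class RefCacheKeyDemo {

    /** Resolve the file name to use for block-cache lookups. A split
     *  reference file is assumed to be named
     *  "<referred-hfile-name>.<parent-region>"; stripping the suffix makes
     *  both the reference and the parent file look up the same cached blocks,
     *  avoiding duplicates. A plain hfile name contains no separator and maps
     *  to itself. */
    static String cacheFileName(String storeFileName) {
        int sep = storeFileName.indexOf('.');
        return sep < 0 ? storeFileName : storeFileName.substring(0, sep);
    }
}
```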





[jira] [Created] (HBASE-28595) Losing exception from scan RPC can lead to partial results

2024-05-15 Thread Csaba Ringhofer (Jira)
Csaba Ringhofer created HBASE-28595:
---

 Summary: Losing exception from scan RPC can lead to partial 
results
 Key: HBASE-28595
 URL: https://issues.apache.org/jira/browse/HBASE-28595
 Project: HBase
  Issue Type: Bug
  Components: Client, regionserver, Scanners
Reporter: Csaba Ringhofer


This was discovered in Apache Impala using an HBase 2.2 based client and 
server. It is not clear yet whether other branches are also affected.

The issue happens if the server side of the scan throws an exception and 
closes the scanner, but the client doesn't get the exact exception and treats 
it as a network error, which leads to retrying the RPC instead of opening a 
new scanner. In this case the server returns an empty ScanResponse instead of 
an error when the RPC is retried, leading to closing the scanner on the client 
side without returning any error.

A few pointers to critical parts:
region server:
1st call throws exception leading to closing (but not deleting) scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3539
2nd call (retry of 1st) returns empty results:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java#L3403

client:
some exceptions are handled as non-retriable at RPC level and are only handled 
through opening a new scanner:
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallable.java#L214
https://github.com/apache/hbase/blob/0c8607a35008b7dca15e9daaec41ec362d159d67/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java#L367

This mechanism in the client only works if it gets the exception from the 
server. If there are connection issues during the RPC then the client won't 
really know the state of the server.
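The decision described above can be sketched as a small classification function (illustrative names, not the real ScannerCallable/ClientScanner logic):

```java
public class ScanRetryDemo {

    enum Recovery { RETRY_SAME_RPC, REOPEN_SCANNER }

    /** An exception the server managed to deliver can be mapped to "reopen
     *  the scanner", but a connection-level failure carries no server state,
     *  so the client retries the same scan RPC -- and a server that already
     *  closed the scanner then answers the retry with an empty ScanResponse
     *  instead of an error, which is the bug described here. */
    static Recovery classify(boolean exceptionDeliveredByServer) {
        if (exceptionDeliveredByServer) {
            // e.g. UnknownScannerException: safe to restart from the last seen row
            return Recovery.REOPEN_SCANNER;
        }
        // connection reset / timeout: the server-side scanner state is unknown
        return Recovery.RETRY_SAME_RPC;
    }
}
```

A fix along these lines would treat the "scanner state unknown" case conservatively, e.g. by reopening the scanner rather than trusting an empty retry response.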





[jira] [Created] (HBASE-28593) Update "Releasing Apache HBase" section in the book to document `do-release.sh`

2024-05-13 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28593:


 Summary: Update "Releasing Apache HBase" section in the book to 
document `do-release.sh`
 Key: HBASE-28593
 URL: https://issues.apache.org/jira/browse/HBASE-28593
 Project: HBase
  Issue Type: Task
  Components: community
Reporter: Nick Dimiduk


Manually rolling release candidates is a thing of the past. Let's update this 
section of the book to describe how to use the automation in 
{{dev-support/create-release}} and throw out these old manual instructions.





[jira] [Created] (HBASE-28594) Add a new "Promoting Release Candidate"

2024-05-13 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28594:


 Summary: Add a new "Promoting Release Candidate"
 Key: HBASE-28594
 URL: https://issues.apache.org/jira/browse/HBASE-28594
 Project: HBase
  Issue Type: Task
Reporter: Nick Dimiduk








[jira] [Created] (HBASE-28592) Backport HBASE-26525 Use unique thread name for group WALs

2024-05-13 Thread Szucs Villo (Jira)
Szucs Villo created HBASE-28592:
---

 Summary: Backport HBASE-26525 Use unique thread name for group WALs
 Key: HBASE-28592
 URL: https://issues.apache.org/jira/browse/HBASE-28592
 Project: HBase
  Issue Type: Sub-task
Reporter: Szucs Villo
Assignee: Szucs Villo








[jira] [Created] (HBASE-28591) Backport HBASE-26123 Restore fields dropped by HBASE-25986 to public interfaces

2024-05-13 Thread Szucs Villo (Jira)
Szucs Villo created HBASE-28591:
---

 Summary: Backport HBASE-26123 Restore fields dropped by 
HBASE-25986 to public interfaces
 Key: HBASE-28591
 URL: https://issues.apache.org/jira/browse/HBASE-28591
 Project: HBase
  Issue Type: Sub-task
Reporter: Szucs Villo
Assignee: Szucs Villo








[jira] [Resolved] (HBASE-28586) Backport HBASE-24791 Improve HFileOutputFormat2 to avoid always call getTableRelativePath method

2024-05-12 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28586.
---
Fix Version/s: 2.4.18
   2.7.0
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all branch-2.x.

Thanks [~szucsvillo]!

> Backport HBASE-24791 Improve HFileOutputFormat2 to avoid always call 
> getTableRelativePath method
> 
>
> Key: HBASE-28586
> URL: https://issues.apache.org/jira/browse/HBASE-28586
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Szucs Villo
>Assignee: Szucs Villo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 2.6.1, 2.5.9
>
>






[jira] [Resolved] (HBASE-28581) Remove deprecated methods in TableDescriptorBuilder

2024-05-12 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28581.
---
Resolution: Fixed

Pushed to master and branch-3.

Thanks [~heliangjun] for contributing!

> Remove deprecated methods in TableDescriptorBuilder
> ---
>
> Key: HBASE-28581
> URL: https://issues.apache.org/jira/browse/HBASE-28581
> Project: HBase
>  Issue Type: Sub-task
>  Components: API, Client
>Reporter: Duo Zhang
>Assignee: Liangjun He
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Resolved] (HBASE-28576) Remove FirstKeyValueMatchingQualifiersFilter

2024-05-12 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28576.
---
Fix Version/s: 3.0.0-beta-2
   Resolution: Fixed

Pushed to master and branch-3.

Thanks [~heliangjun] for contributing!

> Remove FirstKeyValueMatchingQualifiersFilter
> 
>
> Key: HBASE-28576
> URL: https://issues.apache.org/jira/browse/HBASE-28576
> Project: HBase
>  Issue Type: Sub-task
>  Components: Filters
>Reporter: Duo Zhang
>Assignee: Liangjun He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>






[jira] [Created] (HBASE-28590) NPE after upgrade from 2.5.8 to 3.0.0

2024-05-11 Thread Ke Han (Jira)
Ke Han created HBASE-28590:
--

 Summary: NPE after upgrade from 2.5.8 to 3.0.0
 Key: HBASE-28590
 URL: https://issues.apache.org/jira/browse/HBASE-28590
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 3.0.0
Reporter: Ke Han
 Attachments: commands.txt, hbase--master-fc906f1808de.log, 
persistent.tar.gz

When upgrading an HBase cluster from 2.5.8 to 3.0.0 (commit: 516c89e8597fb6), 
I hit the following NPE in the master log.
{code:java}
2024-05-11T02:17:47,293 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: 
Unexpected throwable object 
java.lang.NullPointerException: null
        at 
org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463)
 ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
2024-05-11T02:17:47,326 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: 
Unexpected throwable object 
java.lang.NullPointerException: null
        at 
org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463)
 ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
2024-05-11T02:17:47,337 ERROR 
[RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] ipc.RpcServer: 
Unexpected throwable object 
java.lang.NullPointerException: null
        at 
org.apache.hadoop.hbase.master.MasterRpcServices.reportFileArchival(MasterRpcServices.java:2578)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:16463)
 ~[hbase-protocol-shaded-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:443) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:124) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:102) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at org.apache.hadoop.hbase.ipc.RpcHandler.run(RpcHandler.java:82) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]{code}
h1. Reproduce

This bug cannot be reproduced deterministically, but it happens pretty 
frequently (10% chance to trigger) with the following steps.

1. Start up 2.5.8 cluster with default configuration (1 HM, 2 RS, 1 HDFS)

2. Execute the commands in commands.txt

3. Stop the 2.5.8 cluster and upgrade to a 3.0.0 cluster with default 
configuration (commit: 516c89e8597fb6, 1 HM, 2 RS, 1 HDFS)

The error message will occur in the master log.

I attached (1) the commands to reproduce it, (2) the master log, and (3) the 
full error logs of all nodes.

 

 





[jira] [Created] (HBASE-28589) Client Does not Stop Retrying after DoNotRetryException

2024-05-11 Thread ZhenyuLi (Jira)
ZhenyuLi created HBASE-28589:


 Summary: Client Does not Stop Retrying after DoNotRetryException
 Key: HBASE-28589
 URL: https://issues.apache.org/jira/browse/HBASE-28589
 Project: HBase
  Issue Type: Bug
  Components: IPC/RPC
Affects Versions: 2.0.0, 1.5.0, 1.4.0, 1.3.0, 1.2.0
Reporter: ZhenyuLi


I recently discovered that the fix for HBASE-14598 does not completely resolve 
the issue. That fix addressed two aspects: first, when a Scan/Get RPC attempts 
to allocate a very large array that could potentially lead to an out-of-memory 
(OOM) error, it checks the size of the array before allocation and directly 
throws an exception, preventing the region server from crashing and avoiding 
possible cascading failures. Second, the developers intended for the client to 
stop retrying after such a failure, as retrying will not resolve the issue.

However, the fix involved throwing a DoNotRetryException. After 
ByteBufferOutputStream.write throws the DoNotRetryException, in the call stack 
(ByteBufferOutputStream.write --> encoder.write --> encodeCellsTo --> 
this.cellBlockBuilder.buildCellBlockStream --> call.setResponse), the 
DoNotRetryException is ultimately caught in the CallRunner.run function, with 
only a log printed. Consequently, the DoNotRetryException is not sent back to 
the client side. Instead, the client receives a generic exception for the 
failed RPC request and continues retrying, which is not the desired behavior.
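The behaviour argued for here can be sketched as follows (illustrative stand-ins, not the real CallRunner code): a failure while building the response should become the call's error result rather than being swallowed into a log line.

```java
public class CallRunnerDemo {

    /** Stand-in for the server-side call: building/serializing the response
     *  can itself throw (here, the oversized-allocation guard from
     *  HBASE-14598). */
    interface Call {
        byte[] buildResponse() throws Exception;
    }

    /** When response serialization fails, turn that failure into an error
     *  result for the client instead of only logging it, so an exception like
     *  DoNotRetryException actually reaches the client and stops retries. */
    static String run(Call call) {
        try {
            call.buildResponse();
            return "OK";
        } catch (Exception e) {
            // propagate the failure as the call's result rather than swallowing it
            return "ERROR:" + e.getClass().getSimpleName();
        }
    }
}
```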





[jira] [Created] (HBASE-28588) Remove deprecated methods in WAL

2024-05-10 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28588:
-

 Summary: Remove deprecated methods in WAL
 Key: HBASE-28588
 URL: https://issues.apache.org/jira/browse/HBASE-28588
 Project: HBase
  Issue Type: Sub-task
  Components: wal
Reporter: Duo Zhang








[jira] [Created] (HBASE-28587) Remove deprecated methods in Cell

2024-05-10 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28587:
-

 Summary: Remove deprecated methods in Cell
 Key: HBASE-28587
 URL: https://issues.apache.org/jira/browse/HBASE-28587
 Project: HBase
  Issue Type: Sub-task
  Components: API, Client
Reporter: Duo Zhang








[jira] [Created] (HBASE-28586) Backport HBASE-24791 to branch-2.6

2024-05-10 Thread Szucs Villo (Jira)
Szucs Villo created HBASE-28586:
---

 Summary: Backport HBASE-24791 to branch-2.6
 Key: HBASE-28586
 URL: https://issues.apache.org/jira/browse/HBASE-28586
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Szucs Villo








[jira] [Created] (HBASE-28585) It is advised that the copy_tables_desc.rb script should handle scenarios where the namespace does not exist in the target cluster during table replication.

2024-05-10 Thread wenhao (Jira)
wenhao created HBASE-28585:
--

 Summary: It is advised that the copy_tables_desc.rb script should 
handle scenarios where the namespace does not exist in the target cluster 
during table replication.
 Key: HBASE-28585
 URL: https://issues.apache.org/jira/browse/HBASE-28585
 Project: HBase
  Issue Type: Improvement
  Components: jruby
Affects Versions: 2.4.17
Reporter: wenhao


When utilizing the {{copy_tables_desc.rb}} script to duplicate tables to a 
target cluster, if the specified table's namespace does not exist in the 
target cluster, the script fails to execute successfully. It is recommended to 
add logic to the script to detect and handle scenarios where the namespace 
does not exist.
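The check being requested can be sketched as a pure function, with a plain set standing in for the target cluster's namespace list (the actual script would query the cluster via the Admin API). The "default" fallback for unqualified table names mirrors HBase's default namespace:

```java
import java.util.Set;

public class NamespaceCheckDemo {

    /** Decide whether the table's namespace must be created on the target
     *  cluster before its descriptor is copied. Table names are assumed to be
     *  of the form "namespace:table", with unqualified names falling into the
     *  "default" namespace. */
    static boolean needsNamespaceCreation(Set<String> targetNamespaces, String tableName) {
        int sep = tableName.indexOf(':');
        String ns = sep < 0 ? "default" : tableName.substring(0, sep);
        return !targetNamespaces.contains(ns);
    }
}
```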





[jira] [Resolved] (HBASE-28448) CompressionTest hangs when run over a Ozone ofs path

2024-05-09 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HBASE-28448.
-
Resolution: Fixed

> CompressionTest hangs when run over a Ozone ofs path
> 
>
> Key: HBASE-28448
> URL: https://issues.apache.org/jira/browse/HBASE-28448
> Project: HBase
>  Issue Type: Bug
>Reporter: Pratyush Bhatt
>Assignee: Wei-Chiu Chuang
>Priority: Major
>  Labels: ozone, pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.1
>
> Attachments: hbase_ozone_compression.jstack
>
>
> If we run the Compression test over HDFS path, it works fine:
> {code:java}
> hbase org.apache.hadoop.hbase.util.CompressionTest 
> hdfs://ns1/tmp/dir1/dir2/test_file.txt snappy
> 24/03/20 06:08:43 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> 24/03/20 06:08:43 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 24/03/20 06:08:43 INFO impl.MetricsSystemImpl: HBase metrics system started
> 24/03/20 06:08:43 INFO metrics.MetricRegistries: Loaded MetricRegistries 
> class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
> 24/03/20 06:08:43 INFO compress.CodecPool: Got brand-new compressor [.snappy]
> 24/03/20 06:08:43 INFO compress.CodecPool: Got brand-new compressor [.snappy]
> 24/03/20 06:08:44 INFO compress.CodecPool: Got brand-new decompressor 
> [.snappy]
> SUCCESS {code}
> The command exits, but when the same is tried over an ofs path, the command 
> hangs.
> {code:java}
> hbase org.apache.hadoop.hbase.util.CompressionTest 
> ofs://ozone1710862004/test-222compression-vol/compression-buck2/test_file.txt 
> snappy
> 24/03/20 06:05:19 INFO protocolPB.OmTransportFactory: Loading OM transport 
> implementation 
> org.apache.hadoop.ozone.om.protocolPB.Hadoop3OmTransportFactory as specified 
> by configuration.
> 24/03/20 06:05:20 INFO client.ClientTrustManager: Loading certificates for 
> client.
> 24/03/20 06:05:20 WARN impl.MetricsConfig: Cannot locate configuration: tried 
> hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> 24/03/20 06:05:20 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot 
> period at 10 second(s).
> 24/03/20 06:05:20 INFO impl.MetricsSystemImpl: HBase metrics system started
> 24/03/20 06:05:20 INFO metrics.MetricRegistries: Loaded MetricRegistries 
> class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
> 24/03/20 06:05:20 INFO rpc.RpcClient: Creating Volume: 
> test-222compression-vol, with om as owner and space quota set to -1 bytes, 
> counts quota set to -1
> 24/03/20 06:05:20 INFO rpc.RpcClient: Creating Bucket: 
> test-222compression-vol/compression-buck2, with bucket layout 
> FILE_SYSTEM_OPTIMIZED, om as owner, Versioning false, Storage Type set to 
> DISK and Encryption set to false, Replication Type set to server-side default 
> replication type, Namespace Quota set to -1, Space Quota set to -1
> 24/03/20 06:05:21 INFO compress.CodecPool: Got brand-new compressor [.snappy]
> 24/03/20 06:05:21 INFO compress.CodecPool: Got brand-new compressor [.snappy]
> 24/03/20 06:05:21 WARN impl.MetricsSystemImpl: HBase metrics system already 
> initialized!
> 24/03/20 06:05:21 INFO metrics.MetricRegistries: Loaded MetricRegistries 
> class org.apache.ratis.metrics.dropwizard3.Dm3MetricRegistriesImpl
> 24/03/20 06:05:22 INFO compress.CodecPool: Got brand-new decompressor 
> [.snappy]
> SUCCESS 
> .
> .
> .{code}
> The command doesn't exit.
> Attaching the jstack of the process below:
> [^hbase_ozone_compression.jstack]
> cc: [~weichiu] 
>  





[jira] [Created] (HBASE-28584) RS SIGSEGV under heavy replication load

2024-05-09 Thread Whitney Jackson (Jira)
Whitney Jackson created HBASE-28584:
---

 Summary: RS SIGSEGV under heavy replication load
 Key: HBASE-28584
 URL: https://issues.apache.org/jira/browse/HBASE-28584
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.5.6
 Environment: RHEL 7.9
JDK 11.0.23
Hadoop 3.2.4
Hbase 2.5.6
Reporter: Whitney Jackson


I'm observing RS crashes under heavy replication load:

 
{code:java}
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f7546873b69, pid=29890, tid=36828
#
# JRE version: Java(TM) SE Runtime Environment 18.9 (11.0.23+7) (build 
11.0.23+7-LTS-222)
# Java VM: Java HotSpot(TM) 64-Bit Server VM 18.9 (11.0.23+7-LTS-222, mixed 
mode, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# J 24625 c2 
org.apache.hadoop.hbase.util.ByteBufferUtils.copyBufferToStream(Ljava/io/OutputStream;Ljava/nio/ByteBuffer;II)V
 (75 bytes) @ 0x7f7546873b69 [0x7f7546873960+0x0209]
{code}
 

The heavier load comes when a replication peer has been disabled for several 
hours for patching, etc. When the peer is re-enabled, the replication load is 
high until the peer is caught up. The crashes happen on the cluster receiving 
the replication edits.

 

I believe this problem started after upgrading from 2.4.x to 2.5.x.

 

One possibly relevant non-standard config I run with:
{code:xml}
<property>
  <name>hbase.region.store.parallel.put.limit</name>
  <value>100</value>
  <description>Added after seeing "failed to accept edits" replication errors
  in the destination region servers indicating this limit was being exceeded
  while trying to process replication edits.</description>
</property>
{code}
 

I understand from other Jiras that the problem is likely around direct memory 
usage by Netty. I haven't yet tried switching the Netty allocator to 
{{unpooled}} or {{heap}}. I also haven't yet tried any of the 
{{io.netty.allocator.*}} options.

 

{{MaxDirectMemorySize}} is set to 26g.

 

Here's the full stack for the relevant thread:

 
{code:java}
Stack: [0x7f72e2e5f000,0x7f72e2f6],  sp=0x7f72e2f5e450,  free 
space=1021k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
J 24625 c2 
org.apache.hadoop.hbase.util.ByteBufferUtils.copyBufferToStream(Ljava/io/OutputStream;Ljava/nio/ByteBuffer;II)V
 (75 bytes) @ 0x7f7546873b69 [0x7f7546873960+0x0209]
J 26253 c2 
org.apache.hadoop.hbase.ByteBufferKeyValue.write(Ljava/io/OutputStream;Z)I (21 
bytes) @ 0x7f7545af2d84 [0x7f7545af2d20+0x0064]
J 22971 c2 
org.apache.hadoop.hbase.codec.KeyValueCodecWithTags$KeyValueEncoder.write(Lorg/apache/hadoop/hbase/Cell;)V
 (27 bytes) @ 0x7f754663f700 [0x7f754663f4c0+0x0240]
J 25251 c2 
org.apache.hadoop.hbase.ipc.NettyRpcDuplexHandler.write(Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelHandlerContext;Ljava/lang/Object;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
 (90 bytes) @ 0x7f7546a53038 [0x7f7546a50e60+0x21d8]
J 21182 c2 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(Ljava/lang/Object;Lorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
 (73 bytes) @ 0x7f7545f4d90c [0x7f7545f4d3a0+0x056c]
J 21181 c2 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannelHandlerContext.write(Ljava/lang/Object;ZLorg/apache/hbase/thirdparty/io/netty/channel/ChannelPromise;)V
 (149 bytes) @ 0x7f7545fd680c [0x7f7545fd65e0+0x022c]
J 25389 c2 org.apache.hadoop.hbase.ipc.NettyRpcConnection$$Lambda$247.run()V 
(16 bytes) @ 0x7f7546ade660 [0x7f7546ade140+0x0520]
J 24098 c2 
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(J)Z
 (109 bytes) @ 0x7f754678fbb8 [0x7f754678f8e0+0x02d8]
J 27297% c2 
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run()V (603 
bytes) @ 0x7f75466c4d48 [0x7f75466c4c80+0x00c8]
j  
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$4.run()V+44
j  
org.apache.hbase.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run()V+11
j  
org.apache.hbase.thirdparty.io.netty.util.concurrent.FastThreadLocalRunnable.run()V+4
J 12278 c1 java.lang.Thread.run()V java.base@11.0.23 (17 bytes) @ 
0x7f753e11f084 [0x7f753e11ef40+0x0144]
v  ~StubRoutines::call_stub
V  [libjvm.so+0x85574a]  JavaCalls::call_helper(JavaValue*, methodHandle 
const&, JavaCallArguments*, Thread*)+0x27a
V  [libjvm.so+0x853d2e]  JavaCalls::call_virtual(JavaValue*, Handle, Klass*, 
Symbol*, Symbol*, Thread*)+0x19e
V  [libjvm.so+0x8ffddf]  thread_entry(JavaThread*, Thread*)+0x9f
V  [libjvm.so+0xdb68d1]  JavaThread::thread_main_inner()+0x131
V  [libjvm.so+0xdb2c4c]  Thread::call_run()+0x13c
V  [libj

[jira] [Created] (HBASE-28583) Upgrade from 2.5.8 to 3.0 crash with InvalidProtocolBufferException: Message missing required fields: old_table_schema

2024-05-09 Thread Ke Han (Jira)
Ke Han created HBASE-28583:
--

 Summary: Upgrade from 2.5.8 to 3.0 crash with 
InvalidProtocolBufferException: Message missing required fields: 
old_table_schema
 Key: HBASE-28583
 URL: https://issues.apache.org/jira/browse/HBASE-28583
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 2.5.8, 3.0.0
Reporter: Ke Han
 Attachments: commands.txt, hbase--master-cc13b0df0f3a.log, 
persistent.tar.gz

When migrating data from a 2.5.8 cluster (1 HM, 2 RS, 1 HDFS) to a 3.0.0 
cluster (1 HM, 2 RS, 2 HDFS), I met the following exception and the upgrade 
failed.

 
{code:java}
2024-05-09T20:16:20,638 ERROR [master/hmaster:16000:becomeActiveMaster] 
master.HMaster: Failed to become active master
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: 
Message missing required fields: old_table_schema
        at 
org.apache.hbase.thirdparty.com.google.protobuf.UninitializedMessageException.asInvalidProtocolBufferException(UninitializedMessageException.java:56)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.checkMessageInitialized(AbstractParser.java:45)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:97)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:102)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hbase.thirdparty.com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:25)
 ~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hbase.thirdparty.com.google.protobuf.Any.unpack(Any.java:118) 
~[hbase-shaded-protobuf-4.1.7.jar:4.1.7]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureUtil$StateSerializer.deserialize(ProcedureUtil.java:125)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.master.procedure.RestoreSnapshotProcedure.deserializeStateData(RestoreSnapshotProcedure.java:303)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureUtil.convertToProcedure(ProcedureUtil.java:295)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.store.ProtoAndProcedure.getProcedure(ProtoAndProcedure.java:43)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.store.InMemoryProcedureIterator.next(InMemoryProcedureIterator.java:90)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.loadProcedures(ProcedureExecutor.java:517)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:80)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.load(ProcedureExecutor.java:344)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.store.region.RegionProcedureStore.load(RegionProcedureStore.java:287)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.load(ProcedureExecutor.java:335)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.procedure2.ProcedureExecutor.init(ProcedureExecutor.java:666)
 ~[hbase-procedure-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.master.HMaster.createProcedureExecutor(HMaster.java:1860)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1019)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2524)
 ~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:613) 
~[hbase-server-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at 
org.apache.hadoop.hbase.trace.TraceUtil.lambda$tracedRunnable$2(TraceUtil.java:155)
 ~[hbase-common-3.0.0-beta-2-SNAPSHOT.jar:3.0.0-beta-2-SNAPSHOT]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_362]
2024-05-09T20:16:20,639 ERROR [master/hmaster:16000:becomeActiveMaster] 
master.HMaster: * ABORTING master hmaster,16000,1715285771112: Unhandled 
exception. Starting shutdown. *
org.apache.hbase.thirdparty.com.google.protobuf.InvalidProtocolBufferException: 
Message missing required fields: old_table_schema

[jira] [Created] (HBASE-28582) ModifyTableProcedure should not reset TRSP on region node when closing unused region replicas

2024-05-09 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28582:
-

 Summary: ModifyTableProcedure should not reset TRSP on region node 
when closing unused region replicas
 Key: HBASE-28582
 URL: https://issues.apache.org/jira/browse/HBASE-28582
 Project: HBase
  Issue Type: Bug
  Components: proc-v2
Reporter: Duo Zhang
Assignee: Duo Zhang


Found this while digging into HBASE-28522.

First, this is not safe, as MTP, unlike DTP, does not hold the exclusive 
lock all the time.
Second, even if we held the exclusive lock all the time, as shown in 
HBASE-28522, we may still hang there forever because SCP will not interrupt 
the TRSP.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28581) Remove deprecated methods in TableDescriptor

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28581:
-

 Summary: Remove deprecated methods in TableDescriptor
 Key: HBASE-28581
 URL: https://issues.apache.org/jira/browse/HBASE-28581
 Project: HBase
  Issue Type: Sub-task
  Components: API, Client
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28580) Remove deprecated methods in WALObserver

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28580:
-

 Summary: Remove deprecated methods in WALObserver
 Key: HBASE-28580
 URL: https://issues.apache.org/jira/browse/HBASE-28580
 Project: HBase
  Issue Type: Sub-task
  Components: Coprocessors, wal
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28579) Hide HFileScanner related methods in StoreFileReader

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28579:
-

 Summary: Hide HFileScanner related methods in StoreFileReader
 Key: HBASE-28579
 URL: https://issues.apache.org/jira/browse/HBASE-28579
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, Scanners
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28578) Remove deprecated methods in HFileScanner

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28578:
-

 Summary: Remove deprecated methods in HFileScanner
 Key: HBASE-28578
 URL: https://issues.apache.org/jira/browse/HBASE-28578
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, Scanners
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28577) Remove deprecated methods in KeyValue

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28577:
-

 Summary: Remove deprecated methods in KeyValue
 Key: HBASE-28577
 URL: https://issues.apache.org/jira/browse/HBASE-28577
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28576) Remove FirstKeyValueMatchingQualifiersFilter

2024-05-08 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28576:
-

 Summary: Remove FirstKeyValueMatchingQualifiersFilter
 Key: HBASE-28576
 URL: https://issues.apache.org/jira/browse/HBASE-28576
 Project: HBase
  Issue Type: Sub-task
  Components: Filters
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28563) Closing ZooKeeper in ZKMainServer

2024-05-08 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28563.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~minwoo.kang] for contributing and [~andor] for reviewing!

> Closing ZooKeeper in ZKMainServer
> -
>
> Key: HBASE-28563
> URL: https://issues.apache.org/jira/browse/HBASE-28563
> Project: HBase
>  Issue Type: Improvement
>  Components: Zookeeper
>Reporter: Minwoo Kang
>Assignee: Minwoo Kang
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> Users can switch the ZooKeeper client/server communication framework to Netty.
> The ZKMainServer process then fails to terminate, because the Netty threads 
> are identified as non-daemon threads.
> Enforce calling close() on ZooKeeper before ZKMainServer terminates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28575) Always printing error log when snapshot table

2024-05-08 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28575.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~guluo] for contributing!

> Always printing error log when snapshot table 
> --
>
> Key: HBASE-28575
> URL: https://issues.apache.org/jira/browse/HBASE-28575
> Project: HBase
>  Issue Type: Bug
>  Components: snapshots
>Affects Versions: 2.4.13
> Environment: hbase2.4.13
> Centos7
>Reporter: guluo
>Assignee: guluo
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> Reproduction.
> 1.
> Disable the snapshot procedure if your HBase supports the snapshot procedure 
> feature: set hbase.snapshot.procedure.enabled to false.
> 2.
> Execute a snapshot against a table; this step works fine:
> snapshot 't01', 'sn0001'
> 3.
> HBase outputs an error log, as follows.
> 2024-05-07T23:16:37,175 ERROR 
> [MASTER_SNAPSHOT_OPERATIONS-master/archlinux:16000-0] 
> snapshot.TakeSnapshotHandler: Couldn't delete snapshot working 
> directory:file:/opt/hbase/hbase-4.0.0-alpha-1-SNAPSHOT/tmp/hbase/.hbase-snapshot/.tmp/sn001
>  
> The reason.
> HBase cleans up the snapshot's tmp directory after taking the snapshot.
> The tmp directory no longer exists if the snapshot completed successfully, so 
> `FileSystem.delete()` returns false when asked to delete it, and HBase 
> outputs an error log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28570) Remove deprecated fields in HBTU

2024-05-08 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28570.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
 Release Note: 
Remove TEST_DIRECTORY_KEY field in HBTU.

It is private, so it should not cause any compilation errors.
   Resolution: Fixed

> Remove deprecated fields in HBTU
> 
>
> Key: HBASE-28570
> URL: https://issues.apache.org/jira/browse/HBASE-28570
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28575) Always printing error log when snapshot table

2024-05-07 Thread guluo (Jira)
guluo created HBASE-28575:
-

 Summary: Always printing error log when snapshot table 
 Key: HBASE-28575
 URL: https://issues.apache.org/jira/browse/HBASE-28575
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 2.4.13
 Environment: hbase2.4.13

Centos7
Reporter: guluo
Assignee: guluo


Reproduction.

1. 
Disable the snapshot procedure if your HBase supports the snapshot procedure 
feature: set hbase.snapshot.procedure.enabled to false.

2.
Execute a snapshot against a table; this step works fine:
snapshot 't01', 'sn0001'

3. 
HBase outputs an error log, as follows.
2024-05-07T23:16:37,175 ERROR 
[MASTER_SNAPSHOT_OPERATIONS-master/archlinux:16000-0] 
snapshot.TakeSnapshotHandler: Couldn't delete snapshot working 
directory:file:/opt/hbase/hbase-4.0.0-alpha-1-SNAPSHOT/tmp/hbase/.hbase-snapshot/.tmp/sn001

The reason.
HBase cleans up the snapshot's tmp directory after taking the snapshot.
The tmp directory no longer exists if the snapshot completed successfully, so 
`FileSystem.delete()` returns false when asked to delete it, and HBase 
outputs an error log.
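The false return described here can be reproduced with plain Java. The sketch below is a minimal stand-in using java.nio.file instead of Hadoop's FileSystem (which likewise returns false from delete() on a missing path); the class and path names are illustrative only, not the actual TakeSnapshotHandler code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Stand-in for the behavior above: deleting a path that was never created
// yields a "false" return value, not an exception. Logging ERROR on that
// return therefore flags a perfectly normal outcome after a successful
// snapshot.
public class DeleteMissingTmpDemo {
    public static void main(String[] args) throws IOException {
        Path tmp = Paths.get("snapshot-working-dir-that-was-never-created");
        boolean deleted = Files.deleteIfExists(tmp); // false: nothing there
        // Fix direction: treat "already gone" as success instead of an error.
        if (!deleted && !Files.exists(tmp)) {
            System.out.println("tmp already gone, nothing to log");
        }
    }
}
```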



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28574) Bump jinja2 from 3.1.3 to 3.1.4 in /dev-support/flaky-tests

2024-05-07 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28574.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

> Bump jinja2 from 3.1.3 to 3.1.4 in /dev-support/flaky-tests
> ---
>
> Key: HBASE-28574
> URL: https://issues.apache.org/jira/browse/HBASE-28574
> Project: HBase
>  Issue Type: Task
>  Components: dependabot, scripts, security
>Reporter: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28574) Bump jinja2 from 3.1.3 to 3.1.4 in /dev-support/flaky-tests

2024-05-07 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28574:
-

 Summary: Bump jinja2 from 3.1.3 to 3.1.4 in 
/dev-support/flaky-tests
 Key: HBASE-28574
 URL: https://issues.apache.org/jira/browse/HBASE-28574
 Project: HBase
  Issue Type: Task
  Components: dependabot, scripts, security
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28459) HFileOutputFormat2 ClassCastException with s3 magic committer

2024-05-07 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28459.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to all active branches.

Thanks [~ksravista] for contributing!

> HFileOutputFormat2 ClassCastException with s3 magic committer
> -
>
> Key: HBASE-28459
> URL: https://issues.apache.org/jira/browse/HBASE-28459
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Assignee: Sravi Kommineni
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> In hadoop3 there's the s3 magic committer which can speed up s3 writes 
> dramatically. In HFileOutputFormat2.createRecordWriter we cast the passed in 
> committer as a FileOutputCommitter. This causes a class cast exception when 
> the s3 magic committer is enabled:
>  
> {code:java}
> Error: java.lang.ClassCastException: class 
> org.apache.hadoop.fs.s3a.commit.magic.MagicS3GuardCommitter cannot be cast to 
> class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter {code}
>  
> We can cast to PathOutputCommitter instead, but it's only available in 
> hadoop3+. So we will need to use reflection to work around this in branch-2.
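The reflection workaround mentioned in the last sentence could look roughly like the sketch below. The class and method names (PathOutputCommitter, getWorkPath) come from the issue text; the helper itself and its integration point in HFileOutputFormat2 are assumed. On a JVM without hadoop3 on the classpath, the lookup simply falls through.

```java
// Sketch: probe for hadoop3's PathOutputCommitter reflectively so the same
// bytecode runs on hadoop2, where that class does not exist, instead of
// hard-casting to FileOutputCommitter and risking a ClassCastException.
public class PathCommitterProbe {
    static Object workPathOf(Object committer) {
        try {
            Class<?> poc = Class.forName(
                "org.apache.hadoop.mapreduce.lib.output.PathOutputCommitter");
            if (poc.isInstance(committer)) {
                return poc.getMethod("getWorkPath").invoke(committer);
            }
        } catch (ClassNotFoundException e) {
            // hadoop2: class absent, fall back to the existing cast path
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return null; // caller falls back to FileOutputCommitter handling
    }

    public static void main(String[] args) {
        // A plain Object is never a PathOutputCommitter, so this prints null.
        System.out.println(workPathOf(new Object()));
    }
}
```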



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28573) Update compatibility report generator to ignore o.a.h.hbase.shaded packages

2024-05-07 Thread Nick Dimiduk (Jira)
Nick Dimiduk created HBASE-28573:


 Summary: Update compatibility report generator to ignore 
o.a.h.hbase.shaded packages
 Key: HBASE-28573
 URL: https://issues.apache.org/jira/browse/HBASE-28573
 Project: HBase
  Issue Type: Task
  Components: community
Reporter: Nick Dimiduk


This is a small change that will make reviewing release candidates a little 
easier. Right now the compatibility report includes classes that we shade, so 
when we upgrade shaded 3rd-party dependencies, they show up in the report as 
an incompatible change. Changes to these classes do not affect users, so 
there is no reason to consider them for compatibility. We should update the 
reporting tool to exclude this package.

For example, 
https://dist.apache.org/repos/dist/dev/hbase/2.6.0RC4/api_compare_2.5.0_to_2.6.0RC4.html#Binary_Removed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28556) Reduce memory copying in Rest server when serializing CellModel to Protobuf

2024-05-07 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved HBASE-28556.
-
Fix Version/s: 2.4.18
   3.0.0
   2.7.0
   2.6.1
   2.5.9
   Resolution: Fixed

Committed to all active branches.
Thanks for the review [~zhangduo].

> Reduce memory copying in Rest server when serializing CellModel to Protobuf
> ---
>
> Key: HBASE-28556
> URL: https://issues.apache.org/jira/browse/HBASE-28556
> Project: HBase
>  Issue Type: Improvement
>  Components: REST
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.4.18, 3.0.0, 2.7.0, 2.6.1, 2.5.9
>
>
> The REST server does a lot of unnecessary copying, which could be avoided at 
> least for protobuf encoding.
> - -It uses ByteStringer to handle ByteBuffer backed Cells. However, it uses 
> the client API, so it should never encounter ByteBuffer backed cells.-
> - It clones everything from the cells (sometimes multiple times) before 
> serializing to protobuf.
> We could mimic the structure in Cell, with array, offset and length for each 
> field, in CellModel and use the appropriate protobuf setters to avoid the 
> extra copies.
> There may or may not be a way to do the same for JSON and XML via jax-rs; I 
> don't know the frameworks well enough to tell, but if not, we could just do 
> the copying in the getters for them, which would not make things worse.
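The (array, offset, length) idea above can be sketched as follows. FieldRef is a hypothetical stand-in for the proposed CellModel fields, not the actual API, and the read-only ByteBuffer stands in for a protobuf setter that accepts a (byte[], offset, length) triple.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Sketch: keep a view into the cell's backing array rather than cloning each
// field before serialization. The protobuf setters mentioned in the issue
// would consume (array, offset, length) directly; here a read-only ByteBuffer
// demonstrates that no bytes are copied to extract one field.
public class FieldRefDemo {
    static final class FieldRef {
        final byte[] array;
        final int offset;
        final int length;
        FieldRef(byte[] array, int offset, int length) {
            this.array = array;
            this.offset = offset;
            this.length = length;
        }
        ByteBuffer asBuffer() {
            return ByteBuffer.wrap(array, offset, length).asReadOnlyBuffer();
        }
    }

    public static void main(String[] args) {
        byte[] backing = "rowkey:family:qualifier".getBytes(StandardCharsets.UTF_8);
        FieldRef family = new FieldRef(backing, 7, 6); // view onto "family"
        System.out.println(StandardCharsets.UTF_8.decode(family.asBuffer()));
    }
}
```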



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28566) Remove ZKDataMigrator

2024-05-06 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28566.
---
Fix Version/s: 3.0.0-beta-2
 Hadoop Flags: Reviewed
 Release Note: Remove ZKDataMigrator.
   Resolution: Fixed

Pushed to master and branch-3.

Thanks [~meiyi] for reviewing!

> Remove ZKDataMigrator
> -
>
> Key: HBASE-28566
> URL: https://issues.apache.org/jira/browse/HBASE-28566
> Project: HBase
>  Issue Type: Sub-task
>  Components: Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28572) Remove deprecated methods in thrift module

2024-05-06 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28572:
-

 Summary: Remove deprecated methods in thrift module
 Key: HBASE-28572
 URL: https://issues.apache.org/jira/browse/HBASE-28572
 Project: HBase
  Issue Type: Sub-task
  Components: Thrift
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28571) Remove deprecated methods map reduce utils

2024-05-06 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28571:
-

 Summary: Remove deprecated methods map reduce utils
 Key: HBASE-28571
 URL: https://issues.apache.org/jira/browse/HBASE-28571
 Project: HBase
  Issue Type: Sub-task
  Components: mapreduce
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28570) Remove deprecated methods in HBTU

2024-05-06 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28570:
-

 Summary: Remove deprecated methods in HBTU
 Key: HBASE-28570
 URL: https://issues.apache.org/jira/browse/HBASE-28570
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28569) Race condition during WAL splitting leading to corrupt recovered.edits

2024-05-06 Thread Benoit Sigoure (Jira)
Benoit Sigoure created HBASE-28569:
--

 Summary: Race condition during WAL splitting leading to corrupt 
recovered.edits
 Key: HBASE-28569
 URL: https://issues.apache.org/jira/browse/HBASE-28569
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 2.4.17
Reporter: Benoit Sigoure


There is a race condition that can happen when a regionserver aborts 
initialisation while splitting a WAL from another regionserver. This race 
leads to writing the WAL trailer for recovered edits while the writer threads 
are still running, so the trailer gets interleaved with the edits, corrupting 
the recovered edits file (and preventing the region from being assigned).
We've seen this happening on HBase 2.4.17, but looking at the latest code it 
seems that the race can still happen there.
The sequence of operations that leads to this issue:
 * {{org.apache.hadoop.hbase.wal.WALSplitter.splitWAL}} calls 
{{outputSink.close()}} after adding all the entries to the buffers.
 * The output sink is {{org.apache.hadoop.hbase.wal.RecoveredEditsOutputSink}}, 
and its {{close}} method first calls {{finishWriterThreads}} in a try block, 
which in turn calls {{finish}} on every thread and then joins it to make 
sure it's done.
 * However, if the splitter thread gets interrupted because the RS is 
aborting, the join gets interrupted and {{finishWriterThreads}} rethrows 
without waiting for the writer threads to stop.
 * This is problematic because, coming back to 
{{org.apache.hadoop.hbase.wal.RecoveredEditsOutputSink.close}}, it calls 
{{closeWriters}} in a finally block (so it executes even when the join was 
interrupted).
 * {{closeWriters}} calls 
{{org.apache.hadoop.hbase.wal.AbstractRecoveredEditsOutputSink.closeRecoveredEditsWriter}}, 
which calls {{close}} on {{editWriter.writer}}.
 * When {{editWriter.writer}} is 
{{org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter}}, its 
{{close}} method writes the trailer before closing the file.
 * This trailer write now goes in parallel with the writer threads writing 
entries, causing corruption.
 * If there are no other errors, {{closeWriters}} succeeds in renaming all 
temporary files to final recovered edits, causing problems the next time the 
region is assigned.
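The interrupted-join step is easy to reproduce in plain Java. The sketch below is a minimal simulation of the flow above (no HBase classes), with the splitter thread's interrupt modeled by self-interrupting just before the join.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Simulation of the race: an interrupted join() rethrows immediately, so the
// finally-block "close" (the trailer write) would run while the writer thread
// is still producing output.
public class InterruptedJoinDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean stop = new AtomicBoolean(false);
        Thread writer = new Thread(() -> {
            started.countDown();
            while (!stop.get()) { } // stands in for writing edits
        });
        writer.start();
        started.await();

        Thread.currentThread().interrupt(); // the RS abort interrupts the splitter
        boolean joinInterrupted = false;
        try {
            writer.join(); // finishWriterThreads(): throws without waiting
        } catch (InterruptedException e) {
            joinInterrupted = true;
        }
        // closeWriters() would run here, in a finally block, while the writer
        // thread is still alive -- the trailer interleaves with the edits.
        System.out.println("joinInterrupted=" + joinInterrupted
            + " writerStillRunning=" + writer.isAlive());
        stop.set(true);
        writer.join();
    }
}
```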

Logs evidence supporting the above flow:
Abort is triggered (because it failed to open the WAL due to some ongoing infra 
issue):
{noformat}
regionserver-2 regionserver 06:22:00.384 
[RS_OPEN_META-regionserver/host01:16201-0] ERROR 
org.apache.hadoop.hbase.regionserver.HRegionServer - * ABORTING region 
server host01,16201,1709187641249: WAL can not clean up after init failed 
*{noformat}

We can see that the writer threads were still active after closing (even 
considering that the
ordering in the log might not be accurate, we see that they die because the 
channel is closed while still writing, not because they're stopping):
{noformat}
regionserver-2 regionserver 06:22:09.662 [DataStreamer for file 
/hbase/data/default/aeris_v2/53308260a6b22eaf6ebb8353f7df3077/recovered.edits/03169600719-host02%2C16201%2C1709180140645.1709186722780.temp
 block BP-1645452845-192.168.2.230-1615455682886:blk_1076340939_2645368] WARN  
org.apache.hadoop.hdfs.DataStreamer - Error Recovery for 
BP-1645452845-192.168.2.230-1615455682886:blk_1076340939_2645368 in pipeline 
[DatanodeInfoWithStorage[192.168.2.230:15010,DS-2aa201ab-1027-47ec-b05f-b39d795fda85,DISK],
 
DatanodeInfoWithStorage[192.168.2.232:15010,DS-39651d5a-67d2-4126-88f0-45cdee967dab,DISK],
 Datanode
InfoWithStorage[192.168.2.231:15010,DS-e08a1d17-f7b1-4e39-9713-9706bd762f48,DISK]]:
 datanode 
2(DatanodeInfoWithStorage[192.168.2.231:15010,DS-e08a1d17-f7b1-4e39-9713-9706bd762f48,DISK])
 is bad.
regionserver-2 regionserver 06:22:09.742 [split-log-closeStream-pool-1] INFO  
org.apache.hadoop.hbase.wal.RecoveredEditsOutputSink - Closed recovered edits 
writer 
path=hdfs://mycluster/hbase/data/default/aeris_v2/53308260a6b22eaf6ebb8353f7df3077/recovered.edits/03169600719-host02%2C16201%
2C1709180140645.1709186722780.temp (wrote 5949 edits, skipped 0 edits in 93 ms)
regionserver-2 regionserver 06:22:09.743 
[RS_LOG_REPLAY_OPS-regionserver/host01:16201-1-Writer-0] ERROR 
org.apache.hadoop.hbase.wal.RecoveredEditsOutputSink - Failed to write log 
entry aeris_v2/53308260a6b22eaf6ebb8353f7df3077/3169611655=[#edits: 8 = 
] to log
regionserver-2 regionserver java.nio.channels.ClosedChannelException: null
regionserver-2 regionserver    at 
org.apache.hadoop.hdfs.ExceptionLastSeen.throwException4Close(ExceptionLastSeen.java:73)
 ~[hadoop-hdfs-client-3.2.4.jar:?]
regionserver-2 regionserver    at 
org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:153) 
~[hadoop-hdfs-client-3.2.4.jar:?]
regionserver-2 regionserver    at 
org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java

[jira] [Created] (HBASE-28568) Incremental backup set does not correctly shrink

2024-05-06 Thread Dieter De Paepe (Jira)
Dieter De Paepe created HBASE-28568:
---

 Summary: Incremental backup set does not correctly shrink
 Key: HBASE-28568
 URL: https://issues.apache.org/jira/browse/HBASE-28568
 Project: HBase
  Issue Type: Bug
  Components: backuprestore
Affects Versions: 2.6.0, 3.0.0
Reporter: Dieter De Paepe


The logic in BackupAdminImpl#finalizeDelete does not properly clean up tables 
from the incrementalBackupTableSet (= the set of tables to include in every 
incremental backup).

This can lead to backups failing.

 

Minimal example to reproduce from source:
 * Add following to `conf/hbase-site.xml` to enable backups:

{code:xml}
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
<property>
  <name>hbase.fs.tmp.dir</name>
  <value>file:/tmp/hbase-tmp</value>
</property>
{code}
 * Start HBase: {{bin/start-hbase.sh}}
 * 
{code:java}
echo "create 'table1', 'cf'" | bin/hbase shell -n
echo "create 'table2', 'cf'" | bin/hbase shell -n
bin/hbase backup create full file:/tmp/hbasebackups -t table1
bin/hbase backup create full file:/tmp/hbasebackups -t table2
bin/hbase backup create incremental file:/tmp/hbasebackups
# Deletes the 2 most recent backups
bin/hbase backup delete -l $(bin/hbase backup history | head -n1 | tail -n -1 | grep -o -P "backup_\d+"),$(bin/hbase backup history | head -n2 | tail -n -1 | grep -o -P "backup_\d+")
bin/hbase backup create incremental file:/tmp/hbasebackups -t table1

[...]
2024-05-06T14:28:46,420 INFO  [main {}] mapreduce.MapReduceBackupCopyJob: 
Progress: 100.0% subTask: 1.0 mapProgress: 1.0
2024-05-06T14:28:46,468 ERROR [main {}] backup.BackupDriver: Error running 
command-line tool
java.lang.IllegalStateException: Unable to find full backup that contains 
tables: [table2]
    at 
org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:323)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:336)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.TableBackupClient.addManifest(TableBackupClient.java:286)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.TableBackupClient.completeBackup(TableBackupClient.java:351)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:313)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:345)
 ~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) 
~[hadoop-common-3.3.5.jar:?]
    at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177) 
~[hbase-backup-4.0.0-alpha-1-SNAPSHOT.jar:4.0.0-alpha-1-SNAPSHOT]
{code}

PR will follow soon.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28567) Race condition causes MetaRegionLocationCache to never set watcher to populate meta location

2024-05-06 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28567.
---
Fix Version/s: 2.4.18
   2.7.0
   3.0.0-beta-2
   2.6.1
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~vincentpoon] for contributing and [~vjasani] for reviewing!

> Race condition causes MetaRegionLocationCache to never set watcher to 
> populate meta location
> 
>
> Key: HBASE-28567
> URL: https://issues.apache.org/jira/browse/HBASE-28567
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.5.8
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> {{ZKWatcher#getMetaReplicaNodesAndWatchChildren()}} attempts to set a watch 
> on the base /hbase znode children using 
> {{ZKUtil.listChildrenAndWatchForNewChildren()}}, but if the node does not 
> exist, no watch gets set.
> We've seen this in the test container Trino uses over at 
> [trino/21569|https://github.com/trinodb/trino/pull/21569] , where ZK, master, 
> and RS are all run in the same container.
> The fix is to throw if the node does not exist so that 
> {{MetaRegionLocationCache}} can retry until the node gets created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28567) Race condition causes MetaRegionLocationCache to never set watcher to populate meta location

2024-05-05 Thread Vincent Poon (Jira)
Vincent Poon created HBASE-28567:


 Summary: Race condition causes MetaRegionLocationCache to never 
set watcher to populate meta location
 Key: HBASE-28567
 URL: https://issues.apache.org/jira/browse/HBASE-28567
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.8, 3.0.0
Reporter: Vincent Poon
Assignee: Vincent Poon


{{ZKWatcher#getMetaReplicaNodesAndWatchChildren()}} attempts to set a watch 
on the base /hbase znode children using 
{{ZKUtil.listChildrenAndWatchForNewChildren()}}, but if the node does not 
exist, no watch gets set.

We've seen this in the test container Trino uses over at 
[trino/21569|https://github.com/trinodb/trino/pull/21569] , where ZK, master, 
and RS are all run in the same container.
The fix is to throw if the node does not exist so that 
{{MetaRegionLocationCache}} can retry until the node gets created.
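The throw-then-retry fix described above can be sketched as follows. The ZooKeeper read is faked with a Supplier (null standing in for a missing znode), so only the shape of the fix is real here, not the ZKUtil/ZKWatcher API.

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch: surface "base znode missing" as an exception so the caller retries,
// instead of silently returning with no watch installed.
public class RetryUntilNodeExists {
    static List<String> listChildrenOrThrow(Supplier<List<String>> zkRead) {
        List<String> children = zkRead.get(); // null models an absent znode
        if (children == null) {
            throw new IllegalStateException("base znode missing, retry later");
        }
        return children; // in real code, the watch is installed by this read
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulate: znode absent on the first read, present on the second.
        Supplier<List<String>> zkRead =
            () -> calls[0]++ == 0 ? null : List.of("meta-region-server");
        List<String> result = null;
        while (result == null) {
            try {
                result = listChildrenOrThrow(zkRead);
            } catch (IllegalStateException e) {
                // MetaRegionLocationCache would retry with backoff here
            }
        }
        System.out.println(result);
    }
}
```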



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28566) Remove ZKDataMigrator

2024-05-05 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28566:
-

 Summary: Remove ZKDataMigrator
 Key: HBASE-28566
 URL: https://issues.apache.org/jira/browse/HBASE-28566
 Project: HBase
  Issue Type: Sub-task
  Components: Zookeeper
Reporter: Duo Zhang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28565) Make VerifyReplication accept connection uri when specifying peer cluster

2024-05-04 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28565:
-

 Summary: Make VerifyReplication accept connection uri when 
specifying peer cluster
 Key: HBASE-28565
 URL: https://issues.apache.org/jira/browse/HBASE-28565
 Project: HBase
  Issue Type: Improvement
  Components: mapreduce, Replication
Reporter: Duo Zhang
Assignee: Duo Zhang
 Fix For: 3.0.0-beta-2






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28564) Refactor direct interactions of Reference file creations to SFT interface

2024-05-03 Thread Prathyusha (Jira)
Prathyusha created HBASE-28564:
--

 Summary: Refactor direct interactions of Reference file creations 
to SFT interface
 Key: HBASE-28564
 URL: https://issues.apache.org/jira/browse/HBASE-28564
 Project: HBase
  Issue Type: Improvement
Reporter: Prathyusha








[jira] [Created] (HBASE-28563) Closing ZooKeeper in ZKMainServer

2024-05-02 Thread Minwoo Kang (Jira)
Minwoo Kang created HBASE-28563:
---

 Summary: Closing ZooKeeper in ZKMainServer
 Key: HBASE-28563
 URL: https://issues.apache.org/jira/browse/HBASE-28563
 Project: HBase
  Issue Type: Improvement
Reporter: Minwoo Kang


Users can switch the ZooKeeper client/server communication framework to Netty. 
When Netty is used for ZooKeeper connections, the ZKMainServer process fails 
to terminate because Netty threads are non-daemon threads. 
We should enforce calling close() on the ZooKeeper client before ZKMainServer terminates.
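A minimal sketch of the intended shutdown path, assuming a hypothetical close()-able client (this is not the real ZKMainServer code):

```java
public class ZkMainSketch {
    static boolean closed = false;

    static void runCommand() { /* would process the CLI command here */ }

    // Releasing the client is what lets Netty's non-daemon threads exit.
    static void close() { closed = true; }

    static void runMain() {
        try {
            runCommand();
        } finally {
            close(); // guaranteed even if the command throws
        }
    }

    public static void main(String[] args) {
        runMain();
        System.out.println("closed=" + closed);
    }
}
```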





[jira] [Created] (HBASE-28562) Ancestor calculation of backups is wrong

2024-05-02 Thread Dieter De Paepe (Jira)
Dieter De Paepe created HBASE-28562:
---

 Summary: Ancestor calculation of backups is wrong
 Key: HBASE-28562
 URL: https://issues.apache.org/jira/browse/HBASE-28562
 Project: HBase
  Issue Type: Bug
  Components: backuprestore
Affects Versions: 2.6.0, 3.0.0
Reporter: Dieter De Paepe


This is the same issue as HBASE-25870, but I think the fix there was wrong.

This issue can prevent creation of (incremental) backups when data of unrelated 
backups was damaged on backup storage.

Minimal example to reproduce from source:
 * Add following to `conf/hbase-site.xml` to enable backups:

{code:xml}
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
<property>
  <name>hbase.fs.tmp.dir</name>
  <value>file:/tmp/hbase-tmp</value>
</property>
{code}
 * Start HBase and open a shell: {{{}bin/start-hbase.sh{}}}, {{bin/hbase shell}}
 * Execute the following commands ("put" and "create" commands in the hbase 
shell, the others on the command line):
{code:java}
create 'experiment', 'fam' 
put 'experiment', 'row1', 'fam:b', 'value1'
bin/hbase backup create full file:/tmp/hbasebackup
Backup session backup_1714649896776 finished. Status: SUCCESS

put 'experiment', 'row2', 'fam:b', 'value2'
bin/hbase backup create incremental file:/tmp/hbasebackup
Backup session backup_1714649920488 finished. Status: SUCCESS

put 'experiment', 'row3', 'fam:b', 'value3'
bin/hbase backup create incremental file:/tmp/hbasebackup
Backup session backup_1714650054960 finished. Status: SUCCESS

(Delete the files corresponding to the first incremental backup - 
backup_1714649920488 in this example)

put 'experiment', 'row4', 'fam:a', 'value4'
bin/hbase backup create full file:/tmp/hbasebackup
Backup session backup_1714650236911 finished. Status: SUCCESS

put 'experiment', 'row5', 'fam:a', 'value5'
bin/hbase backup create incremental file:/tmp/hbasebackup
Backup session backup_1714650289957 finished. Status: SUCCESS

put 'experiment', 'row6', 'fam:a', 'value6'
bin/hbase backup create incremental file:/tmp/hbasebackup
2024-05-02T13:45:27,534 ERROR [main {}] impl.BackupManifest: 
file:/tmp/hbasebackup/backup_1714649920488 does not exist
2024-05-02T13:45:27,534 ERROR [main {}] impl.TableBackupClient: Unexpected 
Exception : file:/tmp/hbasebackup/backup_1714649920488 does not exist
org.apache.hadoop.hbase.backup.impl.BackupException: 
file:/tmp/hbasebackup/backup_1714649920488 does not exist
    at 
org.apache.hadoop.hbase.backup.impl.BackupManifest.<init>(BackupManifest.java:451)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupManifest.<init>(BackupManifest.java:402)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:331)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupManager.getAncestors(BackupManager.java:353)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.TableBackupClient.addManifest(TableBackupClient.java:286)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.TableBackupClient.completeBackup(TableBackupClient.java:351)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:314)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:603)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:345)
 ~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134) 
~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at 
org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169) 
~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199) 
~[hbase-backup-2.6.1-SNAPSHOT.jar:2.6.1-SNAPSHOT]
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) 
~[hadoop-common-3.3.5.jar:?]
    at org.apache.hadoop.hbase.

[jira] [Resolved] (HBASE-28535) Implement a region server level configuration to enable/disable data-tiering

2024-05-02 Thread Wellington Chevreuil (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wellington Chevreuil resolved HBASE-28535.
--
Resolution: Fixed

Merged into the feature branch. Thanks for the contribution, 
[~janardhan.hungund] !

> Implement a region server level configuration to enable/disable data-tiering
> 
>
> Key: HBASE-28535
> URL: https://issues.apache.org/jira/browse/HBASE-28535
> Project: HBase
>  Issue Type: Task
>  Components: BucketCache
>Reporter: Janardhan Hungund
>Assignee: Janardhan Hungund
>Priority: Major
>  Labels: pull-request-available
>
> Provide the user with the ability to enable and disable the data-tiering 
> feature. Time-based data tiering applies to a specific set of use cases that 
> write date-based records and mostly access recently written data.
> In general, the feature should be avoided for use cases that do not depend on 
> date-based reads and writes, since the code paths that perform 
> data-temperature checks can induce performance regressions.
> This Jira is added to track the functionality to optionally enable 
> region-server wide configuration to disable or enable the feature.
> Thanks,
> Janardhan





[jira] [Created] (HBASE-28561) Add separate fields for column family and qualifier in REST message format

2024-05-01 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28561:
---

 Summary: Add separate fields for column family and qualifier in 
REST message format
 Key: HBASE-28561
 URL: https://issues.apache.org/jira/browse/HBASE-28561
 Project: HBase
  Issue Type: Improvement
  Components: REST
Reporter: Istvan Toth


The current format uses the archaic column field, which requires extra 
processing and copying at both the server and client side.

We need to:
- Add a version field to the requests, to be enabled by clients that support 
the new format
- Add the new fields to the JSON, XML and protobuf formats, and logic to use 
them.

This should be doable in a backwards-compatible manner, with the server falling 
back to the old format if it receives an unversioned request.
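The backwards-compatible dispatch described above can be sketched as follows; the method and field names are illustrative, not the actual REST model classes:

```java
public class RestVersionFallback {
    // An unversioned request (version == null) gets the legacy combined
    // "column" field; a versioned request gets separate family/qualifier.
    static String encodeCell(Integer requestVersion, String family, String qualifier) {
        if (requestVersion == null) {
            return "column=" + family + ":" + qualifier;        // old clients
        }
        return "family=" + family + ",qualifier=" + qualifier;  // new format
    }

    public static void main(String[] args) {
        System.out.println(encodeCell(null, "fam", "q1"));
        System.out.println(encodeCell(2, "fam", "q1"));
    }
}
```

The server never has to guess: the presence of the version field in the request is the opt-in signal, so old clients are unaffected.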





[jira] [Resolved] (HBASE-28521) Use standard ConnectionRegistry and Client API to get region server list in replication

2024-05-01 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28521.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to branch-3+.

Thanks [~zghao] and [~andor]!

> Use standard ConnectionRegistry and Client API to get region server list in 
> replication
> --
>
> Key: HBASE-28521
> URL: https://issues.apache.org/jira/browse/HBASE-28521
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0-beta-2
>
>
> This is for allowing a cluster key to be specified without ZooKeeper in the 
> replication peer config.
> Currently, we set a watcher on ZooKeeper to fetch the region server list of 
> the remote cluster, which means we must know the ZooKeeper address of the 
> remote cluster. This should be fixed, as we do not want to leak ZooKeeper 
> outside the cluster itself.





[jira] [Resolved] (HBASE-28555) ThriftConnection does not need ConnectionRegistry

2024-05-01 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28555.
---
Fix Version/s: (was: 2.7.0)
   Resolution: Duplicate

> ThriftConnection does not need ConnectionRegistry
> -
>
> Key: HBASE-28555
> URL: https://issues.apache.org/jira/browse/HBASE-28555
> Project: HBase
>  Issue Type: Bug
>  Components: Client, Thrift
>Reporter: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
>






[jira] [Resolved] (HBASE-28558) Fix constructors for sub classes of Connection

2024-05-01 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28558.
---
Fix Version/s: 2.7.0
   3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2+.

Thanks [~zghao] and [~GeorryHuang] for reviewing!

> Fix constructors for sub classes of Connection
> --
>
> Key: HBASE-28558
> URL: https://issues.apache.org/jira/browse/HBASE-28558
> Project: HBase
>  Issue Type: Bug
>  Components: Client, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2
>
>
> We still have some testing classes which implement Connection but do not have 
> constructors with ConnectionRegistry as a parameter.
> This leads to some test failures on branch-2.x, but we had better fix them on 
> master as well.





[jira] [Resolved] (HBASE-28523) Use a single get call in REST multiget endpoint

2024-04-30 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved HBASE-28523.
-
Resolution: Fixed

Committed to all active branches.

> Use a single get call in REST multiget endpoint
> ---
>
> Key: HBASE-28523
> URL: https://issues.apache.org/jira/browse/HBASE-28523
> Project: HBase
>  Issue Type: Improvement
>  Components: REST
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 2.4.18, 2.7.0, 3.0.0-beta-2, 2.6.1, 2.5.9
>
>
> The REST multiget endpoint currently issues a separate HBase GET operation 
> for each key.
> Use the method that accepts a list of keys instead.
> That should be faster.
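The reasoning above can be illustrated with a toy model (not HBase's client code): a counter stands in for the per-RPC cost of looping {{Table.get(Get)}} versus one batched call in the spirit of {{Table.get(List&lt;Get&gt;)}}:

```java
import java.util.*;

public class MultiGetSketch {
    static int rpcCount = 0;
    static Map<String, String> region = Map.of("r1", "v1", "r2", "v2", "r3", "v3");

    // One RPC per key: what the multiget endpoint did before this change.
    static String singleGet(String row) { rpcCount++; return region.get(row); }

    // One RPC for the whole batch: the whole key list travels in one call.
    static List<String> multiGet(List<String> rows) {
        rpcCount++;
        List<String> out = new ArrayList<>();
        for (String r : rows) out.add(region.get(r));
        return out;
    }

    public static void main(String[] args) {
        for (String r : List.of("r1", "r2", "r3")) singleGet(r);
        int naive = rpcCount;
        rpcCount = 0;
        multiGet(List.of("r1", "r2", "r3"));
        System.out.println("naive=" + naive + " batched=" + rpcCount); // naive=3 batched=1
    }
}
```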





[jira] [Created] (HBASE-28560) Region quotas: Split/merge procedure rollback can lead to inaccurate account of region counts

2024-04-29 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28560:
-

 Summary: Region quotas: Split/merge procedure rollback can lead to 
inaccurate account of region counts
 Key: HBASE-28560
 URL: https://issues.apache.org/jira/browse/HBASE-28560
 Project: HBase
  Issue Type: Bug
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


When region quotas are enabled, HMaster keeps an in-memory account of region 
counts through NamespaceStateManager. Region counts in NamespaceStateManager 
are incremented/decremented in the early stages of split/merge procedures, 
in SPLIT_TABLE_REGION_PRE_OPERATION/MERGE_TABLE_REGIONS_PRE_MERGE_OPERATION, 
before any region is offlined. If the split/merge procedure is rolled back 
after the region count change in NamespaceStateManager is made, the rollback 
does not revert the count change to reflect that the expected split/merge 
never succeeded. This leaves NamespaceStateManager with an inaccurate account 
of the number of regions, believing there are more or fewer regions than 
actually exist.
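The missing-rollback pattern can be sketched as follows; the method names are hypothetical, not NamespaceStateManager's actual API:

```java
public class QuotaRollbackSketch {
    static int regionCount = 3;

    // Optimistic increment made in the PRE_OPERATION step, before any region
    // is offlined.
    static void preSplit() { regionCount += 1; }

    // The compensating decrement that procedure rollback should run; per the
    // report, this step is missing today.
    static void rollbackSplit() { regionCount -= 1; }

    public static void main(String[] args) {
        preSplit();
        // ... the split fails and the procedure rolls back ...
        rollbackSplit();
        System.out.println("count=" + regionCount); // back to 3
    }
}
```

Without the compensating step, every rolled-back split permanently inflates the count by one, and every rolled-back merge deflates it.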





[jira] [Created] (HBASE-28559) Region quotas: Multi-region merge causes inaccurate accounting of region counts

2024-04-29 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28559:
-

 Summary: Region quotas: Multi-region merge causes inaccurate 
accounting of region counts
 Key: HBASE-28559
 URL: https://issues.apache.org/jira/browse/HBASE-28559
 Project: HBase
  Issue Type: Bug
  Components: Quotas
Affects Versions: 3.0.0-beta-2
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky


There is support for merging more than two regions in one merge procedure with 
multi-region merge, but if region quotas are enabled, [NamespaceAuditor assumes 
that every merge is a two region 
merge|https://github.com/apache/hbase/blob/branch-3/hbase-server/src/main/java/org/apache/hadoop/hbase/namespace/NamespaceAuditor.java#L128-L129].
 This causes an inaccurate in memory accounting of region counts in 
NamespaceStateManager, leading MasterQuotaManager to believe there are more 
regions than actually exist if multi-region merge is used. 

To demonstrate the issue:
1. Start with a table with 3 regions in a namespace with a region quota limit 
of 3.
2. Merge all 3 regions, leaving 1 region; NamespaceAuditor assumes it was a 
2-region merge and believes the number of regions to be 2.
3. Split a region; the number of regions is now 2, but NamespaceAuditor 
believes it to be 3.
4. Attempt another region split, which fails because NamespaceAuditor believes 
the namespace is at its region limit of 3 when there are actually only 2 
regions.
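A small sketch of the accounting drift (names hypothetical): a K-region merge removes K-1 regions, but code that assumes every merge is a two-region merge always subtracts 1, drifting by K-2 per merge:

```java
public class RegionCountSketch {
    // The reported behavior: NamespaceAuditor assumes every merge is 2 -> 1.
    static int buggyAfterMerge(int count, int mergedRegions) {
        return count - 1;
    }

    // K regions merge into 1, so K-1 regions disappear.
    static int fixedAfterMerge(int count, int mergedRegions) {
        return count - (mergedRegions - 1);
    }

    public static void main(String[] args) {
        // 3 regions merged into 1: the real count is 1, buggy accounting says 2.
        System.out.println("buggy=" + buggyAfterMerge(3, 3)
            + " fixed=" + fixedAfterMerge(3, 3));
    }
}
```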





[jira] [Created] (HBASE-28558) Fix constructors for sub class of Connection

2024-04-29 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28558:
-

 Summary: Fix constructors for sub class of Connection
 Key: HBASE-28558
 URL: https://issues.apache.org/jira/browse/HBASE-28558
 Project: HBase
  Issue Type: Bug
  Components: Client, test
Reporter: Duo Zhang


We still have some testing classes which implement Connection but do not have 
constructors with ConnectionRegistry as a parameter.

This leads to some test failures on branch-2.x, but we had better fix them on 
master as well.





[jira] [Resolved] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-29 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28405.
---
Resolution: Fixed

Changed split to splitRegionAsync. Pushed the addendum to all branch-2.x.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where 
> the region stays in state OPENED until the RS holding it is stopped.*
> Rollback create a TRSP as below
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, as it was the creation of the 
> procedure to close the regions that threw the exception, not its execution. 
> When we run the TRSP it sends an OpenRegionProcedure, which is handled by 
> AssignRegionHandler. This handler, on execution, reports that the region is 
> already online.*
> Sequence of events are as follow
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - 
> pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, 
> regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - 
> Initialized subprocedures=[\\{pid=26675798, ppid=26674602, state=RUNNABLE; 
> OpenRegionProcedure a92008b76ccae47d55c590930b837036, 
> server=rs-210,60020,1707596461539}]_
> _2024-02-11 10:5

[jira] [Created] (HBASE-28557) Upgrade jline to version 3.x.

2024-04-29 Thread Abhradeep Kundu (Jira)
Abhradeep Kundu created HBASE-28557:
---

 Summary: Upgrade jline to version 3.x.
 Key: HBASE-28557
 URL: https://issues.apache.org/jira/browse/HBASE-28557
 Project: HBase
  Issue Type: Improvement
Reporter: Abhradeep Kundu








[jira] [Reopened] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-29 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-28405:
---

We need to apply an addendum to branch-2.x.

We fixed a problem when rolling back MergeTableRegionsProcedure, after which 
TestNamespaceAuditor.testRegionMerge started to test what it really wants to 
test; that's why we need to change the exception type from 
DoNotRetryRegionException to DoNotRetryIOException. Before this PR, we would 
get a region state error exception, which is not what we want to test.

But the problem for branch-2 is that the behavior of the split method is a bit 
different. The split procedure will fail with quota exceeded, but the method 
will return normally, so the assertion will fail. We need to find a way to get 
the exception on branch-2.x.

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] 
> assignment.MergeTableRegionsProcedure - Error trying to merge 
> [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in 
> table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, 
> location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at 
> org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at 
> org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at 
> org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now, when we roll back the failed merge operation, we see an issue where 
> the region stays in state OPENED until the RS holding it is stopped.*
> Rollback create a TRSP as below
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - 
> Stored [pid=26674602, 
> state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; 
> TransitRegionStateProcedure table=table1, 
> region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and rollback finished successfully*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - 
> Rolled back pid=26673594, state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.HBaseIOException via 
> master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent 
> region state=MERGING, location=rs-229,60020,1707587658182, table=table1, 
> region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; 
> MergeTableRegionsProcedure table=table1, 
> regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], 
> force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036. 
> Interestingly, we didn't close the region, as it was the creation of the 
> procedure to close the regions that threw the exception, not its execution. 
> When we run the TRSP it sends an OpenRegionProcedure, which is handled by 
> AssignRegionHandler. This handler, on execution, reports that

[jira] [Created] (HBASE-28556) Reduce memory copying in Rest server when converting CellModel to Protobuf

2024-04-29 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28556:
---

 Summary: Reduce memory copying in Rest server when converting 
CellModel to Protobuf
 Key: HBASE-28556
 URL: https://issues.apache.org/jira/browse/HBASE-28556
 Project: HBase
  Issue Type: Improvement
  Components: REST
Reporter: Istvan Toth


The REST server does a lot of unnecessary copying, which could be avoided at 
least for protobuf encoding.

- It uses ByteStringer to handle ByteBuffer-backed Cells. However, it uses the 
client API, so it should never encounter ByteBuffer-backed cells.
- It clones everything from the cells (sometimes multiple times) before 
serializing to protobuf.

We could mimic the structure of Cell, with array, offset and length for each 
field, and use the appropriate protobuf setters to avoid the extra copies.

There may or may not be a way to do the same for JSON and XML via JAX-RS; I 
don't know the frameworks well enough to tell, but if not, we could just do 
the copying in the getters for them.
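The array/offset/length idea can be sketched like this; the class is illustrative, not an actual HBase REST model class:

```java
import java.util.Arrays;

public class CellFieldView {
    // A view over a shared backing array: no bytes are copied to construct it.
    final byte[] array;
    final int offset;
    final int length;

    CellFieldView(byte[] array, int offset, int length) {
        this.array = array;
        this.offset = offset;
        this.length = length;
    }

    // Copying getter kept only for serializers (e.g. JSON/XML) that need a
    // standalone byte[]; a protobuf setter could consume (array, offset,
    // length) directly and skip this copy.
    byte[] copy() {
        return Arrays.copyOfRange(array, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] backing = "rowkeyfamqual".getBytes();
        CellFieldView fam = new CellFieldView(backing, 6, 3); // views "fam", no copy
        System.out.println(new String(fam.copy()));
    }
}
```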






[jira] [Created] (HBASE-28555) TestThriftConnection is failing on branch-2

2024-04-28 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28555:
-

 Summary: TestThriftConnection is failing on branch-2
 Key: HBASE-28555
 URL: https://issues.apache.org/jira/browse/HBASE-28555
 Project: HBase
  Issue Type: Bug
  Components: Client, Thrift
Reporter: Duo Zhang
 Fix For: 2.7.0








[jira] [Resolved] (HBASE-28554) TestZooKeeperScanPolicyObserver and TestAdminShell fail 100% of times on flaky dashboard

2024-04-28 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28554.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Flaky dashboard back to normal.

Resolve.

> TestZooKeeperScanPolicyObserver and TestAdminShell fail 100% of times on 
> flaky dashboard
> 
>
> Key: HBASE-28554
> URL: https://issues.apache.org/jira/browse/HBASE-28554
> Project: HBase
>  Issue Type: Bug
>  Components: shell, test, Zookeeper
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.6.0, 3.0.0-beta-2, 2.5.9
>
>
> This is for branch-2.5+, need to figure out why before cutting any releases.





[jira] [Resolved] (HBASE-28482) Reverse scan with tags throws ArrayIndexOutOfBoundsException with DBE

2024-04-28 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28482.
---
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-2
   2.5.9
   Resolution: Fixed

Pushed to all active branches. Thanks for the follow-up fix here [~vineet.4008]!

> Reverse scan with tags throws ArrayIndexOutOfBoundsException with DBE
> -
>
> Key: HBASE-28482
> URL: https://issues.apache.org/jira/browse/HBASE-28482
> Project: HBase
>  Issue Type: Bug
>  Components: HFile
>Reporter: Vineet Kumar Maheshwari
>Assignee: Vineet Kumar Maheshwari
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> Facing ArrayIndexOutOfBoundsException when performing a reverse scan on a 
> table with 30K+ records in a single HFile.
> The exception happens when the block changes during the seekBefore call.
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at 
> org.apache.hadoop.hbase.util.ByteBufferUtils.copyFromBufferToArray(ByteBufferUtils.java:1326)
>     at org.apache.hadoop.hbase.nio.SingleByteBuff.get(SingleByteBuff.java:213)
>     at 
> org.apache.hadoop.hbase.io.encoding.DiffKeyDeltaEncoder$DiffSeekerStateBufferedEncodedSeeker.decode(DiffKeyDeltaEncoder.java:431)
>     at 
> org.apache.hadoop.hbase.io.encoding.DiffKeyDeltaEncoder$DiffSeekerStateBufferedEncodedSeeker.decodeNext(DiffKeyDeltaEncoder.java:502)
>     at 
> org.apache.hadoop.hbase.io.encoding.BufferedDataBlockEncoder$BufferedEncodedSeeker.seekToKeyInBlock(BufferedDataBlockEncoder.java:1012)
>     at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$EncodedScanner.loadBlockAndSeekToKey(HFileReaderImpl.java:1605)
>     at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekBefore(HFileReaderImpl.java:719)
>     at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekBeforeAndSaveKeyToPreviousRow(StoreFileScanner.java:645)
>     at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekToPreviousRowWithoutHint(StoreFileScanner.java:570)
>     at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekToPreviousRow(StoreFileScanner.java:506)
>     at 
> org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:126)
>     at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:693)
>     at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:151){code}
>  
> Steps to reproduce:
> Create a table with DataBlockEncoding.DIFF and block size as 1024, write some 
> 30K+ puts with setTTL, then do a reverse scan.
> {code:java}
> @Test
> public void testReverseScanWithDBEWhenCurrentBlockUpdates() throws IOException {
>   byte[] family = Bytes.toBytes("0");
>   Configuration conf = new Configuration(TEST_UTIL.getConfiguration());
>   conf.setInt(HConstants.HBASE_CLIENT_RETRIES_NUMBER, 1);
>   try (Connection connection = ConnectionFactory.createConnection(conf)) {
>     testReverseScanWithDBE(connection, DataBlockEncoding.DIFF, family, 1024, 3);
>     for (DataBlockEncoding encoding : DataBlockEncoding.values()) {
>       testReverseScanWithDBE(connection, encoding, family, 1024, 3);
>     }
>   }
> }
>
> private void testReverseScanWithDBE(Connection conn, DataBlockEncoding encoding,
>     byte[] family, int blockSize, int maxRows) throws IOException {
>   LOG.info("Running test with DBE={}", encoding);
>   TableName tableName = TableName.valueOf(TEST_NAME.getMethodName() + "-" + encoding);
>   TEST_UTIL.createTable(TableDescriptorBuilder.newBuilder(tableName)
>     .setColumnFamily(ColumnFamilyDescriptorBuilder.newBuilder(family)
>       .setDataBlockEncoding(encoding).setBlocksize(blockSize).build())
>     .build(), null);
>   Table table = conn.getTable(tableName);
>   byte[] val1 = new byte[10];
>   byte[] val2 = new byte[10];
>   Bytes.random(val1);
>   Bytes.random(val2);
>   for (int i = 0; i < maxRows; i++) {
>     table.put(new Put(Bytes.toBytes(i)).addColumn(family, Bytes.toBytes(1), val1)
>       .addColumn(family, Bytes.toBytes(2), val2).setTTL(600_000));
>   }
>   TEST_UTIL.flush(table.getName());
>   Scan scan = new Scan();
>   scan.setReversed(true);
>   try (ResultScanner scanner = table.getScanner(scan)) {
>     for (int i = maxRows - 1; i >= 0; i--) {
>       Result row = scanner.next();
>       assertEquals(2, row.size());
>       Cell cell1 = row.getColumnLatestCell(family, Bytes.toBytes(1));
>       assertTrue(CellU

[jira] [Created] (HBASE-28554) TestZooKeeperScanPolicyObserver and TestAdminShell fail 100% of times on flaky dashboard

2024-04-28 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28554:
-

 Summary: TestZooKeeperScanPolicyObserver and TestAdminShell fail 
100% of times on flaky dashboard
 Key: HBASE-28554
 URL: https://issues.apache.org/jira/browse/HBASE-28554
 Project: HBase
  Issue Type: Bug
  Components: shell, test, Zookeeper
Reporter: Duo Zhang


This affects branch-2.5+; we need to figure out why before cutting any releases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28405) Region open procedure silently returns without notifying the parent proc

2024-04-27 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28405.
---
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-2
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~mnpoonia] for contributing and [~vjasani] for reviewing!

> Region open procedure silently returns without notifying the parent proc
> 
>
> Key: HBASE-28405
> URL: https://issues.apache.org/jira/browse/HBASE-28405
> Project: HBase
>  Issue Type: Bug
>  Components: proc-v2, Region Assignment
>Affects Versions: 2.4.17, 2.5.8
>Reporter: Aman Poonia
>Assignee: Aman Poonia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> *We had a scenario in production where a merge operation had failed as below*
> _2024-02-11 10:53:57,715 ERROR [PEWorker-31] assignment.MergeTableRegionsProcedure - Error trying to merge [a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b] in table1 (in state=MERGE_TABLE_REGIONS_CLOSE_REGIONS)_
> _org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, location=rs-229,60020,1707587658182, table=table1, region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up_
> _at org.apache.hadoop.hbase.master.assignment.AssignmentManagerUtil.createUnassignProceduresForSplitOrMerge(AssignmentManagerUtil.java:120)_
> _at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.createUnassignProcedures(MergeTableRegionsProcedure.java:648)_
> _at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:205)_
> _at org.apache.hadoop.hbase.master.assignment.MergeTableRegionsProcedure.executeFromState(MergeTableRegionsProcedure.java:79)_
> _at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:188)_
> _at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:922)_
> _at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1650)_
> _at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1396)_
> _at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1000(ProcedureExecutor.java:75)_
> _at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.runProcedure(ProcedureExecutor.java:1964)_
> _at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:216)_
> _at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1991)_
> *Now when we roll back the failed merge operation, we see an issue where the
> region stays in state OPENED until the RS holding it is stopped.*
> The rollback creates a TRSP as below:
> _2024-02-11 10:53:57,719 DEBUG [PEWorker-31] procedure2.ProcedureExecutor - Stored [pid=26674602, state=RUNNABLE:REGION_STATE_TRANSITION_GET_ASSIGN_CANDIDATE; TransitRegionStateProcedure table=table1, region=a92008b76ccae47d55c590930b837036, ASSIGN]_
> *and the rollback finished successfully:*
> _2024-02-11 10:53:57,721 INFO [PEWorker-31] procedure2.ProcedureExecutor - Rolled back pid=26673594, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.HBaseIOException via master-merge-regions:org.apache.hadoop.hbase.HBaseIOException: The parent region state=MERGING, location=rs-229,60020,1707587658182, table=table1, region=f56752ae9f30fad9de5a80a8ba578e4b is currently in transition, give up; MergeTableRegionsProcedure table=table1, regions=[a92008b76ccae47d55c590930b837036, f56752ae9f30fad9de5a80a8ba578e4b], force=false exec-time=1.4820 sec_
> *We create a procedure to open the region a92008b76ccae47d55c590930b837036.
> Interestingly, we did not close the region, because it was the creation of the
> procedures to close the regions that had thrown the exception, not their
> execution. When we run the TRSP, it sends an OpenRegionProcedure, which is
> handled by AssignRegionHandler. This handler, on execution, reports that the
> region is already online.*
> The sequence of events is as follows:
> _2024-02-11 10:53:58,919 INFO [PEWorker-58] assignment.RegionStateStore - pid=26674602 updating hbase:meta row=a92008b76ccae47d55c590930b837036, regionState=OPENING, regionLocation=rs-210,60020,1707596461539_
> _2024-02-11 10:53:58,920 INFO [PEWorker-58] procedure2.ProcedureExecutor - Initialized subprocedures=[\\{pid=26675798, 

[jira] [Resolved] (HBASE-28552) Bump up bouncycastle dependency from 1.76 to 1.78

2024-04-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28552.
---
Fix Version/s: 2.6.0
   2.4.18
   3.0.0-beta-2
   2.5.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to all active branches.

Thanks [~nikitapande] for contributing!

> Bump up bouncycastle dependency from 1.76 to 1.78
> -
>
> Key: HBASE-28552
> URL: https://issues.apache.org/jira/browse/HBASE-28552
> Project: HBase
>  Issue Type: Improvement
>  Components: dependencies, security
>Reporter: Nikita Pande
>Assignee: Nikita Pande
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> Upgrade org.bouncycastle : bcprov-jdk18on from 1.76 to the latest release, 1.78.
> Refer to [org.bouncycastle|https://security.snyk.io/package/maven/org.bouncycastle:bcprov-debug-jdk18on]
> for the advisory details.





[jira] [Resolved] (HBASE-28512) Update error prone to 2.26.1

2024-04-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28512.
---
Fix Version/s: 2.6.0
   3.0.0-beta-2
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to branch-2.6+.

Thanks [~sunxin] for reviewing!

> Update error prone to 2.26.1
> 
>
> Key: HBASE-28512
> URL: https://issues.apache.org/jira/browse/HBASE-28512
> Project: HBase
>  Issue Type: Task
>  Components: dependencies
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 3.0.0-beta-2
>
>






[jira] [Created] (HBASE-28553) SSLContext not used for Kerberos auth negotiation in rest client

2024-04-25 Thread Istvan Toth (Jira)
Istvan Toth created HBASE-28553:
---

 Summary: SSLContext not used for Kerberos auth negotiation in rest 
client
 Key: HBASE-28553
 URL: https://issues.apache.org/jira/browse/HBASE-28553
 Project: HBase
  Issue Type: Bug
  Components: REST
Reporter: Istvan Toth
Assignee: Istvan Toth


The included REST client now supports specifying a trust store for SSL
connections.
However, the configured SSL library is not used when the Kerberos negotiation
is performed by the Hadoop library, which uses its own client.

We need to set up the Hadoop auth process to use the same SSLContext.
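For illustration, building a single SSLContext from a trust store so that both the REST HTTP client and the Hadoop-side negotiation can share it might look like the sketch below. This is a hedged sketch using only JDK classes; the class and method names are hypothetical and are not the actual hbase-rest client API.

```java
import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/**
 * Hypothetical sketch: build one SSLContext from a trust store so every
 * HTTPS layer (REST calls and SPNEGO/Kerberos negotiation) can share it.
 */
public class SharedSslContext {

  /** Load a JKS/PKCS12 trust store and initialize a TLS context from it. */
  static SSLContext fromTrustStore(String path, char[] password) throws Exception {
    KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
    try (FileInputStream in = new FileInputStream(path)) {
      trustStore.load(in, password);
    }
    TrustManagerFactory tmf =
        TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
    tmf.init(trustStore);
    SSLContext ctx = SSLContext.getInstance("TLS");
    ctx.init(null, tmf.getTrustManagers(), null);
    return ctx;
  }

  /** Fallback using the JVM's default trust anchors (no trust store file). */
  static SSLContext defaultContext() throws Exception {
    SSLContext ctx = SSLContext.getInstance("TLS");
    ctx.init(null, null, null); // null managers fall back to JDK defaults
    return ctx;
  }
}
```

The fix direction described above amounts to passing this one context to both clients instead of letting the Hadoop auth layer build its own.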





[jira] [Resolved] (HBASE-28436) Use connection url to specify the connection registry information

2024-04-25 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28436.
---
Resolution: Fixed

> Use connection url to specify the connection registry information
> -
>
> Key: HBASE-28436
> URL: https://issues.apache.org/jira/browse/HBASE-28436
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2
>
>
> As described in this email from [~ndimiduk]:
> https://lists.apache.org/thread/98wqlkqvlnmpx3r7yrg9mw4pqz9ppofh
> The first advantage is that we can encode the connection registry
> implementation in the scheme of the connection URL, so for replication we can
> now support cluster keys other than ZooKeeper, which is important for
> removing the ZooKeeper dependency from our public-facing APIs.
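The idea of selecting the connection registry implementation from the URL scheme can be illustrated with plain JDK URI parsing. This is a hedged, self-contained sketch: the scheme names and the RegistryKind enum are illustrative only, not HBase's actual ConnectionRegistryURIFactory wiring.

```java
import java.net.URI;

/** Illustrative sketch: pick a connection registry implementation from a URI scheme. */
public class RegistrySketch {

  /** Hypothetical registry kinds; real HBase resolves these via its own factory classes. */
  enum RegistryKind { ZOOKEEPER, RPC }

  static RegistryKind registryFor(String connectionUrl) {
    URI uri = URI.create(connectionUrl);
    switch (uri.getScheme()) {
      case "hbase+zk":  // ZooKeeper quorum carried in the authority part
        return RegistryKind.ZOOKEEPER;
      case "hbase+rpc": // bootstrap server addresses carried in the authority part
        return RegistryKind.RPC;
      default:
        throw new IllegalArgumentException("Unsupported scheme: " + uri.getScheme());
    }
  }

  public static void main(String[] args) {
    System.out.println(registryFor("hbase+zk://zk1:2181,zk2:2181/hbase")); // ZOOKEEPER
    System.out.println(registryFor("hbase+rpc://master1:16000"));          // RPC
  }
}
```

A replication peer addressed by such a URL carries its registry choice in the scheme itself, which is how the URL form avoids hard-coding a ZooKeeper cluster key.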





[jira] [Resolved] (HBASE-28518) Allow specifying a filter for the REST multiget endpoint

2024-04-25 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-28518.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed the addendum to all active branches.

Thanks [~stoty] for the quick fix.

> Allow specifying a filter for the REST multiget endpoint
> 
>
> Key: HBASE-28518
> URL: https://issues.apache.org/jira/browse/HBASE-28518
> Project: HBase
>  Issue Type: Improvement
>  Components: REST
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 3.0.0-beta-2, 2.5.9
>
>
> The native HBase API allows specifying Filters for Get operations.
> The REST interface does not currently expose this functionality.
> Add a parameter to the multiget endpoint to allow specifying filters.





[jira] [Reopened] (HBASE-28436) Use connection url to specify the connection registry information

2024-04-25 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reopened HBASE-28436:
---

Reopening to apply an addendum addressing the naming issue: we changed
ConnectionRegistryCreator to ConnectionRegistryURIFactory but forgot to rename
the subclasses...

> Use connection url to specify the connection registry information
> -
>
> Key: HBASE-28436
> URL: https://issues.apache.org/jira/browse/HBASE-28436
> Project: HBase
>  Issue Type: Sub-task
>  Components: Client
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.7.0, 3.0.0-beta-2
>
>
> As described in this email from [~ndimiduk]:
> https://lists.apache.org/thread/98wqlkqvlnmpx3r7yrg9mw4pqz9ppofh
> The first advantage is that we can encode the connection registry
> implementation in the scheme of the connection URL, so for replication we can
> now support cluster keys other than ZooKeeper, which is important for
> removing the ZooKeeper dependency from our public-facing APIs.





[jira] [Resolved] (HBASE-28517) Make properties dynamically configured

2024-04-25 Thread Peter Somogyi (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Somogyi resolved HBASE-28517.
---
Fix Version/s: 2.6.0
   2.4.18
   4.0.0-alpha-1
   3.0.0-beta-2
   2.5.9
 Release Note: 
Make the following properties dynamically configurable:
* hbase.rs.evictblocksonclose
* hbase.rs.cacheblocksonwrite
* hbase.block.data.cacheonread
   Resolution: Fixed

Merged to all active branches. Thanks, [~kabhishek4] for the contribution!

> Make properties dynamically configured
> --
>
> Key: HBASE-28517
> URL: https://issues.apache.org/jira/browse/HBASE-28517
> Project: HBase
>  Issue Type: Improvement
>Reporter: Abhishek Kothalikar
>Assignee: Abhishek Kothalikar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 4.0.0-alpha-1, 3.0.0-beta-2, 2.5.9
>
>
> Make the following properties dynamically configurable:
> * hbase.rs.evictblocksonclose
> * hbase.rs.cacheblocksonwrite
> * hbase.block.data.cacheonread
> This helps use cases where reconfiguring them at runtime improves throughput.
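Dynamically configurable properties in HBase are re-read when the configuration manager notifies registered observers, for example after an operator runs the `update_config` shell command. The sketch below is a hedged, self-contained illustration of that observer pattern; the CacheConfigLike class is hypothetical and stands in for the real CacheConfig code path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch of HBase-style dynamic configuration: observers re-read keys on reload. */
public class DynamicConfSketch {

  interface ConfigurationObserver {
    void onConfigurationChange(Map<String, String> conf);
  }

  /** Hypothetical stand-in for the block-cache config that now honors reloads. */
  static class CacheConfigLike implements ConfigurationObserver {
    volatile boolean evictBlocksOnClose;

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
      // Re-read the now-dynamic property on every reload instead of once at startup.
      evictBlocksOnClose = Boolean.parseBoolean(
          conf.getOrDefault("hbase.rs.evictblocksonclose", "false"));
    }
  }

  public static void main(String[] args) {
    List<ConfigurationObserver> observers = new ArrayList<>();
    CacheConfigLike cache = new CacheConfigLike();
    observers.add(cache);
    // Simulate an operator changing the value and triggering `update_config`.
    Map<String, String> reloaded = Map.of("hbase.rs.evictblocksonclose", "true");
    observers.forEach(o -> o.onConfigurationChange(reloaded));
    System.out.println(cache.evictBlocksOnClose); // true after the reload
  }
}
```

The throughput benefit comes from flipping such flags on a live region server without a rolling restart.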





[jira] [Created] (HBASE-28552) Update bouncycastle dependency

2024-04-25 Thread Nikita Pande (Jira)
Nikita Pande created HBASE-28552:


 Summary: Update bouncycastle dependency
 Key: HBASE-28552
 URL: https://issues.apache.org/jira/browse/HBASE-28552
 Project: HBase
  Issue Type: Improvement
Reporter: Nikita Pande


org.bouncycastle : bcprov-jdk18on is to be upgraded from 1.76 to the latest
release, 1.78.

Refer to [org.bouncycastle|https://security.snyk.io/package/maven/org.bouncycastle:bcprov-debug-jdk18on]
for the advisory details.




