[jira] [Created] (HBASE-28534) Authentication failure when running hbase-spark in local mode

2024-04-17 Thread Junegunn Choi (Jira)
Junegunn Choi created HBASE-28534:
-

 Summary: Authentication failure when running hbase-spark in local 
mode
 Key: HBASE-28534
 URL: https://issues.apache.org/jira/browse/HBASE-28534
 Project: HBase
  Issue Type: Bug
  Components: spark
Affects Versions: connector-1.0.0
Reporter: Junegunn Choi


h2. Problem

When running Spark in local mode, hbase-spark fails to authenticate to a 
Kerberos secured HBase cluster. The error message is:
{quote}No matching SASL authentication provider and supporting token found from 
providers for user: x...@xxx.xxx (auth:PROXY)
{quote}
That is because {{applyCreds}} changes the authentication method of the current 
user to {{PROXY}}, when it should still be {{KERBEROS}} for local mode to 
run correctly.
h2. Suggested solution

To fix this, I propose removing {{applyCreds}}. The function is no longer 
needed, for two reasons:

1. We should not change the authentication method of the current user in 
local mode.
2. The function no longer serves its purpose. It has not done anything 
meaningful since the broadcasting of the user credentials was removed 
in this commit:
[https://github.com/apache/hbase-connectors/commit/75e41365207408f5b47d5925469a49fd60078b5e]

A pull request is on the way.
h2. Testing

The fix was manually tested against a Kerberos-secured HBase 2.4.17 + Hadoop 
3.3.5 cluster with Spark 3.5.1, both in local mode and with the YARN master, 
by running the following Python code.
{code:python}
df = (spark.read.format("org.apache.hadoop.hbase.spark")
      .option("hbase.columns.mapping", "key STRING :key, state STRING info:state")
      .option("hbase.table", "hbase:meta").load())
df.first()
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-28533) Region split failure due to region quota limit leaves Hmaster's in memory state for the region in SPLITTING after procedure rollback

2024-04-17 Thread Daniel Roudnitsky (Jira)
Daniel Roudnitsky created HBASE-28533:
-

 Summary: Region split failure due to region quota limit leaves 
Hmaster's in memory state for the region in SPLITTING after procedure rollback
 Key: HBASE-28533
 URL: https://issues.apache.org/jira/browse/HBASE-28533
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 2.5.8
 Environment: HBase Version 2.5.8, 
r37444de6531b1bdabf2e445c83d0268ab1a6f919, Thu Feb 29 15:37:32 PST 2024
Reporter: Daniel Roudnitsky


When a SplitTableRegionProcedure is run for a region whose namespace is at its 
maximum region quota limit, the split procedure fails and rolls back, but 
HMaster's in-memory RegionStateNode for the region is left in the SPLITTING 
state. HMaster then refuses to start any subsequent merge/split/move 
procedures for that region because it believes the region is not OPEN, until 
HMaster is restarted and the in-memory record of region states is reset.

In the first step of the split procedure, SPLIT_TABLE_REGION_PREPARE, the parent 
region's RegionStateNode state is set to SPLITTING, and the transition is not 
written to the meta table. In the next step, SPLIT_TABLE_REGION_PRE_OPERATION, 
the region quota check is done, QuotaExceededException is thrown, and the 
procedure ends in the ROLLEDBACK state without reverting the RegionStateNode 
to OPEN. HMaster is left believing the region is SPLITTING according to its 
in-memory RegionStates, while the region is still online, both on the assigned 
region server and according to meta.
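
The missing revert can be illustrated with a toy model. The sketch below is not HBase code: {{SplitRollbackSketch}}, {{ToyRegionNode}} and {{runSplit}} are hypothetical names, and the quota check is reduced to a boolean flag. It only shows the shape of the problem (state set in the prepare step, exception in the next step, nothing restored on rollback) and one possible remedy, which is to restore the previous state as part of the rollback.

```java
// Toy model of the split rollback problem; every name here is hypothetical.
class SplitRollbackSketch {
  enum RegionState { OPEN, SPLITTING }

  /** Stand-in for the master's in-memory record of a single region. */
  static final class ToyRegionNode {
    RegionState state = RegionState.OPEN;
  }

  /**
   * Runs a two-step "split": prepare sets SPLITTING, then the quota check may fail.
   * With restoreOnRollback == false the node stays in SPLITTING, as in the report.
   */
  static RegionState runSplit(ToyRegionNode node, boolean quotaExceeded,
      boolean restoreOnRollback) {
    RegionState previous = node.state;
    node.state = RegionState.SPLITTING; // step 1: prepare (in memory only)
    try {
      if (quotaExceeded) { // step 2: pre-operation quota check
        throw new IllegalStateException("quota limits are exceeded");
      }
      // A successful split would continue here; the toy model stops at this point.
    } catch (IllegalStateException e) {
      if (restoreOnRollback) {
        node.state = previous; // the rollback also reverts the in-memory state
      }
    }
    return node.state;
  }

  public static void main(String[] args) {
    System.out.println(runSplit(new ToyRegionNode(), true, false)); // SPLITTING
    System.out.println(runSplit(new ToyRegionNode(), true, true));  // OPEN
  }
}
```

In this model the second call leaves the node OPEN, so a later merge or split of the same region would not be rejected.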

To reproduce in HBase shell:

{code}
> create_namespace 'test_ns', {'hbase.namespace.quota.maxregions'=> 2}
> create 'test_ns:test_table', 'f1', {NUMREGIONS => 2, SPLITALGO => 'UniformSplit'}
> region_a = 
> region_b = 

> split region_a, 'x'
# HMaster will report: 
pid=405, state=ROLLEDBACK, exception=org.apache.hadoop.hbase.quotas.QuotaExceededException via master-split-regions:org.apache.hadoop.hbase.quotas.QuotaExceededException: Region split not possible for : as quota limits are exceeded ; SplitTableRegionProcedure table=test_ns:test_table, parent=...

> merge_region region_a, region_b
ERROR: org.apache.hadoop.hbase.exceptions.MergeRegionException: org.apache.hadoop.hbase.client.DoNotRetryRegionException:  is not OPEN; state=SPLITTING

> stop_master # trigger hmaster failover 
> merge_region region_a, region_b # merge now succeeds
{code}





[jira] [Created] (HBASE-28532) remove vulnerable slf4j-log4j12 dependency

2024-04-17 Thread Nikita Pande (Jira)
Nikita Pande created HBASE-28532:


 Summary: remove vulnerable slf4j-log4j12 dependency
 Key: HBASE-28532
 URL: https://issues.apache.org/jira/browse/HBASE-28532
 Project: HBase
  Issue Type: Improvement
Reporter: Nikita Pande


slf4j-log4j12 is a bridge from SLF4J to Log4j 1.x.

Since Log4j 1.x is vulnerable, this dependency needs to be removed.

It is to be replaced with the log4j-slf4j-impl dependency, which is a bridge 
from SLF4J to Log4j 2.x.





[jira] [Resolved] (HBASE-28500) Rest Java client library assumes stateless servers

2024-04-17 Thread Istvan Toth (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Istvan Toth resolved HBASE-28500.
-
Resolution: Fixed

> Rest Java client library assumes stateless servers
> --
>
> Key: HBASE-28500
> URL: https://issues.apache.org/jira/browse/HBASE-28500
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Reporter: Istvan Toth
>Assignee: Istvan Toth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.4.18, 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.5.9
>
>
> The REST Java client library accepts a list of REST servers and does random 
> load balancing between them for each request.
> This does not work for scans, which keep state on the REST server instance.





[jira] [Created] (HBASE-28531) IndexOutOfBoundsException when executing HBCK2

2024-04-17 Thread guluo (Jira)
guluo created HBASE-28531:
-

 Summary: IndexOutOfBoundsException when executing HBCK2 
 Key: HBASE-28531
 URL: https://issues.apache.org/jira/browse/HBASE-28531
 Project: HBase
  Issue Type: Bug
  Components: hbck2
 Environment: hbck master
hbase master
Reporter: guluo


Reproduction

Execute the following command:
{code:java}
${HBASE_HOME}/bin/hbase --config /etc/hbase-conf hbck -j ~/hbase-operator-tools/hbase-hbck2/target/hbase-hbck2-1.0.0-SNAPSHOT.jar
{code}
This fails with an IndexOutOfBoundsException, as follows:
{code:java}
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -1
    at java.lang.String.substring(String.java:1967)
    at org.apache.logging.log4j.util.PropertiesUtil.partitionOnCommonPrefixes(PropertiesUtil.java:555)
    at org.apache.logging.log4j.core.config.properties.PropertiesConfigurationBuilder.build(PropertiesConfigurationBuilder.java:174)
    at org.apache.logging.log4j.core.config.properties.PropertiesConfigurationFactory.getConfiguration(PropertiesConfigurationFactory.java:56)
    at org.apache.logging.log4j.core.config.properties.PropertiesConfigurationFactory.getConfiguration(PropertiesConfigurationFactory.java:35)
    at org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:557)
    at org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:481)
    at org.apache.logging.log4j.core.config.ConfigurationFactory.getConfiguration(ConfigurationFactory.java:323)
    at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:695)
    at org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:716)
    at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:270)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:155)
    at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:47)
    at org.apache.logging.log4j.LogManager.getContext(LogManager.java:196)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:137)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:55)
    at org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:47)
    at org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:33)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358)
    at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383)
    at org.apache.hbase.HBCK2.(HBCK2.java:92)
{code}
Cause

The current version of HBase uses Log4j2 2.17.2, which supports the 
shorthand syntax for properties configuration (LOG4J2-3341: 
https://issues.apache.org/jira/browse/LOG4J2-3341).

However, the current version of HBCK2 uses Log4j2 2.17.1, which does 
not support this feature.

So HBCK2 throws an IndexOutOfBoundsException when it reads Log4j2 properties 
such as the following, which is the default log configuration format for HBase:
logger.http = INFO,NullAppender

To avoid this problem, I think we need to bump Log4j2 from 2.17.1 to 
2.17.2 in HBCK2.
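
The stack trace points at a {{substring}} call inside a prefix-partitioning step. A plausible reading, sketched below with plain JDK classes, is that the code assumes every remaining key contains a '.', so a shorthand key such as {{http}} (what is left of {{logger.http}} once the {{logger.}} prefix is stripped) yields an index of -1. This is a simplified model and not the Log4j source: {{naivePartition}} and {{tolerantPartition}} are hypothetical names, and the tolerant variant is only one way such keys could be accepted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Simplified model of the failure; not the Log4j implementation.
class PrefixPartitionSketch {
  /** Groups keys by the text before the first '.', assuming every key has one. */
  static Map<String, Properties> naivePartition(Properties props) {
    Map<String, Properties> parts = new HashMap<>();
    for (String key : props.stringPropertyNames()) {
      int dot = key.indexOf('.');            // -1 when the key has no '.'
      String prefix = key.substring(0, dot); // throws when dot == -1
      parts.computeIfAbsent(prefix, k -> new Properties())
          .setProperty(key.substring(dot + 1), props.getProperty(key));
    }
    return parts;
  }

  /** Same grouping, but a key without '.' forms its own group under an empty sub-key. */
  static Map<String, Properties> tolerantPartition(Properties props) {
    Map<String, Properties> parts = new HashMap<>();
    for (String key : props.stringPropertyNames()) {
      int dot = key.indexOf('.');
      String prefix = dot < 0 ? key : key.substring(0, dot);
      String rest = dot < 0 ? "" : key.substring(dot + 1);
      parts.computeIfAbsent(prefix, k -> new Properties())
          .setProperty(rest, props.getProperty(key));
    }
    return parts;
  }

  public static void main(String[] args) {
    Properties loggers = new Properties();
    loggers.setProperty("http", "INFO,NullAppender"); // shorthand form, no '.' left
    try {
      naivePartition(loggers);
    } catch (StringIndexOutOfBoundsException e) {
      System.out.println("naive partition failed");
    }
    System.out.println(tolerantPartition(loggers).keySet());
  }
}
```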
 





[jira] [Created] (HBASE-28530) Better not use threads when parallel seek enabled and only one storescanner to seek

2024-04-17 Thread Rajeshbabu Chintaguntla (Jira)
Rajeshbabu Chintaguntla created HBASE-28530:
---

 Summary: Better not use threads when parallel seek enabled and 
only one storescanner to seek
 Key: HBASE-28530
 URL: https://issues.apache.org/jira/browse/HBASE-28530
 Project: HBase
  Issue Type: Improvement
Reporter: Rajeshbabu Chintaguntla
Assignee: Rajeshbabu Chintaguntla


When parallel seek is enabled, we seek through the scanners using multiple 
threads and wait on a countdown latch until the seek completes on all the 
scanners. It would be better not to use threads when there is only one scanner 
to seek. This might not be a significant improvement, but it will be useful 
when a region has one store file after a major compaction.

{code:java}
  private void parallelSeek(final List<? extends KeyValueScanner> scanners, final Cell kv)
    throws IOException {
    if (scanners.isEmpty()) return;
    int storeFileScannerCount = scanners.size();
    // One latch count per scanner, whether it is sought on a thread or inline.
    CountDownLatch latch = new CountDownLatch(storeFileScannerCount);
    List<ParallelSeekHandler> handlers = new ArrayList<>(storeFileScannerCount);
    for (KeyValueScanner scanner : scanners) {
      if (scanner instanceof StoreFileScanner) {
        // Store file scanners are handed to the executor, even when there is only one.
        ParallelSeekHandler seekHandler = new ParallelSeekHandler(scanner, kv, this.readPt, latch);
        executor.submit(seekHandler);
        handlers.add(seekHandler);
      } else {
        // Other scanners are sought inline on the calling thread.
        scanner.seek(kv);
        latch.countDown();
      }
    }

    try {
      latch.await();
    } catch (InterruptedException ie) {
      throw (InterruptedIOException) new InterruptedIOException().initCause(ie);
    }
{code}
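
The proposed shortcut could look roughly like the sketch below. It is a self-contained illustration with plain JDK classes, not a patch against HBase: {{ToyScanner}} and {{seekAll}} are hypothetical stand-ins for KeyValueScanner and parallelSeek, and the return value (the number of seeks handed to the executor) exists only to make the behaviour observable.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

// Illustration only; ToyScanner and seekAll are hypothetical names.
class SingleScannerSeekSketch {
  interface ToyScanner {
    void seek(String key) throws IOException;
  }

  /** Seeks all scanners and returns how many seeks were handed to the executor. */
  static int seekAll(List<ToyScanner> scanners, String key, ExecutorService executor)
      throws IOException {
    if (scanners.isEmpty()) return 0;
    if (scanners.size() == 1) {
      // Only one scanner: seek on the calling thread, no latch and no hand-off.
      scanners.get(0).seek(key);
      return 0;
    }
    CountDownLatch latch = new CountDownLatch(scanners.size());
    AtomicReference<IOException> firstError = new AtomicReference<>();
    for (ToyScanner scanner : scanners) {
      executor.execute(() -> {
        try {
          scanner.seek(key);
        } catch (IOException e) {
          firstError.compareAndSet(null, e); // remember the first failure
        } finally {
          latch.countDown();
        }
      });
    }
    try {
      latch.await();
    } catch (InterruptedException ie) {
      throw (InterruptedIOException) new InterruptedIOException().initCause(ie);
    }
    if (firstError.get() != null) throw firstError.get();
    return scanners.size();
  }

  public static void main(String[] args) throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      ToyScanner noop = k -> { };
      System.out.println(seekAll(Collections.singletonList(noop), "row1", pool)); // 0
      System.out.println(seekAll(Arrays.asList(noop, noop), "row1", pool));       // 2
    } finally {
      pool.shutdown();
    }
  }
}
```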






[jira] [Created] (HBASE-28529) Use ZKClientConfig instead of system properties when setting zookeeper configurations

2024-04-17 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-28529:
-

 Summary: Use ZKClientConfig instead of system properties when 
setting zookeeper configurations
 Key: HBASE-28529
 URL: https://issues.apache.org/jira/browse/HBASE-28529
 Project: HBase
  Issue Type: Improvement
Reporter: Duo Zhang


In HBASE-28340, we allowed loading ZooKeeper configurations from the HBase 
configuration, but we then use system properties to pass these parameters when 
creating the ZooKeeper client.

For replication, we may want to use ZooKeeper configurations that differ from 
the ones used to start this HBase cluster, so passing these parameters through 
system properties is not suitable.

We should use ZKClientConfig to pass these settings instead.
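
The limitation of system properties can be seen in a small JDK-only sketch. It does not use the ZooKeeper API: {{ToyClient}} and the key {{toy.zk.secure}} are hypothetical, and the per-client {{Properties}} object merely plays the role that a per-client config object such as ZKClientConfig would play. A system property is JVM-wide, so two clients in the same process cannot disagree on it, whereas a config object passed to each client can carry different values.

```java
import java.util.Properties;

// Illustration only; ToyClient and the property key are hypothetical.
class PerClientConfigSketch {
  static final String KEY = "toy.zk.secure";

  static final class ToyClient {
    private final Properties config;

    ToyClient(Properties config) {
      this.config = config;
    }

    /** The per-client value wins; the JVM-wide system property is only a fallback. */
    String setting() {
      return config.getProperty(KEY, System.getProperty(KEY, "false"));
    }
  }

  static Properties configOf(String value) {
    Properties p = new Properties();
    p.setProperty(KEY, value);
    return p;
  }

  public static void main(String[] args) {
    System.setProperty(KEY, "true"); // visible to every client in this JVM
    ToyClient local = new ToyClient(new Properties()); // relies on the system property
    ToyClient peer = new ToyClient(configOf("false")); // carries its own setting
    System.out.println(local.setting()); // true
    System.out.println(peer.setting());  // false
  }
}
```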


