[jira] [Resolved] (HADOOP-18962) Upgrade kafka to 3.4.0

2024-05-24 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18962.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> Upgrade kafka to 3.4.0
> --
>
> Key: HADOOP-18962
> URL: https://issues.apache.org/jira/browse/HADOOP-18962
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: D M Murali Krishna Reddy
>Assignee: D M Murali Krishna Reddy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Upgrade kafka-clients to 3.4.0 to fix 
> https://nvd.nist.gov/vuln/detail/CVE-2023-25194






[jira] [Created] (HADOOP-19186) Change loglevel to ERROR/WARNING so that it would be easy to identify the problem without ignoring it

2024-05-24 Thread Srinivasu Majeti (Jira)
Srinivasu Majeti created HADOOP-19186:
-

 Summary: Change loglevel to ERROR/WARNING so that it would be easy to 
identify the problem without ignoring it
 Key: HADOOP-19186
 URL: https://issues.apache.org/jira/browse/HADOOP-19186
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Srinivasu Majeti


On a new host with Java version 11, the DN was not able to communicate with 
the NN. We enabled DEBUG logging for the DN and the message below was logged 
at DEBUG level.

DEBUG org.apache.hadoop.security.UserGroupInformation: 
PrivilegedActionException as:hdfs/av3l704p.bigdata.it.internal@PRODUCTION.LOCAL 
(auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed 
[Caused by GSSException: No valid credentials provided (Mechanism level: 
Receive timed out)]

Without DEBUG-level logging, this showed up only as the following WARNING:

WARN org.apache.hadoop.ipc.Client: Couldn't setup connection for 
hdfs/av3l704p.bigdata.it.internal@PRODUCTION.LOCAL to 
avl2785p.bigdata.it.internal/172.24.178.32:8022
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Receive timed out)]

A considerable amount of time was spent troubleshooting this issue, as the 
exception was logged at DEBUG level, which made it difficult to track in the logs.

Can we have such critical failures logged at the WARN/ERROR level so that they 
are not missed when DEBUG-level logging is not enabled for datanodes?
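
For illustration, a minimal sketch of the requested behaviour (hypothetical names, 
not the actual org.apache.hadoop.ipc.Client code): surface the SASL failure's root 
cause at WARN, independent of the logger's DEBUG setting.

{code:java}
// Hypothetical sketch: LOG is an org.slf4j.Logger; principal, server and
// saslException are illustrative variables standing in for the real ones.
LOG.warn("Couldn't setup connection for {} to {}: GSS initiate failed",
    principal, server, saslException);  // Throwable last: stack trace is logged
if (LOG.isDebugEnabled()) {
  LOG.debug("PrivilegedActionException as:{} (auth:KERBEROS)",
      principal, saslException);
}
{code}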






[jira] [Resolved] (HADOOP-19168) Upgrade Kafka Clients due to CVEs

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19168.
-
Resolution: Duplicate

rohit, dupe of HADOOP-18962. let's focus on that

> Upgrade Kafka Clients due to CVEs
> -
>
> Key: HADOOP-19168
> URL: https://issues.apache.org/jira/browse/HADOOP-19168
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Rohit Kumar
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade Kafka Clients due to CVEs
> CVE-2023-25194:- Affected versions of this package are vulnerable to 
> Deserialization of Untrusted Data when there are gadgets in the 
> {{{}classpath{}}}. The server will connect to the attacker's LDAP server and 
> deserialize the LDAP response, which the attacker can use to execute java 
> deserialization gadget chains on the Kafka connect server.
> CVSS Score: 8.8 (High)
> [https://nvd.nist.gov/vuln/detail/CVE-2023-25194] 
> CVE-2021-38153
> CVE-2018-17196
> Insufficient Entropy
> [https://security.snyk.io/package/maven/org.apache.kafka:kafka-clients] 
> Upgrade Kafka-Clients to 3.4.0 or higher.






[jira] [Resolved] (HADOOP-19182) Upgrade kafka to 3.4.0

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19182.
-
Resolution: Duplicate

> Upgrade kafka to 3.4.0
> --
>
> Key: HADOOP-19182
> URL: https://issues.apache.org/jira/browse/HADOOP-19182
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: fuchaohong
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade kafka to 3.4.0 to resolve CVE-2023-25194






[jira] [Created] (HADOOP-19185) Improve ABFS metric integration with iOStatistics

2024-05-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19185:
---

 Summary: Improve ABFS metric integration with iOStatistics
 Key: HADOOP-19185
 URL: https://issues.apache.org/jira/browse/HADOOP-19185
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Reporter: Steve Loughran


Followup to HADOOP-18325 covering the outstanding comments of

https://github.com/apache/hadoop/pull/6314/files








[jira] [Resolved] (HADOOP-18325) ABFS: Add correlated metric support for ABFS operations

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18325.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> ABFS: Add correlated metric support for ABFS operations
> ---
>
> Key: HADOOP-18325
> URL: https://issues.apache.org/jira/browse/HADOOP-18325
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.3
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Add metrics related to a particular job, covering the total number of 
> requests, retried requests, retry counts, and more.






[jira] [Created] (HADOOP-19184) TestStagingCommitter.testJobCommitFailure failing

2024-05-22 Thread Mukund Thakur (Jira)
Mukund Thakur created HADOOP-19184:
--

 Summary: TestStagingCommitter.testJobCommitFailure failing 
 Key: HADOOP-19184
 URL: https://issues.apache.org/jira/browse/HADOOP-19184
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mukund Thakur
Assignee: Mukund Thakur


[INFO] [ERROR] Failures: [ERROR] TestStagingCommitter.testJobCommitFailure:662 
[Committed objects compared to deleted paths 
org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase$ClientResults@2de1acf4\{
 requests=12, uploads=12, parts=12, tagsByUpload=12, commits=5, aborts=7, 
deletes=0}] Expecting: 
<["s3a://bucket-name/output/path/r_0_0_c055250c-58c7-47ea-8b14-215cb5462e89", 
"s3a://bucket-name/output/path/r_1_1_9111aa65-96c2-465c-b278-696aff7707e3", 
"s3a://bucket-name/output/path/r_0_0_dec7f398-ee4e-4a53-a783-6b72cead569a", 
"s3a://bucket-name/output/path/r_1_1_39ad0eba-1053-4217-aa63-ddc8edfa7c64", 
"s3a://bucket-name/output/path/r_0_0_6c0518f6-7c1b-418f-a3e4-7db568880e6a"]> to 
contain exactly in any order: <[]> but the following elements were unexpected: 
<["s3a://bucket-name/output/path/r_0_0_c055250c-58c7-47ea-8b14-215cb5462e89", 
"s3a://bucket-name/output/path/r_1_1_9111aa65-96c2-465c-b278-696aff7707e3", 
"s3a://bucket-name/output/path/r_0_0_dec7f398-ee4e-4a53-a783-6b72cead569a", 
"s3a://bucket-name/output/path/r_1_1_39ad0eba-1053-4217-aa63-ddc8edfa7c64", 
"s3a://bucket-name/output/path/r_0_0_6c0518f6-7c1b-418f-a3e4-7db568880e6a"]>






[jira] [Created] (HADOOP-19183) RBF: Support leader follower mode for multiple subclusters

2024-05-22 Thread Yuanbo Liu (Jira)
Yuanbo Liu created HADOOP-19183:
---

 Summary: RBF: Support leader follower mode for multiple subclusters
 Key: HADOOP-19183
 URL: https://issues.apache.org/jira/browse/HADOOP-19183
 Project: Hadoop Common
  Issue Type: Improvement
  Components: RBF
Reporter: Yuanbo Liu


Currently there are five modes for multiple subclusters: 
HASH, LOCAL, RANDOM, HASH_ALL, SPACE.

We propose a new mode called leader/follower. Routers try to write to the leader 
subcluster as much as possible. When routers read data, the leader subcluster is 
ranked first.






[jira] [Created] (HADOOP-19182) Upgrade kafka to 3.4.0

2024-05-22 Thread fuchaohong (Jira)
fuchaohong created HADOOP-19182:
---

 Summary: Upgrade kafka to 3.4.0
 Key: HADOOP-19182
 URL: https://issues.apache.org/jira/browse/HADOOP-19182
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: fuchaohong









[jira] [Resolved] (HADOOP-19163) Upgrade protobuf version to 3.25.3

2024-05-21 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19163.
-
Resolution: Fixed

done. not sure what version to tag with.

Proposed: we cut a new release of this

> Upgrade protobuf version to 3.25.3
> --
>
> Key: HADOOP-19163
> URL: https://issues.apache.org/jira/browse/HADOOP-19163
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hadoop-thirdparty
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>







[jira] [Resolved] (HADOOP-13147) Constructors must not call overrideable methods in PureJavaCrc32C

2024-05-20 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-13147.
---
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Constructors must not call overrideable methods in PureJavaCrc32C
> -
>
> Key: HADOOP-13147
> URL: https://issues.apache.org/jira/browse/HADOOP-13147
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 2.0.6-alpha
> Environment: 
> http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
>Reporter: Sebb
>Assignee: Sebb
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Constructors must not call overrideable methods.
> An object is not guaranteed fully constructed until the constructor exits, so 
> the subclass override may not see the fully created parent object.
> This applies to:
> PureJavaCrc32
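
For illustration, a self-contained sketch of the pitfall with hypothetical classes 
(not the PureJavaCrc32C code):

{code:java}
// The subclass override runs while Base() executes, before Sub's own
// field initializers have run, so it observes a half-constructed object.
class Base {
  Base() {
    init();  // BAD: calling an overrideable method from a constructor
  }
  void init() { }
}

class Sub extends Base {
  private final int[] table = new int[256];
  @Override
  void init() {
    // Invoked from Base()'s constructor; 'table' has not been assigned yet.
    System.out.println("table initialized? " + (table != null));
  }
}

public class ConstructorPitfall {
  public static void main(String[] args) {
    new Sub();  // prints "table initialized? false"
  }
}
{code}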






[jira] [Created] (HADOOP-19181) IAMCredentialsProvider throttle failures

2024-05-20 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19181:
---

 Summary: IAMCredentialsProvider throttle failures
 Key: HADOOP-19181
 URL: https://issues.apache.org/jira/browse/HADOOP-19181
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Tests report throttling errors in IAM being remapped to no-auth failures.

Again, Impala tests, but with multiple processes on the same host. This means that 
HADOOP-18945 isn't sufficient, as even if it ensures a singleton instance for a 
process:
* it doesn't help if there are many test buckets (fixable)
* it doesn't work across processes (not fixable)

we may be able to:
* use a singleton across all filesystem instances
* once we know how throttling is reported, handle it through retries + 
error/stats collection


{code}
2024-02-17T18:02:10,175  WARN [TThreadPoolServer WorkerProcess-22] 
fs.FileSystem: Failed to initialize fileystem 
s3a://impala-test-uswest2-1/test-warehouse/test_num_values_def_levels_mismatch_15b31ddb.db/too_many_def_levels:
 java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2024-02-17T18:02:10,175 ERROR [TThreadPoolServer WorkerProcess-22] 
utils.MetaStoreUtils: Got exception: java.nio.file.AccessDeniedException 
impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No 
AWS Credentials provided by TemporaryAWSCredentialsProvider 
SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.maybeTranslateCredentialException(AWSCredentialProviderList.java:351)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:201) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$2(S3AFileSystem.java:972)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2748)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:970)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.doBucketProbing(S3AFileSystem.java:859) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:715) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3452) 
~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.FileSystem.access

[jira] [Resolved] (HADOOP-19167) Change of Codec configuration does not work

2024-05-16 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu resolved HADOOP-19167.
---
Fix Version/s: 3.5.0
   Resolution: Fixed

> Change of Codec configuration does not work
> ---
>
> Key: HADOOP-19167
> URL: https://issues.apache.org/jira/browse/HADOOP-19167
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: compress
>Reporter: Zhikai Hu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> In one of my projects, I need to dynamically adjust the compression level for 
> different files. 
> However, I found that in most cases the new compression level does not take 
> effect as expected; the old compression level continues to be used.
> Here is the relevant code snippet:
> ZStandardCodec zStandardCodec = new ZStandardCodec();
> zStandardCodec.setConf(conf);
> conf.set("io.compression.codec.zstd.level", "5"); // level may change dynamically
> conf.set("io.compression.codec.zstd", zStandardCodec.getClass().getName());
> writer = SequenceFile.createWriter(conf,
>     SequenceFile.Writer.file(sequenceFilePath),
>     SequenceFile.Writer.keyClass(LongWritable.class),
>     SequenceFile.Writer.valueClass(BytesWritable.class),
>     SequenceFile.Writer.compression(CompressionType.BLOCK));
> The reason is that the SequenceFile.Writer.init() method calls 
> CodecPool.getCompressor(codec, null) to get a compressor. 
> If the compressor is a reused instance, the conf is not applied because it is 
> passed as null:
> public static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
>   Compressor compressor = borrow(compressorPool, codec.getCompressorType());
>   if (compressor == null) {
>     compressor = codec.createCompressor();
>     LOG.info("Got brand-new compressor ["+codec.getDefaultExtension()+"]");
>   } else {
>     compressor.reinit(conf);   // conf is null here
>     ..
> Please also refer to my unit test to reproduce the bug. 
> To address this bug, I modified the code to ensure that the configuration is 
> read back from the codec when a compressor is reused.
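
A minimal sketch of the fix idea described above (not the exact committed patch), 
mirroring the quoted CodecPool method and assuming the codec implements 
org.apache.hadoop.conf.Configurable, as ZStandardCodec does via setConf()/getConf():

{code:java}
public static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
  Compressor compressor = borrow(compressorPool, codec.getCompressorType());
  if (compressor == null) {
    compressor = codec.createCompressor();
    LOG.info("Got brand-new compressor [" + codec.getDefaultExtension() + "]");
  } else {
    Configuration effective = conf;
    if (effective == null && codec instanceof Configurable) {
      // Read the configuration back from the codec when a compressor is reused,
      // so a dynamically changed compression level takes effect.
      effective = ((Configurable) codec).getConf();
    }
    compressor.reinit(effective);
  }
  return compressor;
}
{code}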






[jira] [Resolved] (HADOOP-18759) [ABFS][Backoff-Optimization] Have a Static retry policy for connection timeout failures

2024-05-16 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi resolved HADOOP-18759.

   Fix Version/s: 3.4.1
  (was: 3.5.0)
Release Note: https://github.com/apache/hadoop/pull/5881
Target Version/s: 3.4.0  (was: 3.3.4)
  Resolution: Fixed

[Hadoop-18759: [ABFS][Backoff-Optimization] Have a Static retry policy for 
connection timeout. by anujmodi2021 · Pull Request #5881 · apache/hadoop 
(github.com)|https://github.com/apache/hadoop/pull/5881]

> [ABFS][Backoff-Optimization] Have a Static retry policy for connection 
> timeout failures
> ---
>
> Key: HADOOP-18759
> URL: https://issues.apache.org/jira/browse/HADOOP-18759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.4
>Reporter: Anuj Modi
>Assignee: Anuj Modi
>Priority: Major
> Fix For: 3.4.1
>
>
> Today when a request fails with connection timeout, it falls back into the 
> loop for exponential retry. Unlike Azure Storage, there are no guarantees of 
> success on exponentially retried request or recommendations for ideal retry 
> policies for Azure network or any other general failures. Faster failure and 
> retry might be more beneficial for such generic connection timeout failures. 
> This PR introduces a new Static Retry Policy which will currently be used 
> only for Connection Timeout failures. It means all the requests failing with 
> Connection Timeout errors will be retried after a constant retry(sleep) 
> interval independent of how many times that request has failed. Max Retry 
> Count check will still be in place.
> Following Configurations will be introduced in the change:
>  # "fs.azure.static.retry.for.connection.timeout.enabled" - default: true, 
> true: static retry will be used for CT, false: Exponential retry will be used.
>  # "fs.azure.static.retry.interval" - default: 1000ms.
> This also introduces a new field in x-ms-client-request-id, only for the 
> requests that are being retried after a connection timeout failure. The new 
> field will tell what retry policy was used to get the sleep interval before 
> making this request.
> The header "x-ms-client-request-id" right now carries only the retryCount and 
> retryReason for this particular API call. For example:  
> :eb06d8f6-5693-461b-b63c-5858fa7655e6:29cb0d19-2b68-4409-bc35-cb7160b90dd8:::CF:1_CT.
> Moving ahead, for retryReason "CT" it will have the retry policy abbreviation as 
> well. For example:  
> :eb06d8f6-5693-461b-b63c-5858fa7655e6:29cb0d19-2b68-4409-bc35-cb7160b90dd8:::CF:1_CT_E.
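
For illustration, a minimal sketch of toggling the two configurations listed above 
through the Hadoop Configuration API; the values shown are the defaults quoted in 
this issue:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AbfsStaticRetryConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // true (default): static retry for connection timeouts;
    // false: fall back to exponential retry.
    conf.setBoolean("fs.azure.static.retry.for.connection.timeout.enabled", true);
    // Constant sleep interval between connection-timeout retries, in milliseconds.
    conf.set("fs.azure.static.retry.interval", "1000");
  }
}
{code}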






[jira] [Resolved] (HADOOP-18011) ABFS: Enable config control for default connection timeout

2024-05-16 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi resolved HADOOP-18011.

Fix Version/s: 3.4.1
 Hadoop Flags: Reviewed
 Release Note: https://github.com/apache/hadoop/pull/5881
   Resolution: Fixed

PR checked in: [Hadoop-18759: [ABFS][Backoff-Optimization] Have a Static retry 
policy for connection timeout. by anujmodi2021 · Pull Request #5881 · 
apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/5881]

> ABFS: Enable config control for default connection timeout 
> ---
>
> Key: HADOOP-18011
> URL: https://issues.apache.org/jira/browse/HADOOP-18011
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.1
>Reporter: Sneha Vijayarajan
>Assignee: Sneha Vijayarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ABFS driver has a default connection timeout and read timeout value of 30 
> secs. For jobs that are time sensitive, preference would be quick failure and 
> have shorter HTTP connection and read timeout. 
> This Jira is created to enable config control over the default connection and 
> read timeouts. 
> New config names:
> fs.azure.http.connection.timeout
> fs.azure.http.read.timeout
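
For illustration, a minimal sketch of setting the two new configs via the Hadoop 
Configuration API; the values and their units (milliseconds) are assumptions for 
the example, not documented defaults:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AbfsTimeoutConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Fail fast instead of waiting out the 30s defaults mentioned above.
    conf.set("fs.azure.http.connection.timeout", "5000");
    conf.set("fs.azure.http.read.timeout", "10000");
  }
}
{code}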






[jira] [Resolved] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720

2024-05-16 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19172.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> Upgrade aws-java-sdk to 1.12.720
> 
>
> Key: HADOOP-19172
> URL: https://issues.apache.org/jira/browse/HADOOP-19172
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Update to the latest AWS SDK, to stop anyone worrying about the ion library 
> CVE https://nvd.nist.gov/vuln/detail/CVE-2024-21634
> This isn't exposed in the s3a client, but may be used downstream. 
> on v2 SDK releases, the v1 SDK is only used during builds; on 3.3.x it is shipped






[jira] [Resolved] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager

2024-05-15 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen resolved HADOOP-18851.
-
Resolution: Fixed

> Performance improvement for DelegationTokenSecretManager
> 
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue, and the following are our findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> 199 out of the 200 threads were blocked at the above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked all the rest. But the 
> following is my observation:
>  
>  # verifyToken() and createPassword() have been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> t

[jira] [Created] (HADOOP-19179) ABFS: Support FNS Accounts over BlobEndpoint

2024-05-15 Thread Sneha Vijayarajan (Jira)
Sneha Vijayarajan created HADOOP-19179:
--

 Summary: ABFS: Support FNS Accounts over BlobEndpoint
 Key: HADOOP-19179
 URL: https://issues.apache.org/jira/browse/HADOOP-19179
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.4.0
Reporter: Sneha Vijayarajan
Assignee: Sneha Vijayarajan
 Fix For: 3.5.0, 3.4.1


As a pre-requisite to deprecating the WASB driver, the ABFS driver will need to 
match the FNS account support provided by the WASB driver. This will give 
customers still using the legacy driver an official migration path to the ABFS 
driver. 

 

Parent Jira for WASB deprecation: [HADOOP-19178] WASB Driver Deprecation and 
eventual removal - ASF JIRA (apache.org)






[jira] [Created] (HADOOP-19178) WASB Driver Deprecation and eventual removal

2024-05-15 Thread Sneha Vijayarajan (Jira)
Sneha Vijayarajan created HADOOP-19178:
--

 Summary: WASB Driver Deprecation and eventual removal
 Key: HADOOP-19178
 URL: https://issues.apache.org/jira/browse/HADOOP-19178
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.4.0
Reporter: Sneha Vijayarajan
Assignee: Sneha Vijayarajan
 Fix For: 3.4.1


*WASB Driver*

The WASB driver was developed to support FNS (FlatNameSpace) Azure Storage 
accounts. FNS accounts do not honor file/folder syntax, so HDFS folder operations 
are mimicked on the client side by the WASB driver, and certain folder operations 
like rename and delete can lead to a lot of IOPS with client-side enumeration and 
blob-by-blob orchestration of the rename/delete operation. It was not ideal for 
other APIs either, as the initial check for whether a path is a file or a folder 
needs to be done over multiple metadata calls. This led to degraded performance.

 

To provide better service to analytics customers, Microsoft released ADLS Gen2, 
which is HNS (Hierarchical Namespace) enabled, i.e. a file/folder-aware store. The 
ABFS driver was designed to overcome the inherent deficiencies of WASB, and 
customers were informed to migrate to the ABFS driver.

 

*Customers who still use the legacy WASB driver and the challenges they face* 

Some of our customers have not migrated to the ABFS driver yet and continue to 
use the legacy WASB driver with FNS accounts.  

These customers face the following challenges: 
 *  They cannot leverage the optimizations and benefits of the ABFS driver.
 *  They need to deal with compatibility issues should files and folders be 
modified with the legacy WASB driver and the ABFS driver concurrently in a 
phased transition situation.
 *  There are differences in the features supported for FNS and HNS over the 
ABFS driver.
 *  In certain cases, they must perform a significant amount of re-work on 
their workloads to migrate to the ABFS driver, which is available only on HNS 
enabled accounts in a fully tested and supported scenario.

*Deprecation plans for WASB* 

We are introducing a new feature that will enable the ABFS driver to support 
FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will 
enable customers to use the ABFS driver to interact with data stored in GPv2 
(General Purpose v2) storage accounts. 

With this feature, the customers who still use the legacy WASB driver will be 
able to migrate to the ABFS driver without much re-work on their workloads. 
They will however need to change the URIs from the WASB scheme to the ABFS 
scheme. 

Once the ABFS driver has FNS support in place to migrate WASB customers, the WASB 
driver will be declared deprecated in OSS documentation and marked for removal in 
the next major release. This will remove any ambiguity for new customer 
onboarding, as there will be only one Microsoft driver for Azure Storage, and 
migrating customers will get SLA-bound support for driver and service, which was 
not guaranteed with WASB.

 We anticipate that this feature will serve as a stepping stone for customers 
to move to HNS enabled accounts with the ABFS driver, which is our recommended 
stack for big data analytics on ADLS Gen2. 

*Any impact for existing customers who are using ADLS Gen2 (HNS enabled account) 
with the ABFS driver?*

This feature does not impact the existing customers who are using ADLS Gen2 
(HNS enabled account) with ABFS driver. 

They do not need to make any changes to their workloads or configurations. They 
will still enjoy the benefits of HNS, such as atomic operations, fine-grained 
access control, scalability, and performance. 

*Official recommendation*

Microsoft continues to recommend that all big data and analytics customers use 
Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver, and will continue to 
optimize this scenario in future. We believe that this new option will help all 
those customers transition to a supported scenario immediately, while they 
plan to ultimately move to ADLS Gen2 (HNS enabled account).

 *New authentication options that a customer migrating from WASB to the ABFS 
driver will get*

The auth types below that WASB provides will continue to work on the new 
FNS-over-ABFS driver, via configuration that accepts these SAS types (similar to 
WASB):
 * SharedKey
 * Account SAS
 * Service/Container SAS

The authentication types below, which were not supported by the WASB driver but 
are supported by the ABFS driver, will continue to be available for the new 
FNS-over-ABFS driver:
 * OAuth 2.0 Client Credentials
 * OAuth 2.0: Refresh Token
 * Azure Managed Identity
 * Custom OAuth 2.0 Token Provider

 

The ABFS driver SAS token provider plugin present today for User Delegation SAS 
and Directory SAS will continue to work only for HNS accounts.




[jira] [Resolved] (HADOOP-19013) fs.getXattrs(path) for S3FS doesn't have x-amz-server-side-encryption-aws-kms-key-id header.

2024-05-15 Thread Mukund Thakur (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukund Thakur resolved HADOOP-19013.

Resolution: Fixed

> fs.getXattrs(path) for S3FS doesn't have 
> x-amz-server-side-encryption-aws-kms-key-id header.
> 
>
> Key: HADOOP-19013
> URL: https://issues.apache.org/jira/browse/HADOOP-19013
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Mukund Thakur
>Assignee: Mukund Thakur
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> Once a path has been encrypted during upload with SSE-KMS and a key id, and we 
> later read the attributes of the same file, the key id information is not 
> included as an attribute. Should we add it?
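
For illustration, a minimal sketch of the read path being discussed; the bucket 
and object path are assumptions:

{code:java}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class XAttrListingExample {
  public static void main(String[] args) throws Exception {
    // List the xattrs S3A synthesizes from object headers; per this issue the
    // x-amz-server-side-encryption-aws-kms-key-id header is absent from the map.
    Path path = new Path("s3a://example-bucket/encrypted-object");
    try (FileSystem fs = path.getFileSystem(new Configuration())) {
      Map<String, byte[]> xattrs = fs.getXAttrs(path);
      xattrs.keySet().forEach(System.out::println);
    }
  }
}
{code}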






[jira] [Created] (HADOOP-19177) TestS3ACachingBlockManager fails intermittently in Yetus

2024-05-15 Thread Mukund Thakur (Jira)
Mukund Thakur created HADOOP-19177:
--

 Summary: TestS3ACachingBlockManager fails intermittently in Yetus
 Key: HADOOP-19177
 URL: https://issues.apache.org/jira/browse/HADOOP-19177
 Project: Hadoop Common
  Issue Type: Test
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Mukund Thakur


{code:java}
[ERROR] 
org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet 
-- Time elapsed: 60.45 s <<< ERROR!
java.lang.IllegalStateException: waitForCaching: expected: 1, actual: 0, read 
errors: 0, caching errors: 1
at 
org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:465)
at 
org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGetHelper(TestS3ACachingBlockManager.java:435)
at 
org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet(TestS3ACachingBlockManager.java:398)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Errors: 
[ERROR] 
org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingFailureOfGet
[ERROR]   Run 1: 
TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
errors: 1
[ERROR]   Run 2: 
TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
errors: 1
[ERROR]   Run 3: 
TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
errors: 1 {code}
Discovered in 
[https://github.com/apache/hadoop/pull/6646#issuecomment-2111558054] 






[jira] [Resolved] (HADOOP-19073) WASB: Fix connection leak in FolderRenamePending

2024-05-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19073.
-
Resolution: Fixed

> WASB: Fix connection leak in FolderRenamePending
> 
>
> Key: HADOOP-19073
> URL: https://issues.apache.org/jira/browse/HADOOP-19073
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.3.6
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Fix connection leak in FolderRenamePending in getting bytes  






[jira] [Created] (HADOOP-19176) S3A Xattr headers need hdfs-compatible prefix

2024-05-15 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19176:
---

 Summary: S3A Xattr headers need hdfs-compatible prefix
 Key: HADOOP-19176
 URL: https://issues.apache.org/jira/browse/HADOOP-19176
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


The s3a xattr list needs a prefix compatible with HDFS, or existing code which 
tries to copy attributes between stores can break.

We need a prefix of {user/trusted/security/system/raw}.

Now, the problem: currently xattrs are used by the magic committer to propagate 
file size progress; renaming the prefix will break existing code. But as it's 
read-only, we could modify Spark to look for both old and new values.

{code}

org.apache.hadoop.HadoopIllegalArgumentException: An XAttr name must be 
prefixed with user/trusted/security/system/raw, followed by a '.'
at org.apache.hadoop.hdfs.XAttrHelper.buildXAttr(XAttrHelper.java:77) 
at org.apache.hadoop.hdfs.DFSClient.setXAttr(DFSClient.java:2835) 
at 
org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3106)
 
at 
org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3102)
 
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.setXAttr(DistributedFileSystem.java:3115)
 
at org.apache.hadoop.fs.FileSystem.setXAttr(FileSystem.java:3097)

{code}
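
For illustration, a minimal sketch of the naming rule the exception above 
enforces; the path and attribute names are hypothetical:

{code:java}
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class XAttrPrefixExample {
  public static void main(String[] args) throws Exception {
    Path path = new Path("/tmp/example");
    try (FileSystem fs = path.getFileSystem(new Configuration())) {
      // Accepted on HDFS: the name sits in the "user." namespace.
      fs.setXAttr(path, "user.example.header",
          "value".getBytes(StandardCharsets.UTF_8));
      // A bare name such as "header" would throw
      // HadoopIllegalArgumentException on HDFS, as in the stack trace above.
    }
  }
}
{code}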







[jira] [Resolved] (HADOOP-18958) Improve UserGroupInformation debug log

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18958.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

>  Improve UserGroupInformation debug log
> ---
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Assignee: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
>       The UserGroupInformation class uses “new Exception( )” to print the call 
> stack of the doAs method. This prints a meaningless exception message and too 
> many call-stack frames, which is not conducive to troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!






[jira] [Reopened] (HADOOP-18958) Improve UserGroupInformation debug log

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-18958:
-

> Improve UserGroupInformation debug log
> --
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
>       The UserGroupInformation class uses “new Exception( )” to print the call 
> stack of the doAs method. This prints a meaningless exception message and too 
> many call-stack frames, which is not conducive to troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!






[jira] [Resolved] (HADOOP-19152) Do not hard code security providers.

2024-05-14 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze resolved HADOOP-19152.
-
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
 Release Note: Added a new conf 
"hadoop.security.crypto.jce.provider.auto-add" (default: true) to 
enable/disable auto-adding BouncyCastleProvider.  This change also avoid 
statically loading the BouncyCastleProvider class.
   Resolution: Fixed

The pull request is now merged.

> Do not hard code security providers.
> 
>
> Key: HADOOP-19152
> URL: https://issues.apache.org/jira/browse/HADOOP-19152
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> In order to support different security providers in different clusters, we 
> should not hard code a provider in our code.
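
For illustration, a minimal sketch of opting out of the auto-added provider using 
the configuration key from the release note above:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class JceProviderConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Disable auto-adding BouncyCastleProvider; register the cluster's preferred
    // provider through the JVM's java.security configuration instead.
    conf.setBoolean("hadoop.security.crypto.jce.provider.auto-add", false);
  }
}
{code}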






[jira] [Created] (HADOOP-19175) update s3a committer docs

2024-05-14 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19175:
---

 Summary: update s3a committer docs
 Key: HADOOP-19175
 URL: https://issues.apache.org/jira/browse/HADOOP-19175
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Update s3a committer docs

* declare that magic committer is stable and make it the recommended one
* show how to use new command "mapred successfile" to print the success file.






[jira] [Created] (HADOOP-19174) Tez and hive jobs fail due to google's protobuf 2.5.0 in classpath

2024-05-14 Thread Bilwa S T (Jira)
Bilwa S T created HADOOP-19174:
--

 Summary: Tez and hive jobs fail due to google's protobuf 2.5.0 in 
classpath
 Key: HADOOP-19174
 URL: https://issues.apache.org/jira/browse/HADOOP-19174
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T


There are two issues here:

1. We are running Tez 0.10.3, which uses Hadoop 3.3.6. Tez has protobuf 
version 3.21.1.

Below is the exception we get. This is due to protobuf-2.5.0 in our Hadoop 
classpath:

java.lang.IllegalAccessError: class 
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto tried to access 
private field com.google.protobuf.AbstractMessage.memoizedSize 
(org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto and 
com.google.protobuf.AbstractMessage are in unnamed module of loader 'app')
at 
org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.getSerializedSize(DAGProtos.java:21636)
at 
com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
at org.apache.tez.common.TezUtils.writeConfInPB(TezUtils.java:170)
at org.apache.tez.common.TezUtils.createByteStringFromConf(TezUtils.java:83)
at 
org.apache.tez.common.TezUtils.createUserPayloadFromConf(TezUtils.java:101)
at org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:436)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.tez.dag.app.DAGAppMaster$9.run(DAGAppMaster.java:2600)
at 
java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at 
org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2597)
at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2384)
2024-04-18 16:27:54,741 [INFO] [shutdown-hook-0] |app.DAGAppMaster|: 
DAGAppMasterShutdownHook invoked
2024-04-18 16:27:54,743 [INFO] [shutdown-hook-0] |service.AbstractService|: 
Service org.apache.tez.dag.app.DAGAppMaster failed in state STOPPED
java.lang.NullPointerException: Cannot invoke 
"org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
"this.taskSchedulerManager" is null
at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at 
org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHook.run(DAGAppMaster.java:2432)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
2024-04-18 16:27:54,744 [WARN] [Thread-2] |util.ShutdownHookManager|: 
ShutdownHook 'DAGAppMasterShutdownHook' failed, 
java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot 
invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
"this.taskSchedulerManager" is null
java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot 
invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
"this.taskSchedulerManager" is null
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)
at 
org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
at 
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
Caused by: java.lang.NullPointerException: Cannot invoke 
"org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
"this.taskSchedulerManager" is null
at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at 
org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHook.run(DAGAppMaster.java:2432)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.ja

[jira] [Resolved] (HADOOP-19170) Fixes compilation issues on Mac

2024-05-13 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HADOOP-19170.
--
Fix Version/s: 3.5.0
   Resolution: Fixed

> Fixes compilation issues on Mac
> ---
>
> Key: HADOOP-19170
> URL: https://issues.apache.org/jira/browse/HADOOP-19170
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: OS:  macOS Catalina 10.15.7
> compiler: clang 12.0.0
> cmake: 3.24.0
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> When I build hadoop-common native in Mac OS, I found this error:
> {code:java}
> /x/hadoop/hadoop-common-project/hadoop-common/src/main/native/src/exception.c:114:50:
>  error: function-like macro '__GLIBC_PREREQ' is not defined
> #if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32) {code}
> The reason is that macOS does not provide glibc, and C conditional 
> compilation requires all expressions in the directive to be valid.
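
A minimal sketch of one way to avoid expanding __GLIBC_PREREQ where it is 
undefined (an assumption about the shape of the fix, not the committed patch; the 
USE_GLIBC_WORKAROUND name is hypothetical): split the check so the function-like 
macro is only used once it is known to exist.

{code}
/* Guard __GLIBC_PREREQ so non-glibc platforms (e.g. macOS) never expand it. */
#if defined(__sun)
  #define USE_GLIBC_WORKAROUND 1
#elif defined(__GLIBC_PREREQ)
  #if __GLIBC_PREREQ(2, 32)
    #define USE_GLIBC_WORKAROUND 1
  #endif
#endif
{code}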






[jira] [Created] (HADOOP-19173) Upgrade org.apache.derby:derby to 10.17.1.0

2024-05-13 Thread Shilun Fan (Jira)
Shilun Fan created HADOOP-19173:
---

 Summary: Upgrade org.apache.derby:derby to 10.17.1.0
 Key: HADOOP-19173
 URL: https://issues.apache.org/jira/browse/HADOOP-19173
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, common
Affects Versions: 3.5.0, 3.4.1
Reporter: Shilun Fan
Assignee: Shilun Fan


Upgrade org.apache.derby:derby to 10.17.1.0.






[jira] [Created] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720

2024-05-13 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19172:
---

 Summary: Upgrade aws-java-sdk to 1.12.720
 Key: HADOOP-19172
 URL: https://issues.apache.org/jira/browse/HADOOP-19172
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


Update to the latest AWS SDK, to stop anyone worrying about the ion library CVE 
https://nvd.nist.gov/vuln/detail/CVE-2024-21634

This isn't exposed in the s3a client, but may be used downstream. 

on v2 SDK releases, the v1 SDK is only used during builds; on 3.3.x it is shipped






[jira] [Created] (HADOOP-19171) AWS v2: handle alternative forms of connection failure

2024-05-13 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19171:
---

 Summary: AWS v2: handle alternative forms of connection failure
 Key: HADOOP-19171
 URL: https://issues.apache.org/jira/browse/HADOOP-19171
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


We've had reports of network connection failures surfacing deeper in the stack, 
where we don't convert them to AWSApiCallTimeoutException, so they aren't retried 
properly (retire the connection and repeat):


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}
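
For illustration only, a hypothetical helper showing message-based classification 
of the two failure texts above as retryable; the method name and matching strategy 
are assumptions, not the S3A implementation:

{code:java}
// Hypothetical sketch: decide whether an SDK exception looks like one of the
// connection failures quoted above and should therefore be retried.
static boolean isRetryableNetworkFailure(Exception e) {
  String msg = e.getMessage();
  if (msg == null) {
    return false;
  }
  return msg.contains("Broken pipe")
      || msg.contains("not read from or written to within the timeout period")
      || msg.contains("Error Code: RequestTimeout");
}
{code}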








[jira] [Created] (HADOOP-19170) Fixes compilation issues on non-Linux systems

2024-05-13 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HADOOP-19170:
-

 Summary: Fixes compilation issues on non-Linux systems
 Key: HADOOP-19170
 URL: https://issues.apache.org/jira/browse/HADOOP-19170
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng


When I build hadoop-common native in Mac OS, I found this error:
{code:java}
/x/hadoop/hadoop-common-project/hadoop-common/src/main/native/src/exception.c:114:50:
 error: function-like macro '__GLIBC_PREREQ' is not defined
#if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32) {code}
The reason is that macOS does not provide glibc, and the C preprocessor 
macro-expands the whole #if expression before evaluating it, so the undefined 
function-like macro is an error even though the && would short-circuit. The usual 
fix is to test defined(__GLIBC_PREREQ) in its own #if before invoking the macro.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19169) Hadoop: Upgrade @shore/bootstrap 3.3.5-shore.76

2024-05-13 Thread Sandeep Kumar (Jira)
Sandeep Kumar created HADOOP-19169:
--

 Summary: Hadoop: Upgrade @shore/bootstrap 3.3.5-shore.76
 Key: HADOOP-19169
 URL: https://issues.apache.org/jira/browse/HADOOP-19169
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Sandeep Kumar


Upgrade @shore/bootstrap 3.3.5-shore.76 to a stable version.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19168) Upgrade Kafka Clients due to CVEs

2024-05-10 Thread Rohit Kumar (Jira)
Rohit Kumar created HADOOP-19168:


 Summary: Upgrade Kafka Clients due to CVEs
 Key: HADOOP-19168
 URL: https://issues.apache.org/jira/browse/HADOOP-19168
 Project: Hadoop Common
  Issue Type: Task
Reporter: Rohit Kumar


Upgrade Kafka Clients due to CVEs

CVE-2023-25194: Affected versions of this package are vulnerable to 
Deserialization of Untrusted Data when there are gadgets in the 
{{classpath}}. The server will connect to the attacker's LDAP server and 
deserialize the LDAP response, which the attacker can use to execute Java 
deserialization gadget chains on the Kafka Connect server.
CVSS Score: 8.8 (High)
[https://nvd.nist.gov/vuln/detail/CVE-2023-25194] 

CVE-2021-38153

CVE-2018-17196

Insufficient Entropy

[https://security.snyk.io/package/maven/org.apache.kafka:kafka-clients] 

Upgrade Kafka-Clients to 3.4.0 or higher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19166) [DOC] Drop Migrating from Apache Hadoop 1.x to Apache Hadoop 2.x

2024-05-07 Thread Ayush Saxena (Jira)
Ayush Saxena created HADOOP-19166:
-

 Summary: [DOC] Drop Migrating from Apache Hadoop 1.x to Apache 
Hadoop 2.x
 Key: HADOOP-19166
 URL: https://issues.apache.org/jira/browse/HADOOP-19166
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ayush Saxena


Reading the docs, I found this page, which is pretty irrelevant in the current 
context and for upcoming 3.x releases; we can explore dropping it:

https://apache.github.io/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19165) Explore dropping protobuf 2.5.0 from the distro

2024-05-07 Thread Ayush Saxena (Jira)
Ayush Saxena created HADOOP-19165:
-

 Summary: Explore dropping protobuf 2.5.0 from the distro
 Key: HADOOP-19165
 URL: https://issues.apache.org/jira/browse/HADOOP-19165
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Ayush Saxena


Explore whether protobuf-2.5.0 can be dropped from the distro; it is a transitive 
dependency from HBase, but HBase doesn't use it in the code.

Check whether HBase is the only thing pulling it into the distro. If things break, 
we exclude it there; if nothing does, let's get rid of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen reopened HADOOP-18851:
-

Reverting the previous commit, which removed the synchronized keywords. There will 
be a new implementation using ReentrantReadWriteLock. 
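
A minimal sketch of that direction, assuming token verification only reads the 
token map; class, field and method names below are illustrative, not the patch:

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class TokenStore {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, byte[]> tokenMap = new HashMap<>();

  byte[] retrievePassword(String tokenId) {
    lock.readLock().lock();        // many verifyToken() callers may read at once
    try {
      return tokenMap.get(tokenId);
    } finally {
      lock.readLock().unlock();
    }
  }

  void storeToken(String tokenId, byte[] password) {
    lock.writeLock().lock();       // the createPassword() path stays exclusive
    try {
      tokenMap.put(tokenId, password);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}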

> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and the following are our findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200  PRIORITY : 5  THREAD ID : 0X7F075C157800  NATIVE ID : 
> 0X2C87F  NATIVE ID (DECIMAL) : 182399  STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All of the 199 (out of 200) threads were blocked at the above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.<init>(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked all the remaining ones. But 
> the following is my observation:
>  
>  # verifyToken() and createPassword() have been synchronized because one is 
> reading the tokenMap

[jira] [Created] (HADOOP-19164) Hadoop CLI MiniCluster is broken

2024-05-03 Thread Ayush Saxena (Jira)
Ayush Saxena created HADOOP-19164:
-

 Summary: Hadoop CLI MiniCluster is broken
 Key: HADOOP-19164
 URL: https://issues.apache.org/jira/browse/HADOOP-19164
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Ayush Saxena


Documentation is also broken & it doesn't work either

(https://apache.github.io/hadoop/hadoop-project-dist/hadoop-common/CLIMiniCluster.html)

*Fails with:*
{noformat}
Exception in thread "main" java.lang.NoClassDefFoundError: 
org/mockito/stubbing/Answer
at 
org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2666)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2680)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:588)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530)
at 
org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160)
at 
org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
at 
org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
... 9 more{noformat}
{*}Command executed:{*}
{noformat}
bin/mapred minicluster -format{noformat}
*Documentation Issues:*
{noformat}
bin/mapred minicluster -rmport RM_PORT -jhsport JHS_PORT{noformat}

Without the -format option it doesn't work the first time, reporting that the 
NameNode isn't formatted, so this should be corrected.


{noformat}
2024-05-04 00:35:52,933 WARN namenode.FSNamesystem: Encountered exception 
loading fsimage
java.io.IOException: NameNode is not formatted.
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:253)
{noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19163) Upgrade protobuf version to 3.24.4

2024-05-03 Thread Bilwa S T (Jira)
Bilwa S T created HADOOP-19163:
--

 Summary: Upgrade protobuf version to 3.24.4
 Key: HADOOP-19163
 URL: https://issues.apache.org/jira/browse/HADOOP-19163
 Project: Hadoop Common
  Issue Type: Bug
  Components: hadoop-thirdparty
Reporter: Bilwa S T
Assignee: Bilwa S T






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19162) Add LzoCodec implementation based on aircompressor

2024-05-02 Thread L. C. Hsieh (Jira)
L. C. Hsieh created HADOOP-19162:


 Summary: Add LzoCodec implementation based on aircompressor
 Key: HADOOP-19162
 URL: https://issues.apache.org/jira/browse/HADOOP-19162
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: L. C. Hsieh


As I recall, due to a license issue Hadoop doesn't contain a built-in LzoCodec. 
Users can choose to build and install an LZO codec such as hadoop-lzo manually. 
Some implement LzoCodec based on other open-source implementations such as 
aircompressor, but it is somewhat inconvenient to maintain it separately.

I'm wondering if we can add an LzoCodec implementation based on aircompressor 
into Hadoop as the default LzoCodec; a usage sketch is below.
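
A minimal sketch of what using it could look like, assuming the Hadoop-compatible 
codec class that aircompressor ships is io.airlift.compress.lzo.LzoCodec; treat 
that class name as an assumption:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class LzoCodecDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Register the aircompressor-based codec (assumed class name).
    conf.set("io.compression.codecs", "io.airlift.compress.lzo.LzoCodec");
    CompressionCodecFactory factory = new CompressionCodecFactory(conf);
    CompressionCodec codec =
        factory.getCodecByClassName("io.airlift.compress.lzo.LzoCodec");
    System.out.println("Resolved codec: " + codec);
  }
}
{code}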



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19161) S3A: support a comma separated list of performance flags

2024-05-02 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19161:
---

 Summary: S3A: support a comma separated list of performance flags
 Key: HADOOP-19161
 URL: https://issues.apache.org/jira/browse/HADOOP-19161
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.1
Reporter: Steve Loughran
Assignee: Steve Loughran


HADOOP-19072 shows we want to add more optimisations than that of HADOOP-18930.

* Extending the new optimisations to the existing option is brittle
* Adding explicit options for each feature gets complex fast.

Proposed
* A new class S3APerformanceFlags keeps all the flags
* it builds this from a string[] of values, which can be extracted from 
getConf(),
* and it can also support a "*" option to mean "everything"
* this class can also be handed off to hasPathCapability() and do the right 
thing.

Proposed optimisations
* create file (we will hook up HADOOP-18930)
* mkdir (HADOOP-19072)
* delete (probe for parent path)
* rename (probe for source path)

We could think of more, with different names, later.
The goal is to make it possible to strip out every HTTP request we do for 
safety/POSIX compliance, so applications have the option of turning off what 
they don't need. A sketch of the flag parsing is below.
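
A minimal sketch of the flag parsing, with the flag names taken from the proposal 
above; the enum and method are illustrative, not shipped code:

{code:java}
import java.util.EnumSet;
import java.util.Locale;

final class S3APerformanceFlagsSketch {
  enum PerformanceFlag { CREATE, MKDIR, DELETE, RENAME }

  static EnumSet<PerformanceFlag> parse(String[] values) {
    EnumSet<PerformanceFlag> flags = EnumSet.noneOf(PerformanceFlag.class);
    for (String v : values) {
      String name = v.trim();
      if ("*".equals(name)) {
        return EnumSet.allOf(PerformanceFlag.class);   // "*" means everything
      }
      flags.add(PerformanceFlag.valueOf(name.toUpperCase(Locale.ROOT)));
    }
    return flags;
  }
}
{code}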



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19146.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> bucket level. If a global endpoint is configured (e.g. us-west-2), they fail to 
> access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.pro

[jira] [Resolved] (HADOOP-19151) Support configurable SASL mechanism

2024-04-29 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze resolved HADOOP-19151.
-
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

The pull request is now merged.

> Support configurable SASL mechanism
> ---
>
> Key: HADOOP-19151
> URL: https://issues.apache.org/jira/browse/HADOOP-19151
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Currently, the SASL mechanism is hard coded to DIGEST-MD5.  As mentioned in 
> HADOOP-14811, DIGEST-MD5 is known to be insecure; see 
> [rfc6331|https://datatracker.ietf.org/doc/html/rfc6331].
> In this JIRA, we will make the SASL mechanism configurable.  The default 
> mechanism will still be DIGEST-MD5 in order to maintain compatibility.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19150) Test ITestAbfsRestOperationException#testAuthFailException is broken.

2024-04-29 Thread Mukund Thakur (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukund Thakur resolved HADOOP-19150.

Fix Version/s: 3.4.1
   Resolution: Fixed

> Test ITestAbfsRestOperationException#testAuthFailException is broken. 
> --
>
> Key: HADOOP-19150
> URL: https://issues.apache.org/jira/browse/HADOOP-19150
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Mukund Thakur
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> {code:java}
> intercept(Exception.class,
> () -> {
>   fs.getFileStatus(new Path("/"));
> }); {code}
> Intercept shouldn't be used as there are assertions in catch statements. 
>  
> CC [~ste...@apache.org]  [~anujmodi2021] [~asrani_anmol] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19159.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
> --
>
> Key: HADOOP-19159
> URL: https://issues.apache.org/jira/browse/HADOOP-19159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Xi Chen
>Assignee: Xi Chen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> The description about `fs.s3a.committer.abort.pending.uploads` in the 
> _Concurrent Jobs writing to the same destination_ section is not entirely 
> correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-27 Thread Xi Chen (Jira)
Xi Chen created HADOOP-19159:


 Summary: Fix hadoop-aws document for 
fs.s3a.committer.abort.pending.uploads
 Key: HADOOP-19159
 URL: https://issues.apache.org/jira/browse/HADOOP-19159
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Xi Chen


The description about `fs.s3a.committer.abort.pending.uploads` in the 
_Concurrent Jobs writing to the same destination_ section is not entirely 
correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19158) Support delegating ByteBufferPositionedReadable to vector reads

2024-04-25 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19158:
---

 Summary: Support delegating ByteBufferPositionedReadable to vector 
reads
 Key: HADOOP-19158
 URL: https://issues.apache.org/jira/browse/HADOOP-19158
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


Make it easy for any stream with vector IO to support it.

Specifically, 

ByteBufferPositionedReadable.readFully()

is exactly a single range read, so it is easy to delegate.

The simpler read() call, which can return less, isn't part of the vector API.
Proposed: invoke readFully() but convert an EOFException to -1; a sketch of the 
delegation is below.
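
A minimal sketch of the delegation, assuming a wrapped stream whose vector IO is 
already hooked up; this is illustrative, not shipped code:

{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Collections;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.util.functional.FutureIO;

class VectoredDelegation {
  private final FSDataInputStream stream;

  VectoredDelegation(FSDataInputStream stream) {
    this.stream = stream;
  }

  // readFully() is exactly one range read, so delegate it to the vector API.
  public void readFully(long position, ByteBuffer buf) throws IOException {
    FileRange range = FileRange.createFileRange(position, buf.remaining());
    stream.readVectored(Collections.singletonList(range), ByteBuffer::allocate);
    buf.put(FutureIO.awaitFuture(range.getData()));
  }

  // read() isn't part of the vector API: invoke readFully(), map EOF to -1.
  public int read(long position, ByteBuffer buf) throws IOException {
    try {
      int len = buf.remaining();
      readFully(position, buf);
      return len;
    } catch (EOFException e) {
      return -1;
    }
  }
}
{code}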



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-17647) Release Hadoop 3.3.1

2024-04-24 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HADOOP-17647.
--
  Assignee: Wei-Chiu Chuang
Resolution: Done

The release was published on June 15 2021.

> Release Hadoop 3.3.1
> 
>
> Key: HADOOP-17647
> URL: https://issues.apache.org/jira/browse/HADOOP-17647
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
>
> File this jira to track the release work of Hadoop 3.3.1
> Release dashboard: 
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19107) Drop support for HBase v1

2024-04-24 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-19107.
---
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Drop support for HBase v1
> -
>
> Key: HADOOP-19107
> URL: https://issues.apache.org/jira/browse/HADOOP-19107
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Drop support for HBase v1 and make building against HBase v2 the default.
> Dev List:
> [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs]
> https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19157) [ABFS] Filesystem contract tests to use methodPath for robust parallel test runs

2024-04-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19157:
---

 Summary: [ABFS] Filesystem contract tests to use methodPath for 
robust parallel test runs
 Key: HADOOP-19157
 URL: https://issues.apache.org/jira/browse/HADOOP-19157
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure, test
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


hadoop-azure supports parallel test runs, but unlike hadoop-aws, the azure ones 
are parallelised across methods in the same test suites.

This can fail badly where contract tests have hard-coded filenames and assume 
that they can use them across all test cases. It shows up when you are testing on 
a store with reduced IO capacity, triggering retries and making some test cases 
slower.

Fix: hadoop-common contract tests to use methodPath() names, as sketched below.
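
A minimal sketch of the pattern, assuming the contract test base's methodPath() 
helper; the test body is illustrative:

{code:java}
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.contract.AbstractFSContractTestBase;
import org.junit.Test;

public abstract class ExampleContractTest extends AbstractFSContractTestBase {

  @Test
  public void testRenameFile() throws Throwable {
    // methodPath() is unique per test method, so parallel methods in the
    // same suite never collide on a shared hard-coded filename.
    Path source = methodPath();
    Path dest = new Path(source.getParent(), source.getName() + "-renamed");
    // ... create source, rename to dest, assert on dest ...
  }
}
{code}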



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19102) [ABFS]: FooterReadBufferSize should not be greater than readBufferSize

2024-04-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19102.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> [ABFS]: FooterReadBufferSize should not be greater than readBufferSize
> --
>
> Key: HADOOP-19102
> URL: https://issues.apache.org/jira/browse/HADOOP-19102
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Pranav Saxena
>Assignee: Pranav Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> The method `optimisedRead` creates a buffer array of size `readBufferSize`. 
> If footerReadBufferSize is greater than readBufferSize, abfs will attempt to 
> read more data than the buffer array can hold, which causes an exception.
> Change: To avoid this, we will keep footerBufferSize = 
> min(readBufferSizeConfig, footerBufferSizeConfig)
>  
>  
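
As a one-line sketch of that change (variable names illustrative):

{code:java}
// Clamp the footer buffer so it can never exceed the read buffer.
int effectiveFooterReadBufferSize =
    Math.min(readBufferSizeConfig, footerReadBufferSizeConfig);
{code}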



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19156) ZooKeeper based state stores use different ZK address configs

2024-04-23 Thread liu bin (Jira)
liu bin created HADOOP-19156:


 Summary: ZooKeeper based state stores use different ZK address 
configs
 Key: HADOOP-19156
 URL: https://issues.apache.org/jira/browse/HADOOP-19156
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: liu bin


Currently, the ZooKeeper-based state stores of RM, YARN Federation, and HDFS 
Federation use the same ZK address config {{hadoop.zk.address}}. But in 
our production environment, we hope that different services can use different 
ZKs so that they do not affect each other.

This jira adds separate ZK address configs for each service.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19155) Fix TestZKSignerSecretProvider failing unit test

2024-04-23 Thread kuper (Jira)
kuper created HADOOP-19155:
--

 Summary: Fix TestZKSignerSecretProvider failing unit test
 Key: HADOOP-19155
 URL: https://issues.apache.org/jira/browse/HADOOP-19155
 Project: Hadoop Common
  Issue Type: Test
  Components: auth
Affects Versions: 3.4.0
Reporter: kuper
 Attachments: 企业微信截图_4436de68-18c5-43bf-9382-4d9a853f7ef0.png, 
企业微信截图_ab901a4a-c0d4-4a20-a595-057cf648c30c.png, 
企业微信截图_fa5e7d54-b3a8-4ca3-8d4a-25fe493b4eb1.png

* The {{TestZKSignerSecretProvider}} and {{TestRandomSignerSecretProvider}} unit 
tests fail occasionally.
 * The reason is that the MockZKSignerSecretProvider class's rollSecret() method 
is {{synchronized}}.
 * Sometimes the verify(secretProvider, timeout(timeout).atLeastOnce()).rollSecret() 
check runs while the RolloverSignerSecretProvider scheduler thread has taken the 
lock first; this results in a timeout.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18958) UserGroupInformation debug log improve

2024-04-22 Thread wangzhihui (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

wangzhihui resolved HADOOP-18958.
-
Resolution: Not A Bug

> UserGroupInformation debug log improve
> --
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
>       Using "new Exception()" to print the call stack of the doAs method in 
> the UserGroupInformation class prints meaningless exception information and 
> too many call stacks, which is not conducive to troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19154) upgrade bouncy castle to 1.78.1 due to CVEs

2024-04-19 Thread PJ Fanning (Jira)
PJ Fanning created HADOOP-19154:
---

 Summary: upgrade bouncy castle to 1.78.1 due to CVEs
 Key: HADOOP-19154
 URL: https://issues.apache.org/jira/browse/HADOOP-19154
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Reporter: PJ Fanning


[https://www.bouncycastle.org/releasenotes.html#r1rv78]

There is a v1.78.1 release but no notes for it yet.

For v1.78
h3. 2.1.5 Security Advisories.

Release 1.78 deals with the following CVEs:
 * CVE-2024-29857 - Importing an EC certificate with specially crafted F2m 
parameters can cause high CPU usage during parameter evaluation.
 * CVE-2024-30171 - Possible timing based leakage in RSA based handshakes due 
to exception processing eliminated.
 * CVE-2024-30172 - Crafted signature and public key can be used to trigger an 
infinite loop in the Ed25519 verification code.
 * CVE-2024-301XX - When endpoint identification is enabled and an SSL socket 
is not created with an explicit hostname (as happens with HttpsURLConnection), 
hostname verification could be performed against a DNS-resolved IP address. 
This has been fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19130) FTPFileSystem rename with full qualified path broken

2024-04-17 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-19130.
---
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> FTPFileSystem rename with full qualified path broken
> 
>
> Key: HADOOP-19130
> URL: https://issues.apache.org/jira/browse/HADOOP-19130
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.20.2, 3.3.3, 3.3.4, 3.3.6
>Reporter: shawn
>Assignee: shawn
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: image-2024-03-27-09-59-12-381.png, 
> image-2024-03-28-09-58-19-721.png
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
>    When using the fs shell to put/rename a file on an FTP server with a fully 
> qualified path, it always gets "Input/output error" (e.g. 
> ftp://user:password@localhost/pathxxx). The reason is that the 
> changeWorkingDirectory command underneath is being passed a string with a 
> file:// URI prefix, which will not be understood by the FTP server.
> !image-2024-03-27-09-59-12-381.png|width=948,height=156!
>  
> In our case, after 
> client.changeWorkingDirectory("ftp://mytest:myt...@10.5.xx.xx/files")
> executed, the workingDirectory of the FTP server is still "/", which is 
> incorrect (the path was not understood by the FTP server).
> !image-2024-03-28-09-58-19-721.png|width=745,height=431!
> The solution should be to pass 
> absoluteSrc.getParent().toUri().getPath().toString() to avoid the
> file:// URI prefix, like this: 
> {code:java}
> --- a/FTPFileSystem.java
> +++ b/FTPFileSystem.java
> @@ -549,15 +549,15 @@ public class FTPFileSystem extends FileSystem {
>        throw new IOException("Destination path " + dst
>            + " already exist, cannot rename!");
>      }
> -    String parentSrc = absoluteSrc.getParent().toUri().toString();
> -    String parentDst = absoluteDst.getParent().toUri().toString();
> +    URI parentSrc = absoluteSrc.getParent().toUri();
> +    URI parentDst = absoluteDst.getParent().toUri();
>      String from = src.getName();
>      String to = dst.getName();
> -    if (!parentSrc.equals(parentDst)) {
> +    if (!parentSrc.toString().equals(parentDst.toString())) {
>        throw new IOException("Cannot rename parent(source): " + parentSrc
>            + ", parent(destination):  " + parentDst);
>      }
> -    client.changeWorkingDirectory(parentSrc);
> +    client.changeWorkingDirectory(parentSrc.getPath().toString());
>      boolean renamed = client.rename(from, to);
>      return renamed;
>    }{code}
> There is already a related issue: 
> https://issues.apache.org/jira/browse/HADOOP-8653
> I created this issue and added a related unit test.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19153) hadoop-common still exports logback as a transitive dependency

2024-04-17 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19153:
---

 Summary: hadoop-common still exports logback as a transitive 
dependency
 Key: HADOOP-19153
 URL: https://issues.apache.org/jira/browse/HADOOP-19153
 Project: Hadoop Common
  Issue Type: Bug
  Components: build, common
Affects Versions: 3.4.0
Reporter: Steve Loughran


Even though HADOOP-19084 set out to stop it, somehow ZK's declaration of a 
logback dependency is still contaminating the hadoop-common dependency graph, 
so causing problems downstream.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19152) Do not hard code security providers.

2024-04-16 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created HADOOP-19152:
---

 Summary: Do not hard code security providers.
 Key: HADOOP-19152
 URL: https://issues.apache.org/jira/browse/HADOOP-19152
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze


In order to support different security providers in different clusters, we 
should not hard code a provider in our code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19151) Support configurable SASL mechanism

2024-04-16 Thread Tsz-wo Sze (Jira)
Tsz-wo Sze created HADOOP-19151:
---

 Summary: Support configurable SASL mechanism
 Key: HADOOP-19151
 URL: https://issues.apache.org/jira/browse/HADOOP-19151
 Project: Hadoop Common
  Issue Type: Improvement
  Components: security
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze


Currently, the SASL mechanism is hard coded to DIGEST-MD5.  As mentioned in 
HADOOP-14811, DIGEST-MD5 is known to be insecure; see 
[rfc6331|https://datatracker.ietf.org/doc/html/rfc6331].

In this JIRA, we will make the SASL mechanism configurable.  The default 
mechanism will still be DIGEST-MD5 in order to maintain compatibility.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19150) Test ITestAbfsRestOperationException#testAuthFailException is broken.

2024-04-16 Thread Mukund Thakur (Jira)
Mukund Thakur created HADOOP-19150:
--

 Summary: Test 
ITestAbfsRestOperationException#testAuthFailException is broken. 
 Key: HADOOP-19150
 URL: https://issues.apache.org/jira/browse/HADOOP-19150
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Mukund Thakur


{code:java}
intercept(Exception.class,
() -> {
  fs.getFileStatus(new Path("/"));
}); {code}
intercept() shouldn't be used here, as there are assertions in the catch 
statements; one possible fix is sketched below. 

 

CC [~ste...@apache.org]  [~anujmodi2021] [~asrani_anmol] 
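
One hedged way to fix this, not necessarily the actual patch: intercept() returns 
the exception it caught, so the assertions can move out of any catch block and 
run unconditionally. The asserted message fragment below is illustrative, not the 
real error text:

{code:java}
Exception ex = intercept(Exception.class,
    () -> fs.getFileStatus(new Path("/")));
// Assertions that previously sat inside a catch block now always run:
Assertions.assertThat(ex.getMessage())
    .describedAs("error text of %s", ex)
    .contains("AuthFail");   // illustrative fragment
{code}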



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19149) ABFS: Implement ThreadLocal for ObjectMapper in AzureHttpOperation via config option with static shared instance as an alternative.

2024-04-16 Thread Mukund Thakur (Jira)
Mukund Thakur created HADOOP-19149:
--

 Summary: ABFS: Implement ThreadLocal for ObjectMapper in 
AzureHttpOperation via config option with static shared instance as an 
alternative.
 Key: HADOOP-19149
 URL: https://issues.apache.org/jira/browse/HADOOP-19149
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.4.0
Reporter: Mukund Thakur
Assignee: Mukund Thakur


While doing internal tests on Hive TPC-DS queries we have seen many instances of 
ObjectMapper being created in an Application Master; sharing a thread-local 
ObjectMapper instance will improve performance. A sketch of the pattern follows.

 

CC [~ste...@apache.org]  [~harshit.gupta] 
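
A minimal sketch of the thread-local variant; class and method names are 
illustrative:

{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;

final class JsonMappers {
  // One mapper per thread: avoids building a new ObjectMapper per request,
  // without sharing a single instance across threads.
  private static final ThreadLocal<ObjectMapper> MAPPER =
      ThreadLocal.withInitial(ObjectMapper::new);

  static ObjectMapper get() {
    return MAPPER.get();
  }
}
{code}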



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-04-15 Thread Brahma Reddy Battula (Jira)
Brahma Reddy Battula created HADOOP-19148:
-

 Summary: Update solr from 8.11.2 to 8.11.3 to address 
CVE-2023-50298
 Key: HADOOP-19148
 URL: https://issues.apache.org/jira/browse/HADOOP-19148
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Reporter: Brahma Reddy Battula


Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19106) [ABFS] All tests of. ITestAzureBlobFileSystemAuthorization fails with NPE

2024-04-13 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi resolved HADOOP-19106.

   Fix Version/s: 3.4.1
Hadoop Flags: Reviewed
Release Note: https://github.com/apache/hadoop/pull/6676
Target Version/s: 3.4.1
  Resolution: Fixed

[HADOOP-19129: [ABFS] Test Fixes and Test Script Bug Fixes by anujmodi2021 · 
Pull Request #6676 · apache/hadoop 
(github.com)|https://github.com/apache/hadoop/pull/6676]

> [ABFS] All tests of. ITestAzureBlobFileSystemAuthorization fails with NPE
> -
>
> Key: HADOOP-19106
> URL: https://issues.apache.org/jira/browse/HADOOP-19106
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Assignee: Anuj Modi
>Priority: Major
> Fix For: 3.4.1
>
>
> When the below config is set to true, all of the tests fail; otherwise they 
> are skipped.
> <property>
>     <name>fs.azure.test.namespace.enabled</name>
>     <value>true</value>
> </property>
>  
> [*ERROR*] 
> testOpenFileAuthorized(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization)
>   Time elapsed: 0.064 s  <<< ERROR!
> java.lang.NullPointerException
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.runTest(ITestAzureBlobFileSystemAuthorization.java:273)
>  at 
> org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.testOpenFileAuthorized(ITestAzureBlobFileSystemAuthorization.java:132)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19129) ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite

2024-04-13 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi resolved HADOOP-19129.

Fix Version/s: 3.4.1
 Hadoop Flags: Reviewed
 Release Note: https://github.com/apache/hadoop/pull/6676
   Resolution: Fixed

[HADOOP-19129: [ABFS] Test Fixes and Test Script Bug Fixes by anujmodi2021 · 
Pull Request #6676 · apache/hadoop 
(github.com)|https://github.com/apache/hadoop/pull/6676]

> ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite
> 
>
> Key: HADOOP-19129
> URL: https://issues.apache.org/jira/browse/HADOOP-19129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0, 3.4.1
>Reporter: Anuj Modi
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> The test script used by ABFS to validate changes has the following two issues:
>  # When there are a lot of test failures or when the error message of any failing 
> test becomes very large, the regex used today to filter test results does not 
> work as expected and fails to report all the failing tests.
> To resolve this, we have come up with new regex that will only target one 
> line test names for reporting them into aggregated test results.
>  # While running the test suite for different combinations of Auth type and 
> account type, we add the combination specific configs first and then include 
> the account specific configs in core-site.xml file. This will override the 
> combination specific configs like auth type if the same config is present in 
> account specific config file. To avoid this, we will first include the 
> account specific configs and then add the combination specific configs.
> Due to above bug in test script, some test failures in ABFS were not getting 
> our attention. This PR also targets to resolve them. Following are the tests 
> fixed:
>  # ITestAzureBlobFileSystemAppend.testCloseOfDataBlockOnAppendComplete(): It 
> was failing only when append blobs were enabled. In the case of append blobs we 
> were not closing the active block on outputstream.close(), due to which 
> block.close() was not getting called and assertions around it were failing. 
> Fixed by updating the production code to close the active block on flush.
>  # ITestAzureBlobFileSystemAuthorization: Tests in this class work with an 
> existing remote filesystem instead of creating a new file system instance. 
> For this they require a file system configured in account settings using the 
> following config: "fs.contract.test.fs.abfs". Tests were failing with NPE 
> when this config was not present. Updated code to skip this test if the required 
> config is not present.
>  # ITestAbfsClient.testListPathWithValueGreaterThanServerMaximum(): Test was 
> failing Intermittently only for HNS enabled accounts. Test wants to assert 
> that client.listPath() does not return more objects than what is configured 
> in maxListResults. Assertions should be that number of objects returned could 
> be less than expected as server might end up returning even lesser due to 
> partition splits along with a continuation token.
>  # ITestGetNameSpaceEnabled.testGetIsNamespaceEnabledWhenConfigIsTrue(): Fail 
> when "fs.azure.test.namespace.enabled" config is missing. Ignore the test if 
> config is missing.
>  # ITestGetNameSpaceEnabled.testGetIsNamespaceEnabledWhenConfigIsFalse(): 
> Fail when "fs.azure.test.namespace.enabled" config is missing. Ignore the 
> test if config is missing.
>  # ITestGetNameSpaceEnabled.testNonXNSAccount(): Fail when 
> "fs.azure.test.namespace.enabled" config is missing. Ignore the test if 
> config is missing.
>  # ITestAbfsStreamStatistics.testAbfsStreamOps: Fails when 
> "fs.azure.test.appendblob.enabled" is set to true. Test wanted to assert that 
> number of read operations can be more in case of append blobs as compared to 
> normal blob because of automatic flush. It could be same as that of normal 
> blob as well.
>  # ITestAzureBlobFileSystemCheckAccess.testCheckAccessForAccountWithoutNS: 
> Fails for FNS Account only when following config is present:  
> fs.azure.account.hns.enabled". Failure is because test wants to assert that 
> when driver does not know if the account is HNS enabled or not it makes a 
> server call and fails. But above config is letting driver know the account 
> type and skipping the head call. Remove these configs from the test specific 
> configurations and no

[jira] [Resolved] (HADOOP-19110) ITestExponentialRetryPolicy failing in branch-3.4

2024-04-13 Thread Anuj Modi (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anuj Modi resolved HADOOP-19110.

   Fix Version/s: 3.4.1
Target Version/s: 3.4.1
  Resolution: Fixed

> ITestExponentialRetryPolicy failing in branch-3.4
> -
>
> Key: HADOOP-19110
> URL: https://issues.apache.org/jira/browse/HADOOP-19110
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Assignee: Anuj Modi
>Priority: Major
> Fix For: 3.4.1
>
>
> {code:java}
> [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: 
> 91.416 s <<< FAILURE! - in 
> org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy
> [ERROR] 
> testThrottlingIntercept(org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy)
>   Time elapsed: 0.622 s  <<< ERROR!
> Failure to initialize configuration for dummy.dfs.core.windows.net key 
> ="null": Invalid configuration value detected for fs.azure.account.key
>   at 
> org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:53)
>   at 
> org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:646)
>   at 
> org.apache.hadoop.fs.azurebfs.services.ITestAbfsClient.createTestClientFromCurrentContext(ITestAbfsClient.java:339)
>   at 
> org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy.testThrottlingIntercept(ITestExponentialRetryPolicy.java:106)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19146:
-

 Summary: noaa-cors-pds bucket access with global endpoint fails
 Key: HADOOP-19146
 URL: https://issues.apache.org/jira/browse/HADOOP-19146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Viraj Jasani


All tests accessing noaa-cors-pds use the us-east-1 region, as configured at bucket 
level. If a global endpoint is configured (e.g. us-west-2), they fail to access 
the bucket.

 

Sample error:
{code:java}
org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
response to region [us-east-1].  This likely indicates that the S3 region 
configured in fs.s3a.endpoint.region does not match the AWS region containing 
the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, 
Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
    at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
 {code}
{code:java}
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
    at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
    ...
    ...
    ...
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53

[jira] [Resolved] (HADOOP-19079) HttpExceptionUtils to check that loaded class is really an exception before instantiation

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19079.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> HttpExceptionUtils to check that loaded class is really an exception before 
> instantiation
> -
>
> Key: HADOOP-19079
> URL: https://issues.apache.org/jira/browse/HADOOP-19079
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common, security
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> It can be dangerous taking class names as inputs from HTTP messages even if 
> we control the source. The issue is in HttpExceptionUtils in hadoop-common 
> (the validateResponse method).
> I can provide a PR that will highlight the issue.
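
A minimal sketch of the hardening described, not the actual patch; helper and 
method names are illustrative:

{code:java}
import java.io.IOException;
import java.lang.reflect.Constructor;

final class SafeExceptionLoader {
  // Only instantiate the declared class if it really is an Exception type.
  static Exception instantiate(String exClass, String exMsg) throws IOException {
    try {
      Class<?> klass = Class.forName(exClass);
      if (!Exception.class.isAssignableFrom(klass)) {
        throw new IOException("declared class is not an Exception: " + exClass);
      }
      Constructor<?> ctor = klass.getConstructor(String.class);
      return (Exception) ctor.newInstance(exMsg);
    } catch (ReflectiveOperationException e) {
      throw new IOException("cannot instantiate " + exClass, e);
    }
  }
}
{code}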



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19096) [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19096.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic
> --
>
> Key: HADOOP-19096
> URL: https://issues.apache.org/jira/browse/HADOOP-19096
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.1
>Reporter: Anuj Modi
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> ABFS has a client-side throttling mechanism which works on the metrics 
> collected from past requests. If requests fail due to throttling at the 
> server, we update our metrics, and the client-side backoff is calculated 
> based on those metrics.
> This PR enhances the logic that decides which requests should be considered 
> when computing the client-side backoff interval, as follows (a sketch 
> follows the list):
> For each request made by ABFS driver, we will determine if they should 
> contribute to Client-Side Throttling based on the status code and result:
>  # Status code in 2xx range: Successful Operations should contribute.
>  # Status code in 3xx range: Redirection Operations should not contribute.
>  # Status code in 4xx range: User Errors should not contribute.
>  # Status code is 503: Throttling Errors should contribute only if they are 
> due to a client limit breach, as follows:
>  ## 503, Ingress Over Account Limit: Should Contribute
>  ## 503, Egress Over Account Limit: Should Contribute
>  ## 503, TPS Over Account Limit: Should Contribute
>  ## 503, Other Server Throttling: Should not Contribute.
>  # Status code in 5xx range other than 503: Should not Contribute.
>  # IOException and UnknownHostExceptions: Should not Contribute.
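> A rough sketch of the decision table above (a hypothetical helper; the real 
> logic lives in the ABFS client and inspects the full response):
> {code:java}
> // Illustrative: should this request's outcome feed client-side throttling?
> static boolean shouldContribute(int statusCode, String serverMessage) {
>   if (statusCode >= 200 && statusCode < 300) {
>     return true;                 // successful operations contribute
>   }
>   if (statusCode == 503 && serverMessage != null) {
>     // only throttling caused by breaching account limits contributes
>     return serverMessage.contains("Ingress Over Account Limit")
>         || serverMessage.contains("Egress Over Account Limit")
>         || serverMessage.contains("TPS Over Account Limit");
>   }
>   return false;                  // 3xx, 4xx, other 5xx, IOExceptions: no
> }
> {code}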



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19098.
-
Resolution: Fixed

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed:
> FS spec to say:
> * "overlap triggers IllegalArgumentException".
> * special case: 0-byte ranges may be short-circuited to return an empty 
> buffer even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this; sketched below).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases.
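> A sketch of the helper validation implied above, assuming 
> org.apache.hadoop.fs.FileRange (sort by offset, then reject any overlap):
> {code:java}
> // Illustrative: throw IllegalArgumentException on overlapping ranges.
> static void validateNonOverlapping(List<? extends FileRange> ranges) {
>   List<? extends FileRange> sorted = ranges.stream()
>       .sorted(Comparator.comparingLong(FileRange::getOffset))
>       .collect(Collectors.toList());
>   for (int i = 1; i < sorted.size(); i++) {
>     FileRange prev = sorted.get(i - 1);
>     FileRange next = sorted.get(i);
>     if (next.getOffset() < prev.getOffset() + prev.getLength()) {
>       throw new IllegalArgumentException(
>           "Overlapping ranges: " + prev + " and " + next);
>     }
>   }
> }
> {code}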



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19101) Vectored Read into off-heap buffer broken in fallback implementation

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19101.
-
Fix Version/s: 3.3.9
   3.4.1
   Resolution: Fixed

> Vectored Read into off-heap buffer broken in fallback implementation
> 
>
> Key: HADOOP-19101
> URL: https://issues.apache.org/jira/browse/HADOOP-19101
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> {{VectoredReadUtils.readInDirectBuffer()}} always starts off reading at 
> position zero even when the range is at a different offset. As a result, you 
> can get incorrect data back.
> The fix for this is straightforward: we pass in a FileRange and use its offset 
> as the starting position (see the sketch below).
> However, this does mean that all shipping releases 3.3.5-3.4.0 cannot safely 
> read vector IO into direct buffers through HDFS, ABFS or GCS. Note that we 
> have never seen this in production because the parquet and ORC libraries both 
> read into on-heap storage.
> Those libraries need to be audited to make sure that they never attempt to 
> read into off-heap DirectBuffers. This is a bit trickier than you would think 
> because an allocator is passed in. For PARQUET-2171 we will 
> * only invoke the API on streams which explicitly declare their support for 
> the API (so fallback in parquet itself)
> * not invoke when direct buffer allocation is in use.
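> The shape of the fix, in a simplified form (a sketch, not the actual patch):
> {code:java}
> // Illustrative: the read must start at the FileRange's offset, not at 0.
> static void readRange(PositionedReadable stream, FileRange range,
>     ByteBuffer buffer) throws IOException {
>   long position = range.getOffset();   // was implicitly 0 in the bug
>   byte[] tmp = new byte[range.getLength()];
>   stream.readFully(position, tmp, 0, tmp.length);  // positioned read
>   buffer.put(tmp);
>   buffer.flip();
> }
> {code}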



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19109) Fix metrics description

2024-04-10 Thread Xiaobao Wu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobao Wu resolved HADOOP-19109.
-
Resolution: Not A Problem

> Fix metrics description
> ---
>
> Key: HADOOP-19109
> URL: https://issues.apache.org/jira/browse/HADOOP-19109
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.0, 3.3.4
>Reporter: Xiaobao Wu
>Priority: Minor
>  Labels: pull-request-available
>
> The description of the RpcLockWaitTimeNumOps metric seems to be incorrect:
> {code:java}
> | `RpcQueueTimeNumOps` | Total number of RPC calls |
> | `RpcQueueTimeAvgTime` | Average queue time in milliseconds |
> | `RpcLockWaitTimeNumOps` | Total number of RPC calls (same as 
> RpcQueueTimeNumOps) |{code}
> I think the description of this metric should be clearer:
> {code:java}
> | `RpcLockWaitTimeNumOps` | Total number of RPC calls that waited for lock 
> acquisition |{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18135) Produce Windows binaries of Hadoop

2024-04-09 Thread Gautham Banasandra (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gautham Banasandra resolved HADOOP-18135.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

Merged PR [https://github.com/apache/hadoop/pull/6673] to trunk.

> Produce Windows binaries of Hadoop
> --
>
> Key: HADOOP-18135
> URL: https://issues.apache.org/jira/browse/HADOOP-18135
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.4.0
> Environment: Windows 10
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> We currently only provide Linux libraries and binaries. We need to provide 
> the same for Windows. We need to port the [create-release 
> script|https://github.com/apache/hadoop/blob/5f9932acc4fa2b36a3005e587637c53f2da1618d/dev-support/bin/create-release]
>  to run on Windows and produce the Windows binaries.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19145) Software Architecture Document

2024-04-06 Thread Levon Khorasandzhian (Jira)
Levon Khorasandzhian created HADOOP-19145:
-

 Summary: Software Architecture Document
 Key: HADOOP-19145
 URL: https://issues.apache.org/jira/browse/HADOOP-19145
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation
Reporter: Levon Khorasandzhian
 Attachments: Apache_Hadoop_SAD.pdf

We (@lkhorasandzhian & @vacherkasskiy) have prepared new material for the 
documentation. The attached Software Architecture Document is very useful for 
new contributors and developers who want to get acquainted with this enormous 
system in a short time. Currently it's only in Russian, but if you're 
interested in such files we can translate it into English.
There are no changes to code, only new documentation files.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19135) Remove Jcache 1.0-alpha

2024-04-05 Thread Shilun Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan resolved HADOOP-19135.
-
Fix Version/s: 3.5.0
   3.4.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Remove Jcache 1.0-alpha
> ---
>
> Key: HADOOP-19135
> URL: https://issues.apache.org/jira/browse/HADOOP-19135
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> In YARN Federation, we use JCache. The version of JCache has not been 
> maintained for a long time. We directly use Ehcache instead of JCache in 
> YARN-11663, so we can remove JCache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19144) S3A prefetching to support Vector IO

2024-04-04 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19144:
---

 Summary: S3A prefetching to support Vector IO
 Key: HADOOP-19144
 URL: https://issues.apache.org/jira/browse/HADOOP-19144
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Add explicit support for vector IO in the s3a prefetching stream.

* if a range is in one or more cached blocks, it SHALL be read from cache and 
returned
* if a range is not in cache: TBD
* if a range is partially in cache: TBD

These are the same decisions that abfs has to make: should the client 
fetch/cache the block, or just do one or more GET requests?

A big issue is: does caching of data fetched in a range request make any sense 
at all? Or more specifically: does fetching the blocks in which range requests 
fall make sense?

Simply going to the store is a lot simpler.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19141) Update VectorIO default values consistently

2024-04-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved HADOOP-19141.

Fix Version/s: 3.3.7
   3.4.1
   Resolution: Fixed

> Update VectorIO default values consistently
> ---
>
> Key: HADOOP-19141
> URL: https://issues.apache.org/jira/browse/HADOOP-19141
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.4.1
>Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.7, 3.4.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19143) Upgrade commons-cli to 1.6.0.

2024-04-04 Thread Shilun Fan (Jira)
Shilun Fan created HADOOP-19143:
---

 Summary: Upgrade commons-cli to 1.6.0.
 Key: HADOOP-19143
 URL: https://issues.apache.org/jira/browse/HADOOP-19143
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, common
Affects Versions: 3.5.0, 3.4.1
Reporter: Shilun Fan
Assignee: Shilun Fan


commons-cli can be upgraded to 1.6.0; I will try to upgrade it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19142) DfsRouterAdmin RefreshCallQueue fails when authorization is enabled

2024-04-03 Thread Ananya Singh (Jira)
Ananya Singh created HADOOP-19142:
-

 Summary: DfsRouterAdmin RefreshCallQueue fails when authorization 
is enabled
 Key: HADOOP-19142
 URL: https://issues.apache.org/jira/browse/HADOOP-19142
 Project: Hadoop Common
  Issue Type: Bug
  Components: ipc
Affects Versions: 3.3.6
Reporter: Ananya Singh
Assignee: Ananya Singh






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19141) Update VectorIO default values consistently

2024-04-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created HADOOP-19141:
--

 Summary: Update VectorIO default values consistently
 Key: HADOOP-19141
 URL: https://issues.apache.org/jira/browse/HADOOP-19141
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/s3
Affects Versions: 3.4.1
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19140) [ABFS, S3A] Add IORateLimiter api to hadoop common

2024-04-03 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19140:
---

 Summary: [ABFS, S3A] Add IORateLimiter api to hadoop common
 Key: HADOOP-19140
 URL: https://issues.apache.org/jira/browse/HADOOP-19140
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/azure, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


Create a rate limiter API in hadoop-common through which code (initially the 
manifest committer and bulk delete) can request IO capacity for a specific 
operation.

This can be exported by filesystems to support shared rate limiting across 
all threads.

Pulled from the HADOOP-19093 PR.
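
A sketch of a possible API shape (purely illustrative; the real interface is 
whatever the eventual patch defines):
{code:java}
import java.time.Duration;

// Illustrative: callers ask for IO capacity before an operation; the
// limiter may block, and reports how long the caller waited.
public interface IORateLimiter {

  /**
   * Acquire capacity for an operation, blocking if needed.
   * @param operation name of the operation, e.g. "delete"
   * @param requestedCapacity number of IO requests about to be issued
   * @return time spent waiting for capacity
   */
  Duration acquireIOCapacity(String operation, long requestedCapacity);
}
{code}
A filesystem instance could then implement this interface so that all threads 
sharing the store draw from one capacity pool.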



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19139) [ABFS]: No GetPathStatus call for opening AbfsInputStream

2024-04-03 Thread Pranav Saxena (Jira)
Pranav Saxena created HADOOP-19139:
--

 Summary: [ABFS]: No GetPathStatus call for opening AbfsInputStream
 Key: HADOOP-19139
 URL: https://issues.apache.org/jira/browse/HADOOP-19139
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Reporter: Pranav Saxena
Assignee: Pranav Saxena


The read API returns the contentLength and eTag of the path; this information 
can be reused in future calls on the same inputStream. Knowing the eTag ahead 
of time is therefore of little importance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19138) CSE-KMS S3A: Support for InstructionFile to store ECEK meta info

2024-04-03 Thread Vikas Kumar (Jira)
Vikas Kumar created HADOOP-19138:


 Summary: CSE-KMS S3A: Support for InstructionFile to store ECEK 
meta info
 Key: HADOOP-19138
 URL: https://issues.apache.org/jira/browse/HADOOP-19138
 Project: Hadoop Common
  Issue Type: New Feature
  Components: command, tools
Reporter: Vikas Kumar


{*}Task{*}: Support for InstructionFile to store ECEK meta info 

*Current implementation/Context:*  

Hadoop-aws supports CSE-KMS. During CSE, the key encryption info needs to be 
kept somewhere. The AWS SDK supports two ways:
 # *S3 object metadata*: the current integration in hadoop-aws only supports 
this approach.
 ## But S3 metadata has a 2 KB size limit.
 ## Also, metadata cannot be updated independently: it would take a complete 
object read/write operation even if we only need to change the metadata.
 # *Instruction file approach:* a small file containing the meta-info, stored 
in the same bucket at the same location. This approach adds one extra S3 round 
trip per read/write operation, but could be useful if the business needs 
frequent metadata changes.

*Use case:* to implement KMS RE-ENCRYPT, where only the CEK (DEK) needs to be 
re-encrypted with new key material. Here the instruction file approach could 
be useful.

Plus there could be many other use cases based on different business needs.

*My analysis:* I tried to enable this by setting 
*CryptoStorageMode.InstructionFile* in CryptoConfigurationV2 while building 
the client with AmazonS3EncryptionClientV2Builder.

Note: ObjectMetadata is the default value.
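
For reference, the client construction being described looks roughly like this 
(a sketch assuming the V1 SDK's encryption client APIs; the KMS key id is a 
placeholder):
{code:java}
// Sketch: store encryption metadata in an instruction file rather than
// in the S3 object's metadata (ObjectMetadata is the default).
AmazonS3EncryptionV2 s3 = AmazonS3EncryptionClientV2Builder.standard()
    .withEncryptionMaterialsProvider(
        new KMSEncryptionMaterialsProvider("my-kms-key-id"))
    .withCryptoConfiguration(
        new CryptoConfigurationV2(CryptoMode.StrictAuthenticatedEncryption)
            .withStorageMode(CryptoStorageMode.InstructionFile))
    .build();
{code}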

{*}Result{*}: Write operation worked but read failed due to missing instruction 
file.

*RCA:* On debugging, I found the following.

On a put request, say myfile.txt:
 * First, S3AFileSystem writes the file to S3 as *myfile.txt_COPYING_*
 * Second, it writes the corresponding instruction file as 
*myfile.txt_COPYING_.instruction*
 * Third, it calls rename.
 ** Rename here means copying the file bytes to *myfile.txt* and
 ** deleting *myfile.txt_COPYING_*
 * Here the problem occurs:
 ** after deleting any file, the AmazonS3EncryptionClientV2 class looks for the 
corresponding instruction file and, if found, deletes that one also. As a 
result, it deletes *myfile.txt_COPYING_.instruction* as well.

Related code:

com.amazonaws.services.s3.AmazonS3EncryptionClientV2.deleteObject() // part of 
aws sdk bundle

*Possible solution:* S3AFileSystem (part of hadoop-aws) needs to be updated to 
first rename the instruction file, then the original file. This way the 
deletion of the instruction file can be avoided.

It also requires config changes to take ObjectMetadata/InstructionFile as a 
config parameter.

Let's discuss whether there is a better solution that can be incorporated.

Once we agree on one common solution, I can work on the implementation part.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19123) Update commons-configuration2 to 2.10.1 due to CVE

2024-04-02 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-19123.
---
Fix Version/s: 3.5.0
   3.4.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Update commons-configuration2 to 2.10.1 due to CVE
> --
>
> Key: HADOOP-19123
> URL: https://issues.apache.org/jira/browse/HADOOP-19123
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> https://github.com/advisories/GHSA-9w38-p64v-xpmv



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19115) upgrade to nimbus-jose-jwt 9.37.2 due to CVE

2024-04-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19115.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
 Assignee: PJ Fanning
   Resolution: Fixed

> upgrade to nimbus-jose-jwt 9.37.2 due to CVE
> 
>
> Key: HADOOP-19115
> URL: https://issues.apache.org/jira/browse/HADOOP-19115
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, CVE
>Affects Versions: 3.4.0, 3.5.0
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> https://github.com/advisories/GHSA-gvpg-vgmx-xg6w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19137) [ABFS]:Extra getAcl call while calling first API of FileSystem

2024-04-02 Thread Pranav Saxena (Jira)
Pranav Saxena created HADOOP-19137:
--

 Summary: [ABFS]:Extra getAcl call while calling first API of 
FileSystem
 Key: HADOOP-19137
 URL: https://issues.apache.org/jira/browse/HADOOP-19137
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.4.0
Reporter: Pranav Saxena
Assignee: Pranav Saxena


The store doesn't flow the namespace information to the client.

In https://github.com/apache/hadoop/pull/3440, getIsNamespaceEnabled was added 
to client methods; it checks whether the namespace information is present and, 
if not, makes a getAcl call and sets the field. Once the field is set, it is 
used in future getIsNamespaceEnabled calls for a given AbfsClient.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19136) Upgrade commons-io to 2.15.0

2024-04-01 Thread Shilun Fan (Jira)
Shilun Fan created HADOOP-19136:
---

 Summary: Upgrade commons-io to 2.15.0
 Key: HADOOP-19136
 URL: https://issues.apache.org/jira/browse/HADOOP-19136
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.4.1
Reporter: Shilun Fan
Assignee: Shilun Fan


commons-io can be upgraded from 2.14.0 to 2.15.0; I will try to upgrade it.
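
Assuming the version is managed centrally in hadoop-project/pom.xml, the 
change would amount to bumping the property, roughly:
{code:xml}
<!-- hadoop-project/pom.xml (illustrative) -->
<commons-io.version>2.15.0</commons-io.version>
{code}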



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19135) Remove Jcache 1.0-alpha

2024-04-01 Thread Shilun Fan (Jira)
Shilun Fan created HADOOP-19135:
---

 Summary: Remove Jcache 1.0-alpha
 Key: HADOOP-19135
 URL: https://issues.apache.org/jira/browse/HADOOP-19135
 Project: Hadoop Common
  Issue Type: Improvement
  Components: common
Affects Versions: 3.5.0, 3.4.1
Reporter: Shilun Fan
Assignee: Shilun Fan


In YARN Federation, we use JCache. The version of JCache has not been 
maintained for a long time. We directly use Ehcache instead of JCache in 
YARN-11663, so we can remove JCache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19077) Remove use of javax.ws.rs.core.HttpHeaders

2024-04-01 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-19077.
---
Fix Version/s: 3.4.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Remove use of javax.ws.rs.core.HttpHeaders
> --
>
> Key: HADOOP-19077
> URL: https://issues.apache.org/jira/browse/HADOOP-19077
> Project: Hadoop Common
>  Issue Type: Task
>  Components: io
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.1
>
>
> One step towards removing Hadoop's dependence on Jersey1 and jsr311-api.
> We have other classes where we can get HTTP header names.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19134) use StringBuilder instead of StringBuffer

2024-03-30 Thread PJ Fanning (Jira)
PJ Fanning created HADOOP-19134:
---

 Summary: use StringBuilder instead of StringBuffer
 Key: HADOOP-19134
 URL: https://issues.apache.org/jira/browse/HADOOP-19134
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: PJ Fanning


StringBuilder is basically the same as StringBuffer but doesn't use 
synchronized. String appending rarely needs locking like this.

There are some public and package-private APIs that use StringBuffers as input 
or return types - I have left these alone for compatibility reasons.
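
The swap is mechanical; for example:
{code:java}
String value = "v";

// Before: StringBuffer synchronizes every append, even when only one
// thread ever touches the buffer.
StringBuffer buf = new StringBuffer();
buf.append("key=").append(value);

// After: StringBuilder offers the same API without the lock overhead.
StringBuilder sb = new StringBuilder();
sb.append("key=").append(value);
{code}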



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19024) Use bouncycastle jdk18 1.77

2024-03-30 Thread Ayush Saxena (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena resolved HADOOP-19024.
---
Fix Version/s: 3.5.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Use bouncycastle jdk18 1.77
> ---
>
> Key: HADOOP-19024
> URL: https://issues.apache.org/jira/browse/HADOOP-19024
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> They have stopped patching the JDK 1.5 jars that Hadoop uses (see 
> https://issues.apache.org/jira/browse/HADOOP-18540).
> The new artifacts have similar names - but the names are like bcprov-jdk18on 
> as opposed to bcprov-jdk15on.
> CVE-2023-33201 is an example of a security issue that seems only to be fixed 
> in the JDK 1.8 artifacts (i.e. no JDK 1.5 jar has the fix).
> https://www.bouncycastle.org/releasenotes.html#r1rv77 is the latest current 
> release, but the CVE was fixed in 1.74.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19133) "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI property

2024-03-29 Thread Attila Doroszlai (Jira)
Attila Doroszlai created HADOOP-19133:
-

 Summary: "No test bucket" error in ITestS3AContractVectoredRead if 
provided via CLI property
 Key: HADOOP-19133
 URL: https://issues.apache.org/jira/browse/HADOOP-19133
 Project: Hadoop Common
  Issue Type: Bug
  Components: test, tools
Reporter: Attila Doroszlai


ITestS3AContractVectoredRead fails with {{NullPointerException: No test 
bucket}} if the test bucket is defined as {{-Dtest.fs.s3a.name=...}} via the 
CLI, not in {{auth-keys.xml}}. The same setup works for other S3A contract 
tests. Tested on 3.3.6.

{code:title=src/test/resources/auth-keys.xml}
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>${test.fs.s3a.endpoint}</value>
  </property>
  <property>
    <name>fs.contract.test.fs.s3a</name>
    <value>${test.fs.s3a.name}</value>
  </property>
</configuration>
{code}

{code}
export AWS_ACCESS_KEY_ID=''
export AWS_SECRET_KEY=''
mvn -Dtest=ITestS3AContractVectoredRead -Dtest.fs.s3a.name="s3a://mybucket" 
-Dtest.fs.s3a.endpoint="http://localhost:9878/" clean test
{code}

{code:title=test results}
Tests run: 46, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 7.879 s <<< 
FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead
testMinSeekAndMaxSizeDefaultValues[Buffer type : 
direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 1.95 s  <<< ERROR!
java.lang.NullPointerException: No test bucket
  at org.apache.hadoop.util.Preconditions.checkNotNull(Preconditions.java:88)
  at 
org.apache.hadoop.fs.s3a.S3ATestUtils.getTestBucketName(S3ATestUtils.java:714)
  at 
org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides(S3ATestUtils.java:775)
  at 
org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead.testMinSeekAndMaxSizeDefaultValues(ITestS3AContractVectoredRead.java:104)
  ...

testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.176 s  <<< ERROR!
testMultiVectoredReadStatsCollection[Buffer type : 
direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.179 s  <<< ERROR!
testNormalReadVsVectoredReadStatsCollection[Buffer type : 
direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.155 s  <<< ERROR!
testMinSeekAndMaxSizeDefaultValues[Buffer type : 
array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.116 s  <<< ERROR!
testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.102 s  <<< ERROR!
testMultiVectoredReadStatsCollection[Buffer type : 
array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.105 s  <<< ERROR!
testNormalReadVsVectoredReadStatsCollection[Buffer type : 
array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
elapsed: 0.107 s  <<< ERROR!
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19041) further use of StandardCharsets

2024-03-28 Thread Dinesh Chitlangia (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia resolved HADOOP-19041.

Fix Version/s: 3.5.0
 Assignee: PJ Fanning
   Resolution: Fixed

Thanks for the contribution [~fanningpj] 

> further use of StandardCharsets
> ---
>
> Key: HADOOP-19041
> URL: https://issues.apache.org/jira/browse/HADOOP-19041
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> builds on HADOOP-18957



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19124) Update org.ehcache from 3.3.1 to 3.8.2.

2024-03-28 Thread Dinesh Chitlangia (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia resolved HADOOP-19124.

Fix Version/s: 3.5.0
   Resolution: Fixed

Thanks [~slfan1989] for the contribution.

> Update org.ehcache from 3.3.1 to 3.8.2.
> ---
>
> Key: HADOOP-19124
> URL: https://issues.apache.org/jira/browse/HADOOP-19124
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.4.1
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> We need to enhance the caching functionality in Yarn Federation by adding a 
> limit on the number of cached entries. I noticed that the version of 
> org.ehcache is relatively old and requires an upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19131) Assist reflection IO with WrappedOperations class

2024-03-28 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19131:
---

 Summary: Assist reflection IO with WrappedOperations class
 Key: HADOOP-19131
 URL: https://issues.apache.org/jira/browse/HADOOP-19131
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/azure, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


parquet, avro etc. are still stuck building against older hadoop releases. 

This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 
years old (!) such as HADOOP-15229 just aren't picked up.

This lack of openFile() adoption hurts working with files in cloud storage, as:
* extra HEAD requests are made
* read policies can't be explicitly set
* split start/end can't be passed down

Proposed:
# create class org.apache.hadoop.io.WrappedOperations
# add methods to wrap the APIs (a sketch follows below)
# test in contract tests via reflection loading - verifies we have done it 
properly.
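
A sketch of what one such wrapper might look like, built on the existing 
openFile() builder API (the method name and chosen options are illustrative):
{code:java}
// Illustrative: a static method with a plain signature that libraries can
// call via reflection without compiling against the newer builder API.
public final class WrappedOperations {

  private WrappedOperations() {
  }

  public static FSDataInputStream openFile(FileSystem fs, Path path,
      String readPolicy, long splitStart, long splitEnd) throws IOException {
    return FutureIO.awaitFuture(fs.openFile(path)
        .opt("fs.option.openfile.read.policy", readPolicy)
        .opt("fs.option.openfile.split.start", Long.toString(splitStart))
        .opt("fs.option.openfile.split.end", Long.toString(splitEnd))
        .build());
  }
}
{code}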



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19130) FTPFileSystem rename with full qualified path broken

2024-03-26 Thread shawn (Jira)
shawn created HADOOP-19130:
--

 Summary: FTPFileSystem rename with full qualified path broken
 Key: HADOOP-19130
 URL: https://issues.apache.org/jira/browse/HADOOP-19130
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 3.3.6, 3.3.4, 3.3.3, 0.20.2
Reporter: shawn
 Attachments: image-2024-03-27-09-59-12-381.png

When using the fs shell to rename a file on an FTP server, it always gets an 
"Input/output error" when a fully qualified path is passed to it 
(e.g. ftp://user:password@localhost/pathxxx). The reason is that the 
changeWorkingDirectory command underneath is being passed a string with the 
file:// URI prefix, which will not be understood by the FTP server.

!image-2024-03-27-09-59-12-381.png!

The solution should be to pass absoluteSrc.getParent().toUri().getPath() to 
avoid the file:// URI prefix, like this:
{code:java}
--- a/FTPFileSystem.java
+++ b/FTPFileSystem.java
@@ -549,15 +549,15 @@ public class FTPFileSystem extends FileSystem {
       throw new IOException("Destination path " + dst
           + " already exist, cannot rename!");
     }
-    String parentSrc = absoluteSrc.getParent().toUri().toString();
-    String parentDst = absoluteDst.getParent().toUri().toString();
+    URI parentSrc = absoluteSrc.getParent().toUri();
+    URI parentDst = absoluteDst.getParent().toUri();
     String from = src.getName();
     String to = dst.getName();
-    if (!parentSrc.equals(parentDst)) {
+    if (!parentSrc.toString().equals(parentDst.toString())) {
       throw new IOException("Cannot rename parent(source): " + parentSrc
           + ", parent(destination):  " + parentDst);
     }
-    client.changeWorkingDirectory(parentSrc);
+    client.changeWorkingDirectory(parentSrc.getPath().toString());
     boolean renamed = client.rename(from, to);
     return renamed;
   }{code}

There is already a related issue: 
https://issues.apache.org/jira/browse/HADOOP-8653

I wonder why this bug hasn't been fixed.

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits

2024-03-26 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19047.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> Support InMemory Tracking Of S3A Magic Commits
> --
>
> Key: HADOOP-19047
> URL: https://issues.apache.org/jira/browse/HADOOP-19047
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> The following are the operations which happen within a Task when it uses the 
> S3A Magic Committer. 
> *During closing of stream*
> 1. A 0-byte file with the same name as the original file is uploaded to S3 
> using a PUT operation. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152]
>  for more information. This is done so that the downstream application like 
> Spark could get the size of the file which is being written.
> 2. MultiPartUpload(MPU) metadata is uploaded to S3. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176]
>  for more information.
> *During TaskCommit*
> 1. All the MPU metadata which the task wrote to S3 (there will be 'x' 
> metadata files in S3 if a single task writes 'x' files) are read and 
> rewritten to S3 as a single metadata file. Refer 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201]
>  for more information
> Since these operations happen within the Task JVM, we could optimize as well 
> as save cost by storing this information in memory when Task memory usage is 
> not a constraint. Hence the proposal here is to introduce a new MagicCommit 
> Tracker called "InMemoryMagicCommitTracker" which will:
> 1. Store the metadata of the MPU in memory till the Task is committed.
> 2. Store the size of the file, which can be used by the downstream 
> application to get the file size before it is committed/visible at the 
> output path.
> This optimization will save 2 PUT S3 calls, 1 LIST S3 call, and 1 GET S3 call 
> given a Task writes only 1 file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19129) ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite

2024-03-26 Thread Anuj Modi (Jira)
Anuj Modi created HADOOP-19129:
--

 Summary: ABFS: Fixing Test Script Bug and Some Known test Failures 
in ABFS Test Suite
 Key: HADOOP-19129
 URL: https://issues.apache.org/jira/browse/HADOOP-19129
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.4.0, 3.4.1
Reporter: Anuj Modi






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19060) Support hadoop client authentication through keytab configuration.

2024-03-26 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HADOOP-19060.
--
Resolution: Won't Fix

> Support hadoop client authentication through keytab configuration.
> --
>
> Key: HADOOP-19060
> URL: https://issues.apache.org/jira/browse/HADOOP-19060
> Project: Hadoop Common
>  Issue Type: New Feature
>Reporter: Zhaobo Huang
>Assignee: Zhaobo Huang
>Priority: Minor
>  Labels: pull-request-available
>
> *Shield references to the {{UserGroupInformation}} class.*
> The current HDFS client keytab authentication code is as follows:
> {code:java}
> Configuration conf = new Configuration();
> conf.addResource(new 
> Path("/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml"));
> conf.addResource(new 
> Path("/usr/local/service/hadoop/etc/hadoop/core-site.xml"));
> UserGroupInformation.setConfiguration(conf);
> UserGroupInformation.loginUserFromKeytab("foo", "/var/krb5kdc/foo.keytab");
> FileSystem fileSystem = FileSystem.get(conf);
> FileStatus[] fileStatus = fileSystem.listStatus(new Path("/"));
> for (FileStatus status : fileStatus) {
> System.out.println(status.getPath());
> } {code}
> This feature supports configuring keytab information in core-site.xml or 
> hdfs-site.xml. The authentication code is as follows:
> {code:java}
> Configuration conf = new Configuration();
> conf.addResource(new 
> Path("/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml"));
> conf.addResource(new 
> Path("/usr/local/service/hadoop/etc/hadoop/core-site.xml"));
> FileSystem fileSystem = FileSystem.get(conf);
> FileStatus[] fileStatus = fileSystem.listStatus(new Path("/"));
> for (FileStatus status : fileStatus) {
> System.out.println(status.getPath());
> } {code}
> The config of core-site.xml related to authentication is as follows:
> {code:xml}
> <configuration>
>   <property>
>     <name>hadoop.security.authentication</name>
>     <value>kerberos</value>
>   </property>
>   <property>
>     <name>hadoop.client.keytab.principal</name>
>     <value>foo</value>
>   </property>
>   <property>
>     <name>hadoop.client.keytab.file.path</name>
>     <value>/var/krb5kdc/foo.keytab</value>
>   </property>
> </configuration>
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19088) upgrade to jersey-json 1.22.0

2024-03-25 Thread Dinesh Chitlangia (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Chitlangia resolved HADOOP-19088.

Resolution: Fixed

> upgrade to jersey-json 1.22.0
> -
>
> Key: HADOOP-19088
> URL: https://issues.apache.org/jira/browse/HADOOP-19088
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Tidies up support for Jettison and Jackson versions used by Hadoop



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


