[jira] [Resolved] (HADOOP-19168) Upgrade Kafka Clients due to CVEs

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19168.
-
Resolution: Duplicate

rohit, dupe of HADOOP-18962. let's focus on that

> Upgrade Kafka Clients due to CVEs
> -
>
> Key: HADOOP-19168
> URL: https://issues.apache.org/jira/browse/HADOOP-19168
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Rohit Kumar
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade Kafka Clients due to CVEs
> CVE-2023-25194: Affected versions of this package are vulnerable to 
> Deserialization of Untrusted Data when there are gadgets in the 
> {{classpath}}. The server will connect to the attacker's LDAP server and 
> deserialize the LDAP response, which the attacker can use to execute Java 
> deserialization gadget chains on the Kafka Connect server.
> CVSS Score: 8.8 (High)
> [https://nvd.nist.gov/vuln/detail/CVE-2023-25194] 
> CVE-2021-38153
> CVE-2018-17196
> Insufficient Entropy
> [https://security.snyk.io/package/maven/org.apache.kafka:kafka-clients] 
> Upgrade Kafka-Clients to 3.4.0 or higher.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19182) Upgrade kafka to 3.4.0

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19182:

Affects Version/s: 3.4.0

> Upgrade kafka to 3.4.0
> --
>
> Key: HADOOP-19182
> URL: https://issues.apache.org/jira/browse/HADOOP-19182
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.4.0
>Reporter: fuchaohong
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade kafka to 3.4.0 to resolve CVE-2023-25194



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19182) Upgrade kafka to 3.4.0

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19182:

Issue Type: Improvement  (was: Bug)

> Upgrade kafka to 3.4.0
> --
>
> Key: HADOOP-19182
> URL: https://issues.apache.org/jira/browse/HADOOP-19182
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: fuchaohong
>Priority: Major
>  Labels: pull-request-available
>
> Upgrade kafka to 3.4.0 to resolve CVE-2023-25194



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19185) Improve ABFS metric integration with iOStatistics

2024-05-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19185:
---

 Summary: Improve ABFS metric integration with iOStatistics
 Key: HADOOP-19185
 URL: https://issues.apache.org/jira/browse/HADOOP-19185
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Reporter: Steve Loughran


Followup to HADOOP-18325 covering the outstanding comments of

https://github.com/apache/hadoop/pull/6314/files
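
For context, a minimal sketch of how a client can already pull IOStatistics off an ABFS stream; the path is a placeholder and the snippet assumes the abfs connector and account credentials are configured:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.statistics.IOStatistics;
import org.apache.hadoop.fs.statistics.IOStatisticsLogging;
import org.apache.hadoop.fs.statistics.IOStatisticsSupport;

public class AbfsIOStatisticsProbe {

  public static void main(String[] args) throws Exception {
    // Placeholder path; any stream implementing IOStatisticsSource works the same way.
    Path path = new Path("abfs://container@account.dfs.core.windows.net/data/sample.txt");
    try (FileSystem fs = FileSystem.get(path.toUri(), new Configuration());
         FSDataInputStream in = fs.open(path)) {
      in.read(new byte[4096]);
      // Retrieve whatever counters the stream has accumulated so far.
      IOStatistics stats = IOStatisticsSupport.retrieveIOStatistics(in);
      System.out.println(IOStatisticsLogging.ioStatisticsToPrettyString(stats));
    }
  }
}
{code}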





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18325) ABFS: Add correlated metric support for ABFS operations

2024-05-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18325.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> ABFS: Add correlated metric support for ABFS operations
> ---
>
> Key: HADOOP-18325
> URL: https://issues.apache.org/jira/browse/HADOOP-18325
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.3
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Add metrics related to a particular job, specific to number of total 
> requests, retried requests, retry count and others



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-19178) WASB Driver Deprecation and eventual removal

2024-05-22 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848360#comment-17848360
 ] 

Steve Loughran edited comment on HADOOP-19178 at 5/22/24 2:17 PM:
--

makes sense, I've been undertesting it anyway.

It'd be good to have some form of graceful degradation of the wasb driver:

* immediate PR to warn it's deprecated for trunk, 3.4 and 3.3.9 branches; docs 
updated
* after the cut, have some stub fs to fail on instantiate() with a meaningful 
error message (see the sketch below). We did this with s3n, way back. No 
attempt at migration, just a "gone, go look at the docs"


was (Author: ste...@apache.org):
makes sense, I've been undertesting it anyway.

It'd be good for any form of graceful degradation of wasb driver

* immediate PR to warn its deprecated for trunk, 3.4 and 3.3.9 branches, docs 
updated
* after the cut, have some stub fs to fail on instantiate() with meaningful 
error message. We did this with s3n, way back. No attempt at migration, just a 
"done, go look at at the docs"

> WASB Driver Deprecation and eventual removal
> 
>
> Key: HADOOP-19178
> URL: https://issues.apache.org/jira/browse/HADOOP-19178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Sneha Vijayarajan
>Assignee: Sneha Vijayarajan
>Priority: Major
> Fix For: 3.4.1
>
>
> *WASB Driver*
> The WASB driver was developed to support FNS (Flat Namespace) Azure Storage 
> accounts. FNS accounts do not honor file/folder semantics, so HDFS folder 
> operations are mimicked client-side by the WASB driver, and folder 
> operations such as Rename and Delete can incur a lot of IOPS through 
> client-side enumeration and blob-by-blob orchestration of the rename/delete. 
> It is not ideal for other APIs either, as the initial check of whether a path 
> is a file or a folder needs multiple metadata calls. All of this degrades 
> performance.
> To provide better service to analytics customers, Microsoft released ADLS 
> Gen2, which is HNS (Hierarchical Namespace), i.e. a file/folder-aware store. 
> The ABFS driver was designed to overcome the inherent deficiencies of WASB, 
> and customers were advised to migrate to the ABFS driver.
> *Customers who still use the legacy WASB driver and the challenges they face* 
> Some of our customers have not migrated to the ABFS driver yet and continue 
> to use the legacy WASB driver with FNS accounts.  
> These customers face the following challenges: 
>  * They cannot leverage the optimizations and benefits of the ABFS driver.
>  * They need to deal with compatibility issues if files and folders are 
> modified with the legacy WASB driver and the ABFS driver concurrently during 
> a phased transition.
>  * There are differences in the features supported for FNS and HNS over the 
> ABFS driver.
>  * In certain cases, they must perform a significant amount of re-work on 
> their workloads to migrate to the ABFS driver, which is available only on HNS 
> enabled accounts in a fully tested and supported scenario.
> *Deprecation plans for WASB*
> We are introducing a new feature that will enable the ABFS driver to support 
> FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will 
> enable customers to use the ABFS driver to interact with data stored in GPv2 
> (General Purpose v2) storage accounts. 
> With this feature, the customers who still use the legacy WASB driver will be 
> able to migrate to the ABFS driver without much re-work on their workloads. 
> They will however need to change the URIs from the WASB scheme to the ABFS 
> scheme. 
> Once ABFS driver has built FNS support capability to migrate WASB customers, 
> WASB driver will be declared deprecated in OSS documentation and marked for 
> removal in next major release. This will remove any ambiguity for new 
> customer onboards as there will be only one Microsoft driver for Azure 
> Storage and migrating customers will get SLA bound support for driver and 
> service, which was not guaranteed over WASB.
>  We anticipate that this feature will serve as a stepping stone for customers 
> to move to HNS enabled accounts with the ABFS driver, which is our 
> recommended stack for big data analytics on ADLS Gen2. 
> *Any Impact for* *existing customers who are using ADLS Gen2 (HNS enabled 
> account) with ABFS driver* *?*
> This feature does not impact the existing customers who are using ADLS Gen2 
> (HNS enabled account) with ABFS driver.
> They do not need to make any changes to their workloads or configurations. 
> They will still enjoy the benefits of HNS, such as atomic operations, 
> fine-grained access control, scalability, and performance. 
> *Official recommendation*
> Microsoft continues to recommend all 

[jira] [Updated] (HADOOP-19177) TestS3ACachingBlockManager fails intermittently in Yetus

2024-05-21 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19177:

Parent: HADOOP-18028
Issue Type: Sub-task  (was: Test)

> TestS3ACachingBlockManager fails intermittently in Yetus
> 
>
> Key: HADOOP-19177
> URL: https://issues.apache.org/jira/browse/HADOOP-19177
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Priority: Major
>
> {code:java}
> [ERROR] 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet 
> -- Time elapsed: 60.45 s <<< ERROR!
> java.lang.IllegalStateException: waitForCaching: expected: 1, actual: 0, read 
> errors: 0, caching errors: 1
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:465)
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGetHelper(TestS3ACachingBlockManager.java:435)
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet(TestS3ACachingBlockManager.java:398)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:750)
> [INFO] 
> [INFO] Results:
> [INFO] 
> [ERROR] Errors: 
> [ERROR] 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingFailureOfGet
> [ERROR]   Run 1: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1
> [ERROR]   Run 2: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1
> [ERROR]   Run 3: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1 {code}
> Discovered in 
> [https://github.com/apache/hadoop/pull/6646#issuecomment-2111558054] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19177) TestS3ACachingBlockManager fails intermittently in Yetus

2024-05-21 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848364#comment-17848364
 ] 

Steve Loughran commented on HADOOP-19177:
-

before going near this I want HADOOP-18184 in, as it's a big update. It's got 
some intermittent/recurrent failures too, but I'm not sure they are new. And 
I've been working on it intermittently enough that I keep forgetting where I 
was.

* I think somehow the cache/prefetch doesn't work reliably.
* after getting my patch in I want to pull vector IO out of s3InputStream and 
share across both impls

> TestS3ACachingBlockManager fails intermittently in Yetus
> 
>
> Key: HADOOP-19177
> URL: https://issues.apache.org/jira/browse/HADOOP-19177
> Project: Hadoop Common
>  Issue Type: Test
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Priority: Major
>
> {code:java}
> [ERROR] 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet 
> -- Time elapsed: 60.45 s <<< ERROR!
> java.lang.IllegalStateException: waitForCaching: expected: 1, actual: 0, read 
> errors: 0, caching errors: 1
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:465)
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGetHelper(TestS3ACachingBlockManager.java:435)
>   at 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet(TestS3ACachingBlockManager.java:398)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:750)
> [INFO] 
> [INFO] Results:
> [INFO] 
> [ERROR] Errors: 
> [ERROR] 
> org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingFailureOfGet
> [ERROR]   Run 1: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1
> [ERROR]   Run 2: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1
> [ERROR]   Run 3: 
> TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465
>  IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching 
> errors: 1 {code}
> Discovered in 
> [https://github.com/apache/hadoop/pull/6646#issuecomment-2111558054] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19178) WASB Driver Deprecation and eventual removal

2024-05-21 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848360#comment-17848360
 ] 

Steve Loughran commented on HADOOP-19178:
-

makes sense, I've been undertesting it anyway.

It'd be good to have some form of graceful degradation of the wasb driver:

* immediate PR to warn it's deprecated for trunk, 3.4 and 3.3.9 branches; docs 
updated
* after the cut, have some stub fs to fail on instantiate() with a meaningful 
error message. We did this with s3n, way back. No attempt at migration, just a 
"done, go look at the docs"

> WASB Driver Deprecation and eventual removal
> 
>
> Key: HADOOP-19178
> URL: https://issues.apache.org/jira/browse/HADOOP-19178
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Sneha Vijayarajan
>Assignee: Sneha Vijayarajan
>Priority: Major
> Fix For: 3.4.1
>
>
> *WASB Driver*
> The WASB driver was developed to support FNS (Flat Namespace) Azure Storage 
> accounts. FNS accounts do not honor file/folder semantics, so HDFS folder 
> operations are mimicked client-side by the WASB driver, and folder 
> operations such as Rename and Delete can incur a lot of IOPS through 
> client-side enumeration and blob-by-blob orchestration of the rename/delete. 
> It is not ideal for other APIs either, as the initial check of whether a path 
> is a file or a folder needs multiple metadata calls. All of this degrades 
> performance.
> To provide better service to analytics customers, Microsoft released ADLS 
> Gen2, which is HNS (Hierarchical Namespace), i.e. a file/folder-aware store. 
> The ABFS driver was designed to overcome the inherent deficiencies of WASB, 
> and customers were advised to migrate to the ABFS driver.
> *Customers who still use the legacy WASB driver and the challenges they face* 
> Some of our customers have not migrated to the ABFS driver yet and continue 
> to use the legacy WASB driver with FNS accounts.  
> These customers face the following challenges: 
>  * They cannot leverage the optimizations and benefits of the ABFS driver.
>  * They need to deal with compatibility issues if files and folders are 
> modified with the legacy WASB driver and the ABFS driver concurrently during 
> a phased transition.
>  * There are differences in the features supported for FNS and HNS over the 
> ABFS driver.
>  * In certain cases, they must perform a significant amount of re-work on 
> their workloads to migrate to the ABFS driver, which is available only on HNS 
> enabled accounts in a fully tested and supported scenario.
> *Deprecation plans for WASB*
> We are introducing a new feature that will enable the ABFS driver to support 
> FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will 
> enable customers to use the ABFS driver to interact with data stored in GPv2 
> (General Purpose v2) storage accounts. 
> With this feature, the customers who still use the legacy WASB driver will be 
> able to migrate to the ABFS driver without much re-work on their workloads. 
> They will however need to change the URIs from the WASB scheme to the ABFS 
> scheme. 
> Once ABFS driver has built FNS support capability to migrate WASB customers, 
> WASB driver will be declared deprecated in OSS documentation and marked for 
> removal in next major release. This will remove any ambiguity for new 
> customer onboards as there will be only one Microsoft driver for Azure 
> Storage and migrating customers will get SLA bound support for driver and 
> service, which was not guaranteed over WASB.
>  We anticipate that this feature will serve as a stepping stone for customers 
> to move to HNS enabled accounts with the ABFS driver, which is our 
> recommended stack for big data analytics on ADLS Gen2. 
> *Any Impact for* *existing customers who are using ADLS Gen2 (HNS enabled 
> account) with ABFS driver* *?*
> This feature does not impact the existing customers who are using ADLS Gen2 
> (HNS enabled account) with ABFS driver.
> They do not need to make any changes to their workloads or configurations. 
> They will still enjoy the benefits of HNS, such as atomic operations, 
> fine-grained access control, scalability, and performance. 
> *Official recommendation*
> Microsoft continues to recommend that all Big Data and Analytics customers 
> use Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver, and will continue 
> to optimize this scenario in future. We believe that this new option will 
> help those customers transition to a supported scenario immediately, while 
> they plan to ultimately move to ADLS Gen2 (HNS enabled accounts).
>  *New Authentication options that a WASB to ABFS Driver migrating customer 
> will get*
> Below auth types that WASB provides 

[jira] [Resolved] (HADOOP-19163) Upgrade protobuf version to 3.25.3

2024-05-21 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19163.
-
Resolution: Fixed

done. not sure what version to tag with.

Proposed: we cut a new release of this

> Upgrade protobuf version to 3.25.3
> --
>
> Key: HADOOP-19163
> URL: https://issues.apache.org/jira/browse/HADOOP-19163
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hadoop-thirdparty
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19181) IAMCredentialsProvider throttle failures

2024-05-21 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17848336#comment-17848336
 ] 

Steve Loughran commented on HADOOP-19181:
-


Spent some time looking into the AWS SDK with Harshit Gupta and Mukund Thakur

h2. AWS API docs

The AWS docs say callers should retry with backoff on throttling, but they 
don't say which error code is used. Assume 503 for consistency with other 
services (s3): 
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instancedata-data-retrieval.html#instancedata-throttling

h2. v1 SDK Credential collection

Looking at the v1 SDK com.amazonaws.auth.BaseCredentialsFetcher:

* will probe for credentials whenever it's been 10 minutes since the last check,
* or when the clock has passed the expiry time
* the refresh-before-expiry window is 15 minutes before expiry
* credential retrieval will log and continue if existing credentials exist, even
  if they have expired (no retry)


h2. V2 SDK
* There is no attempt to retry on a GET of credentials from EC2 instances 
(InstanceProfileCredentialsProvider)
* There is a retry policy for container credentials; the GET is retried 5 times 
with no delay on any 5xx error.

When does prefetch take place?

{code}
private Instant prefetchTime(Instant expiration) {
    Instant now = clock.instant();

    if (expiration == null) {
        return now.plus(60, MINUTES);
    }

    Duration timeUntilExpiration = Duration.between(now, expiration);
    if (timeUntilExpiration.isNegative()) {
        // IMDS gave us a time in the past. We're already stale. Don't prefetch.
        return null;
    }

    return now.plus(maximum(timeUntilExpiration.dividedBy(2), Duration.ofMinutes(5)));
}
{code}

If you get credentials and the expiry time is under 5 minutes away, prefetching 
will not take place.
Worker processes launched a few minutes before session credential expiry will 
not get any refresh until the credentials are considered stale.

When are credentials considered stale?

{code}
return expiration.minusSeconds(1);
{code}

so there's only 1s of margin for a blocking fetch; any clock drift *or jvm 
pause* eats into even that.

And if that request fails

{code}
Instant newStaleTime = jitterTime(now, Duration.ofMillis(1),
    maxStaleFailureJitter(numFailures));
log.warn(() -> "(" + cachedValueName + ") Cached value expiration has been extended to "
    + newStaleTime + " because calling the downstream service failed (consecutive failures: "
    + numFailures + ").", e);

return currentCachedValue.toBuilder()
    .staleTime(newStaleTime)
    .build();

{code}

There is no jitter enabled in the prefetch, only in the retrieval of stale 
credentials.

And that can be up to 10s, even though the credentials expire in 1s. 

{code}
private Duration maxStaleFailureJitter(int numFailures) {
    long exponentialBackoffMillis = (1L << numFailures - 1) * 100;
    return ComparableUtils.minimum(Duration.ofMillis(exponentialBackoffMillis),
        Duration.ofSeconds(10));
}
{code}

A single failure of the GET for any reason is going to return credentials that 
are inevitably out of date.
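
To make the numbers concrete, a standalone sketch that mirrors the arithmetic of maxStaleFailureJitter above (not SDK code, just the same formula):

{code:java}
import java.time.Duration;

public class StaleJitterMath {

  // Same formula as the SDK snippet: (1 << (numFailures - 1)) * 100ms, capped at 10s.
  static Duration maxStaleFailureJitter(int numFailures) {
    long exponentialBackoffMillis = (1L << numFailures - 1) * 100;
    Duration backoff = Duration.ofMillis(exponentialBackoffMillis);
    return backoff.compareTo(Duration.ofSeconds(10)) < 0 ? backoff : Duration.ofSeconds(10);
  }

  public static void main(String[] args) {
    // Prints 100ms, 200ms, 400ms ... 6400ms, then the 10s cap from the 8th failure on.
    for (int failures = 1; failures <= 8; failures++) {
      System.out.println(failures + " failure(s) -> "
          + maxStaleFailureJitter(failures).toMillis() + "ms");
    }
  }
}
{code}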

h3. ContainerCredentialsProvider

This class does choose a different prefetch policy, retaining the 15 minute 
refresh-before-expiry window.
{code}
private Instant prefetchTime(Instant expiration) {
    Instant oneHourFromNow = Instant.now().plus(1, ChronoUnit.HOURS);

    if (expiration == null) {
        return oneHourFromNow;
    }

    Instant fifteenMinutesBeforeExpiration = expiration.minus(15, ChronoUnit.MINUTES);

    return ComparableUtils.minimum(oneHourFromNow, fifteenMinutesBeforeExpiration);
}
{code}

It also has a retry policy on failure:

{code}
private static final int MAX_RETRIES = 5;

@Override
public boolean shouldRetry(int retriesAttempted,
    ResourcesEndpointRetryParameters retryParams) {
    if (retriesAttempted >= MAX_RETRIES) {
        return false;
    }

    Integer statusCode = retryParams.getStatusCode();
    if (statusCode != null
        && HttpStatusFamily.of(statusCode) == HttpStatusFamily.SERVER_ERROR) {
        return true;
    }

    return retryParams.getException() instanceof IOException;
}
{code}

The retry policy means there is a brief attempt at recovery, without the cache 
jitter logic getting involved.

This probably makes it more resilient to failures, though if there are load 
problems,
the sequence of 5 GET requests will not help.

Hypothesised failure conditions:

* Many processes are launched so close together that they all prefetch at 
about the same time; and since the credentials on the same server expire at 
exactly the same time for every process, if the prefetch hasn't taken place it 
will happen only when the credentials are considered stale.
* Or multiple s3a clients to different filesystems in the same process.
* This happens with < 1s to go, so it is brittle to clock drift, process swap, 
jvm gc etc.

Changes to suggest for SDK

* 

[jira] [Commented] (HADOOP-18990) S3A: retry on credential expiry

2024-05-20 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847848#comment-17847848
 ] 

Steve Loughran commented on HADOOP-18990:
-

noticed that an AWS product (Greengrass) has implemented its recovery for this 
through special handling of 400 plus an error-text scan. Ugly, but clearly 
something we will have to consider too.
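
A rough sketch of what that kind of handling could look like on our side with the v2 SDK; the matched message text is taken from the stack in the description below and is an assumption, not a documented error contract:

{code:java}
import software.amazon.awssdk.awscore.exception.AwsServiceException;

/** Heuristic check: does this 400 actually mean "token/credentials expired"? */
public final class ExpiredTokenDetector {

  private ExpiredTokenDetector() {
  }

  public static boolean isExpiredTokenFailure(Throwable t) {
    if (!(t instanceof AwsServiceException)) {
      return false;
    }
    AwsServiceException e = (AwsServiceException) t;
    if (e.statusCode() != 400) {
      return false;
    }
    // Text scan, as Greengrass reportedly does; brittle, but the 400 carries no
    // error detail we are sure we can key off.
    String message = e.getMessage();
    return message != null && message.contains("The provided token has expired");
  }
}
{code}

If that returned true, the retry logic could refresh credentials and retry rather than surfacing a hard auth failure.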



> S3A: retry on credential expiry
> ---
>
> Key: HADOOP-18990
> URL: https://issues.apache.org/jira/browse/HADOOP-18990
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> Reported in AWS SDK https://github.com/aws/aws-sdk-java-v2/issues/3408
> bq. In RetryableStage execute method, the "AwsCredentails" does not attempt 
> to renew if it has expired. Therefore, if a method called with the existing 
> credential is expiring soon, the number of retry is less than intended due to 
> the expiration of the credential.
> The stack from this report doesn't show any error detail we can use to 
> identify the 400 exception as something we should be retrying on. This could 
> be due to the logging, or it could actually hold. We'd have to generate some 
> session credentials, let them expire and then see how hadoop fs commands 
> fail. Something to do by hand, as an STS test to do this is probably slow. 
> *unless we expire all session credentials of a given role?* Could be good, 
> but would be traumatic for other test runs.
> {code}
> software.amazon.awssdk.services.s3.model.S3Exception: The provided token has 
> expired. (Service: S3, Status Code: 400, Request ID: 3YWKVBNJPNTXPJX2, 
> Extended Request ID: 
> GkR56xA0r/Ek7zqQdB2ZdP3wqMMhf49HH7hc5N2TAIu47J3HEk6yvSgVNbX7ADuHDy/Irhr2rPQ=)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19181) IAMCredentialsProvider throttle failures

2024-05-20 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17847840#comment-17847840
 ] 

Steve Loughran commented on HADOOP-19181:
-

reviewing AWS SDK JIRA, there's still a shorter timeout of 1s on IAM requests. 

Can't help wondering if we should think about doing the async fetch stuff 
ourselves so we can collect stats on what's happening etc. Don't really want to 
though...
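
For illustration, a very rough sketch of what "doing it ourselves" could mean: a process-wide singleton wrapping an SDK provider, refreshing in the background and counting fetches/failures. All names and the refresh interval are hypothetical:

{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;
import software.amazon.awssdk.auth.credentials.AwsCredentials;
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;

/** Process-wide wrapper: one background refresher, simple counters for stats. */
public final class SharedIamCredentials implements AwsCredentialsProvider {

  private static volatile SharedIamCredentials instance;

  private final AwsCredentialsProvider delegate;
  private final AtomicReference<AwsCredentials> cached = new AtomicReference<>();
  private final AtomicLong fetches = new AtomicLong();
  private final AtomicLong failures = new AtomicLong();
  private final ScheduledExecutorService refresher =
      Executors.newSingleThreadScheduledExecutor();

  private SharedIamCredentials(AwsCredentialsProvider delegate) {
    this.delegate = delegate;
    // Refresh well ahead of any plausible expiry window rather than the SDK's 1s margin.
    refresher.scheduleAtFixedRate(this::refresh, 0, 5, TimeUnit.MINUTES);
  }

  /** One instance per process, shared by every filesystem. */
  public static SharedIamCredentials getInstance(AwsCredentialsProvider delegate) {
    if (instance == null) {
      synchronized (SharedIamCredentials.class) {
        if (instance == null) {
          instance = new SharedIamCredentials(delegate);
        }
      }
    }
    return instance;
  }

  private void refresh() {
    fetches.incrementAndGet();
    try {
      cached.set(delegate.resolveCredentials());
    } catch (RuntimeException e) {
      // Keep the last good credentials; retry/backoff and error stats would hook in here.
      failures.incrementAndGet();
    }
  }

  @Override
  public AwsCredentials resolveCredentials() {
    AwsCredentials c = cached.get();
    if (c == null) {
      refresh();          // first caller pays for a blocking fetch
      c = cached.get();
    }
    if (c == null) {
      throw new IllegalStateException("No IAM credentials available");
    }
    return c;
  }

  public long fetchCount() { return fetches.get(); }
  public long failureCount() { return failures.get(); }
}
{code}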

> IAMCredentialsProvider throttle failures
> 
>
> Key: HADOOP-19181
> URL: https://issues.apache.org/jira/browse/HADOOP-19181
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> Tests report throttling errors in IAM being remapped to noauth and failure
> Again, impala tests, but with multiple processes on the same host. This means 
> that HADOOP-18945 isn't sufficient: even if it ensures a singleton instance 
> per process,
> * it doesn't help if there are many test buckets (fixable)
> * it doesn't work across processes (not fixable)
> we may be able to 
> * use a singleton across all filesystem instances
> * once we know how throttling is reported, handle it through retries + 
> error/stats collection



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19181) IAMCredentialsProvider throttle failures

2024-05-20 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19181:

Description: 
Tests report throttling errors in IAM being remapped to noauth and failure

Again, impala tests, but with multiple processes on the same host. This means 
that HADOOP-18945 isn't sufficient: even if it ensures a singleton instance per 
process,
* it doesn't help if there are many test buckets (fixable)
* it doesn't work across processes (not fixable)

we may be able to 
* use a singleton across all filesystem instances
* once we know how throttling is reported, handle it through retries + 
error/stats collection




  was:
Tests report throttling errors in IAM being remapped to noauth and failure

Again, impala tests, but with multiple processes on same host. this means that 
HADOOP-18945 isn't sufficient as even if it ensures a singleton instance for a 
process
* it doesn't if there are many test buckets (fixable)
* it doesn't work across processes (not fixable)

we may be able to 
* use a singleton across all filesystem instances
* once we know how throttling is reported, handle it through retries + 
error/stats collection


{code}
2024-02-17T18:02:10,175  WARN [TThreadPoolServer WorkerProcess-22] 
fs.FileSystem: Failed to initialize fileystem 
s3a://impala-test-uswest2-1/test-warehouse/test_num_values_def_levels_mismatch_15b31ddb.db/too_many_def_levels:
 java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2024-02-17T18:02:10,175 ERROR [TThreadPoolServer WorkerProcess-22] 
utils.MetaStoreUtils: Got exception: java.nio.file.AccessDeniedException 
impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No 
AWS Credentials provided by TemporaryAWSCredentialsProvider 
SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.maybeTranslateCredentialException(AWSCredentialProviderList.java:351)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:201) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$2(S3AFileSystem.java:972)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2748)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:970)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.doBucketProbing(S3AFileSystem.java:859) 

[jira] [Created] (HADOOP-19181) IAMCredentialsProvider throttle failures

2024-05-20 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19181:
---

 Summary: IAMCredentialsProvider throttle failures
 Key: HADOOP-19181
 URL: https://issues.apache.org/jira/browse/HADOOP-19181
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Tests report throttling errors in IAM being remapped to noauth and failure

Again, impala tests, but with multiple processes on the same host. This means 
that HADOOP-18945 isn't sufficient: even if it ensures a singleton instance per 
process,
* it doesn't help if there are many test buckets (fixable)
* it doesn't work across processes (not fixable)

we may be able to 
* use a singleton across all filesystem instances
* once we know how throttling is reported, handle it through retries + 
error/stats collection


{code}
2024-02-17T18:02:10,175  WARN [TThreadPoolServer WorkerProcess-22] 
fs.FileSystem: Failed to initialize fileystem 
s3a://impala-test-uswest2-1/test-warehouse/test_num_values_def_levels_mismatch_15b31ddb.db/too_many_def_levels:
 java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2024-02-17T18:02:10,175 ERROR [TThreadPoolServer WorkerProcess-22] 
utils.MetaStoreUtils: Got exception: java.nio.file.AccessDeniedException 
impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No 
AWS Credentials provided by TemporaryAWSCredentialsProvider 
SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider 
IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
java.nio.file.AccessDeniedException: impala-test-uswest2-1: 
org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials 
provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider 
EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : 
software.amazon.awssdk.core.exception.SdkClientException: Unable to load 
credentials from system settings. Access key must be specified either via 
environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
at 
org.apache.hadoop.fs.s3a.AWSCredentialProviderList.maybeTranslateCredentialException(AWSCredentialProviderList.java:351)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:201) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$2(S3AFileSystem.java:972)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
 ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2748)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:970)
 ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.doBucketProbing(S3AFileSystem.java:859) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:715) 
~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3452) 
~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at 

[jira] [Updated] (HADOOP-18722) Optimise S3A delete objects when multiObjectDelete is disabled

2024-05-17 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18722:

Affects Version/s: 3.3.6

> Optimise S3A delete objects when multiObjectDelete is disabled
> --
>
> Key: HADOOP-18722
> URL: https://issues.apache.org/jira/browse/HADOOP-18722
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Mehakmeet Singh
>Assignee: Mehakmeet Singh
>Priority: Major
>
> Currently, for a bulk delete in S3A we rely on the multiObjectDelete call, 
> but when this property is disabled we delete one key at a time. We can 
> optimize this scenario by adding parallelism; see the sketch below.
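
A rough sketch of the idea, fanning the single-key deletes out over an executor; the client wiring and pool size are placeholders, not the actual S3A internals:

{code:java}
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.DeleteObjectRequest;

public class ParallelSingleDelete {

  /** Delete each key with its own DeleteObject call, fanned out over a small pool. */
  public static void deleteAll(S3Client s3, String bucket, List<String> keys) {
    ExecutorService pool = Executors.newFixedThreadPool(8); // pool size is a placeholder
    try {
      CompletableFuture<?>[] deletes = keys.stream()
          .map(key -> CompletableFuture.runAsync(() ->
              s3.deleteObject(DeleteObjectRequest.builder()
                  .bucket(bucket)
                  .key(key)
                  .build()), pool))
          .toArray(CompletableFuture[]::new);
      CompletableFuture.allOf(deletes).join(); // surfaces the first failure, if any
    } finally {
      pool.shutdown();
    }
  }
}
{code}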



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18679) Add API for bulk/paged delete of files and objects

2024-05-17 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18679:

Summary: Add API for bulk/paged delete of files and objects  (was: Add API 
for bulk/paged object deletion)

> Add API for bulk/paged delete of files and objects
> --
>
> Key: HADOOP-18679
> URL: https://issues.apache.org/jira/browse/HADOOP-18679
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> iceberg and hbase could benefit from being able to give a list of individual 
> files to delete: files which may be scattered around the bucket, for better 
> read performance. 
> Add some new optional interface for an object store which allows a caller to 
> submit a list of paths to files to delete, where
> the expectation is
> * if a path is a file: delete
> * if a path is a dir, outcome undefined
> For s3 that'd let us build these into DeleteRequest objects, and submit, 
> without any probes first.
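
For illustration, a sketch of the shape such an optional interface could take; the names and return type are illustrative only, not the committed API:

{code:java}
import java.io.IOException;
import java.util.Collection;
import java.util.Map;
import org.apache.hadoop.fs.Path;

/**
 * Optional store capability: delete a batch of files in a single call.
 * Paths are expected to be files; the outcome for directories is undefined.
 */
public interface BulkFileDelete {

  /** Upper bound on the number of paths accepted in one call. */
  int pageSize();

  /**
   * Delete the given paths.
   * @return per-path failures; empty if everything was deleted.
   */
  Map<Path, IOException> bulkDelete(Collection<Path> paths) throws IOException;
}
{code}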



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720

2024-05-16 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19172.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> Upgrade aws-java-sdk to 1.12.720
> 
>
> Key: HADOOP-19172
> URL: https://issues.apache.org/jira/browse/HADOOP-19172
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Update to the latest AWS SDK, to stop anyone worrying about the ion library 
> CVE https://nvd.nist.gov/vuln/detail/CVE-2024-21634
> This isn't exposed in the s3a client, but may be used downstream. 
> On v2 SDK releases, the v1 SDK is only used during builds; on 3.3.x it is shipped.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19073) WASB: Fix connection leak in FolderRenamePending

2024-05-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19073.
-
Resolution: Fixed

> WASB: Fix connection leak in FolderRenamePending
> 
>
> Key: HADOOP-19073
> URL: https://issues.apache.org/jira/browse/HADOOP-19073
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.3.6
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Fix connection leak in FolderRenamePending in getting bytes  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19073) WASB: Fix connection leak in FolderRenamePending

2024-05-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-19073:
---

Assignee: xy

> WASB: Fix connection leak in FolderRenamePending
> 
>
> Key: HADOOP-19073
> URL: https://issues.apache.org/jira/browse/HADOOP-19073
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.3.6
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
>
> Fix connection leak in FolderRenamePending in getting bytes  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19073) WASB: Fix connection leak in FolderRenamePending

2024-05-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19073:

Fix Version/s: 3.5.0

> WASB: Fix connection leak in FolderRenamePending
> 
>
> Key: HADOOP-19073
> URL: https://issues.apache.org/jira/browse/HADOOP-19073
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Affects Versions: 3.3.6
>Reporter: xy
>Assignee: xy
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Fix connection leak in FolderRenamePending in getting bytes  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19170) Fixes compilation issues on Mac

2024-05-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19170:

Fix Version/s: 3.4.1

> Fixes compilation issues on Mac
> ---
>
> Key: HADOOP-19170
> URL: https://issues.apache.org/jira/browse/HADOOP-19170
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: OS:  macOS Catalina 10.15.7
> compiler: clang 12.0.0
> cmake: 3.24.0
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> When I build the hadoop-common native code on macOS, I hit this error:
> {code:java}
> /x/hadoop/hadoop-common-project/hadoop-common/src/main/native/src/exception.c:114:50:
>  error: function-like macro '__GLIBC_PREREQ' is not defined
> #if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32) {code}
> The reason is that macOS does not provide glibc, and the C preprocessor must 
> expand every macro in the conditional expression before it can evaluate it, 
> so the undefined function-like macro __GLIBC_PREREQ is an error.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19176) S3A Xattr headers need hdfs-compatible prefix

2024-05-15 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19176:
---

 Summary: S3A Xattr headers need hdfs-compatible prefix
 Key: HADOOP-19176
 URL: https://issues.apache.org/jira/browse/HADOOP-19176
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


The s3a xattr list needs a prefix compatible with hdfs, or existing code which 
tries to copy attributes between stores can break.

we need a prefix of {user/trusted/security/system/raw}.

now, problem: currently xattrs are used by the magic committer to propagate 
file size progress; renaming the prefix will break existing code. But as it's 
read only we could modify spark to look for both old and new values.

{code}

org.apache.hadoop.HadoopIllegalArgumentException: An XAttr name must be 
prefixed with user/trusted/security/system/raw, followed by a '.'
at org.apache.hadoop.hdfs.XAttrHelper.buildXAttr(XAttrHelper.java:77) 
at org.apache.hadoop.hdfs.DFSClient.setXAttr(DFSClient.java:2835) 
at 
org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3106)
 
at 
org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3102)
 
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.setXAttr(DistributedFileSystem.java:3115)
 
at org.apache.hadoop.fs.FileSystem.setXAttr(FileSystem.java:3097)

{code}
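
A small sketch of the "look for both old and new values" idea on the reader side; the attribute names are illustrative, not the final prefix choice:

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class MagicMarkerXattr {

  private MagicMarkerXattr() {
  }

  /** Try a new prefixed attribute name first, then fall back to the legacy name. */
  public static byte[] readMarker(FileSystem fs, Path path) throws IOException {
    // Hypothetical names: a "user."-prefixed attribute plus the legacy unprefixed one.
    String[] candidates = {
        "user.header.x-hadoop-s3a-magic-data-length",
        "header.x-hadoop-s3a-magic-data-length"};
    for (String name : candidates) {
      try {
        byte[] value = fs.getXAttr(path, name);
        if (value != null) {
          return value;
        }
      } catch (IOException ignored) {
        // attribute not present under this name; try the next candidate
      }
    }
    return null;
  }
}
{code}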




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18786:
---

Assignee: Christopher Tubbs

> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Assignee: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified so that it does not require access to fixed 
> versions in the archives (or we should work with the upstream of those 
> dependent projects to publish their releases elsewhere for routine 
> consumption). In the interim, the source could be updated to point to the 
> current dependency versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18786) Hadoop build depends on archives.apache.org

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18786:

Fix Version/s: 3.5.0

> Hadoop build depends on archives.apache.org
> ---
>
> Key: HADOOP-18786
> URL: https://issues.apache.org/jira/browse/HADOOP-18786
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Christopher Tubbs
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Several times throughout Hadoop's source, the ASF archive is referenced, 
> including part of the build that downloads Yetus.
> Building a release from source should not require access to the ASF archives, 
> as that contributes to end users being subject to throttling and blocking by 
> INFRA, for "abuse" of the archives, even though they are merely building a 
> current ASF release from source. This is particularly problematic for 
> downstream packagers who must build from Hadoop's source, or for CI/CD 
> situations that depend on Hadoop's source, and particularly problematic for 
> those end users behind a NAT gateway, because even if Hadoop's use of the 
> archive is modest, it adds up for multiple users.
> The build should be modified so that it does not require access to fixed 
> versions in the archives (or we should work with the upstream of those 
> dependent projects to publish their releases elsewhere for routine 
> consumption). In the interim, the source could be updated to point to the 
> current dependency versions available on downloads.apache.org.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18958) Improve UserGroupInformation debug log

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18958:
---

Assignee: wangzhihui

>  Improve UserGroupInformation debug log
> ---
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Assignee: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using "new Exception()" to print the call stack of the doAs method in the 
> UserGroupInformation class prints meaningless exception information and far 
> too many call stacks; this is not conducive to troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18958) Improve UserGroupInformation debug log

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18958.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

>  Improve UserGroupInformation debug log
> ---
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Assignee: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using “new Exception()” to print the call stack of the "doAs" method in 
> the UserGroupInformation class prints meaningless exception information 
> and too many call stack frames; this is not conducive to 
> troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18958) Improve UserGroupInformation debug log

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18958:

Summary:  Improve UserGroupInformation debug log  (was: 
UserGroupInformation debug log improve)

>  Improve UserGroupInformation debug log
> ---
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using “new Exception()” to print the call stack of the "doAs" method in 
> the UserGroupInformation class prints meaningless exception information 
> and too many call stack frames; this is not conducive to 
> troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-18958) UserGroupInformation debug log improve

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reopened HADOOP-18958:
-

> UserGroupInformation debug log improve
> --
>
> Key: HADOOP-18958
> URL: https://issues.apache.org/jira/browse/HADOOP-18958
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.3.0, 3.3.5
>Reporter: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
> Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 
> 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, 
> image-2023-10-30-14-35-11-161.png
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using “new Exception()” to print the call stack of the "doAs" method in 
> the UserGroupInformation class prints meaningless exception information 
> and too many call stack frames; this is not conducive to 
> troubleshooting.
> *example:*
> !20231029-122825.jpeg|width=991,height=548!
>  
> *improved result* :
>  
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19171) S3A: handle alternative forms of connection failure

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19171:

Description: 
We've had reports of network connection failures surfacing deeper in the stack 
where we don't convert to AWSApiCallTimeoutException so they aren't retried 
properly (retire connection and repeat)


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}

note, this is the v1 sdk but the 400 error is treated as fail-fast in all our 
versions, and I don't think we do the same for the broken pipe. That one is 
going to be trickier to handle because, unless it is coming from the http/tls 
libraries, "broken pipe" may not be present in the newer builds. We'd have to look for 
the string in the SDKs to see what causes it and go from there



  was:
We've had reports of network connection failures surfacing deeper in the stack 
where we don't convert to AWSApiCallTimeoutException so they aren't retried 
properly (retire connection and repeat)


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}

note, this is v1 sdk but the 400 error is treated as fail-fast in all our 
versions




> S3A: handle alternative forms of connection failure
> ---
>
> Key: HADOOP-19171
> URL: https://issues.apache.org/jira/browse/HADOOP-19171
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> We've had reports of network connection failures surfacing deeper in the 
> stack where we don't convert to AWSApiCallTimeoutException so they aren't 
> retried properly (retire connection and repeat)
> {code}
> Unable to execute HTTP request: Broken pipe (Write failed)
> {code}
> {code}
>  Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout
> {code}
> note, this is the v1 sdk but the 400 error is treated as fail-fast in all our 
> versions, and I don't think we do the same for the broken pipe. That one is 
> going to be trickier to handle because, unless it is coming from the http/tls 
> libraries, "broken pipe" may not be present in the newer builds. We'd have to look for 
> the string in the SDKs to see what causes it and go from there
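
As an illustration of the handling being proposed, here is a hedged sketch that classifies the two failure strings above as retryable connectivity errors; the helper name and the plain string matching are assumptions, not the actual S3A retry policy.

{code:java}
// Illustrative sketch only, not the S3A retry policy: classify the two
// failure modes quoted above as retryable connectivity errors.
public class ConnectionFailureClassifierSketch {

  static boolean isRetryableConnectivityFailure(Throwable error, int statusCode) {
    for (Throwable cause = error; cause != null; cause = cause.getCause()) {
      String msg = String.valueOf(cause.getMessage());
      if (msg.contains("Broken pipe")) {
        return true;   // socket write failed mid-request: retry on a fresh connection
      }
      if (statusCode == 400 && msg.contains("RequestTimeout")) {
        return true;   // idle-connection timeout surfaced as a 400 RequestTimeout
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Throwable brokenPipe = new RuntimeException(
        "Unable to execute HTTP request: Broken pipe (Write failed)");
    System.out.println(isRetryableConnectivityFailure(brokenPipe, -1));      // true
    Throwable requestTimeout = new RuntimeException(
        "Your socket connection to the server was not read from or written to "
            + "within the timeout period. Error Code: RequestTimeout");
    System.out.println(isRetryableConnectivityFailure(requestTimeout, 400)); // true
  }
}
{code}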



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19171) S3A: handle alternative forms of connection failure

2024-05-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19171:

Description: 
We've had reports of network connection failures surfacing deeper in the stack 
where we don't convert to AWSApiCallTimeoutException so they aren't retried 
properly (retire connection and repeat)


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}

note, this is v1 sdk but the 400 error is treated as fail-fast in all our 
versions



  was:
We've had reports of network connection failures surfacing deeper in the stack 
where we don't convert to AWSApiCallTimeoutException so they aren't retried 
properly (retire connection and repeat)


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}




> S3A: handle alternative forms of connection failure
> ---
>
> Key: HADOOP-19171
> URL: https://issues.apache.org/jira/browse/HADOOP-19171
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> We've had reports of network connection failures surfacing deeper in the 
> stack where we don't convert to AWSApiCallTimeoutException so they aren't 
> retried properly (retire connection and repeat)
> {code}
> Unable to execute HTTP request: Broken pipe (Write failed)
> {code}
> {code}
>  Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout
> {code}
> note, this is v1 sdk but the 400 error is treated as fail-fast in all our 
> versions



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19175) update s3a committer docs

2024-05-14 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19175:
---

 Summary: update s3a committer docs
 Key: HADOOP-19175
 URL: https://issues.apache.org/jira/browse/HADOOP-19175
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Update s3a committer docs

* declare that magic committer is stable and make it the recommended one
* show how to use new command "mapred successfile" to print the success file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19174) Tez and hive jobs fail due to google's protobuf 2.5.0 in classpath

2024-05-14 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846230#comment-17846230
 ] 

Steve Loughran commented on HADOOP-19174:
-

* which hadoop version
 * what happens if you remove the hadoop protobuf-2.5 jar.

it should be cuttable from 3.4.0 unless you need the hbase 1 timeline server. 
If you are using an older release, upgrade first

> Tez and hive jobs fail due to google's protobuf 2.5.0 in classpath
> --
>
> Key: HADOOP-19174
> URL: https://issues.apache.org/jira/browse/HADOOP-19174
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> There are two issues here:
> *1. We are running tez 0.10.3 which uses hadoop 3.3.6 version. Tez has 
> protobuf version 3.21.1*
> Below is the exception we get. This is due to protobuf-2.5.0 in our hadoop 
> classpath
> {code:java}
> java.lang.IllegalAccessError: class 
> org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto tried to access 
> private field com.google.protobuf.AbstractMessage.memoizedSize 
> (org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto and 
> com.google.protobuf.AbstractMessage are in unnamed module of loader 'app')
> at 
> org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.getSerializedSize(DAGProtos.java:21636)
> at 
> com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
> at org.apache.tez.common.TezUtils.writeConfInPB(TezUtils.java:170)
> at org.apache.tez.common.TezUtils.createByteStringFromConf(TezUtils.java:83)
> at org.apache.tez.common.TezUtils.createUserPayloadFromConf(TezUtils.java:101)
> at org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:436)
> at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at org.apache.tez.dag.app.DAGAppMaster$9.run(DAGAppMaster.java:2600)
> at 
> java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
> at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
> at 
> org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2597)
> at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2384)
> 2024-04-18 16:27:54,741 [INFO] [shutdown-hook-0] |app.DAGAppMaster|: 
> DAGAppMasterShutdownHook invoked
> 2024-04-18 16:27:54,743 [INFO] [shutdown-hook-0] |service.AbstractService|: 
> Service org.apache.tez.dag.app.DAGAppMaster failed in state STOPPED
> java.lang.NullPointerException: Cannot invoke 
> "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
> "this.taskSchedulerManager" is null
> at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
> at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
> at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
> at 
> org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHook.run(DAGAppMaster.java:2432)
> at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> at java.base/java.lang.Thread.run(Thread.java:840)
> 2024-04-18 16:27:54,744 [WARN] [Thread-2] |util.ShutdownHookManager|: 
> ShutdownHook 'DAGAppMasterShutdownHook' failed, 
> java.util.concurrent.ExecutionException: java.lang.NullPointerException: 
> Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" 
> because "this.taskSchedulerManager" is null
> java.util.concurrent.ExecutionException: java.lang.NullPointerException: 
> Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" 
> because "this.taskSchedulerManager" is null
> at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)
> at 
> org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
> at 
> org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
> Caused by: java.lang.NullPointerException: Cannot invoke 
> "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because 
> "this.taskSchedulerManager" is null
> at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
> at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
> at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
> at 
> 

[jira] [Commented] (HADOOP-19165) Explore dropping protobuf 2.5.0 from the distro

2024-05-13 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846021#comment-17846021
 ] 

Steve Loughran commented on HADOOP-19165:
-

relates to HADOOP-18487, where I tried to do most of this, but still couldn't 
stop it cropping up in yarn. 

> Explore dropping protobuf 2.5.0 from the distro
> ---
>
> Key: HADOOP-19165
> URL: https://issues.apache.org/jira/browse/HADOOP-19165
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Priority: Major
>
> Explore whether protobuf-2.5.0 can be dropped from the distro; it is a transitive 
> dependency from HBase, but HBase doesn't use it in the code.
> Check if it is the only thing pulling it into the distro and whether something will break 
> if we exclude it; if not, let's get rid of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720

2024-05-13 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-19172:
---

Assignee: Steve Loughran

> Upgrade aws-java-sdk to 1.12.720
> 
>
> Key: HADOOP-19172
> URL: https://issues.apache.org/jira/browse/HADOOP-19172
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> Update to the latest AWS SDK, to stop anyone worrying about the ion library 
> CVE https://nvd.nist.gov/vuln/detail/CVE-2024-21634
> This isn't exposed in the s3a client, but may be used downstream. 
> on v2 sdk releases, the v1 sdk is only used during builds; 3.3.x it is shipped



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720

2024-05-13 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19172:
---

 Summary: Upgrade aws-java-sdk to 1.12.720
 Key: HADOOP-19172
 URL: https://issues.apache.org/jira/browse/HADOOP-19172
 Project: Hadoop Common
  Issue Type: Improvement
  Components: build, fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


Update to the latest AWS SDK, to stop anyone worrying about the ion library CVE 
https://nvd.nist.gov/vuln/detail/CVE-2024-21634

This isn't exposed in the s3a client, but may be used downstream. 

on v2 sdk releases, the v1 sdk is only used during builds; 3.3.x it is shipped



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19171) S3A: handle alternative forms of connection failure

2024-05-13 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845921#comment-17845921
 ] 

Steve Loughran commented on HADOOP-19171:
-

stack 2

{code}

Caused by: org.apache.hadoop.fs.s3a.AWSBadRequestException: Writing Object on 
 : com.amazonaws.services.s3.model.AmazonS3Exception: Your socket 
connection to the server was not read from or written to within the timeout 
period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; 
Error Code: RequestTimeout; Request ID: :RequestTimeout: Your socket connection 
to the server was not read from or written to within the timeout period. Idle 
connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: 
RequestTimeout; 
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:244)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:119)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:322)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:414)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:318)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:293)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:209)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:564)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:552)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:219)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:219)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Your socket 
connection to the server was not read from or written to within the timeout 
period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; 
Error Code: RequestTimeout; 
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1879)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1418)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1387)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
at 
com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:421)
at 
com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:6531)
at 
com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1861)
at 
com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1821)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$putObjectDirect$17(S3AFileSystem.java:2782)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfSupplier(IOStatisticsBinding.java:604)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.putObjectDirect(S3AFileSystem.java:2779)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$putObject$7(WriteOperationHelper.java:567)
at 
org.apache.hadoop.fs.store.audit.AuditingFunctions.lambda$withinAuditSpan$0(AuditingFunctions.java:62)

{code}


> S3A: handle alternative forms of connection 

[jira] [Commented] (HADOOP-19171) S3A: handle alternative forms of connection failure

2024-05-13 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845920#comment-17845920
 ] 

Steve Loughran commented on HADOOP-19171:
-

stack 1


{code}
Caused by: org.apache.hadoop.fs.s3a.AWSClientIOException: upload part #1 upload 
ID X: com.amazonaws.SdkClientException: Unable to execute HTTP request: 
Broken pipe (Write failed): Unable to execute HTTP request: Broken pipe (Write 
failed)
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:209)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:119)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:322)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:414)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:318)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:293)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:209)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.uploadPart(WriteOperationHelper.java:660)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream$MultiPartUpload.lambda$uploadBlockAsync$0(S3ABlockOutputStream.java:807)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:219)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$CallableWithPermitRelease.call(SemaphoredDelegatingExecutor.java:219)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: 
Broken pipe (Write failed)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
at 
com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:3887)
at 
com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:3872)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.uploadPart(S3AFileSystem.java:2827)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.lambda$uploadPart$10(WriteOperationHelper.java:665)
at 
org.apache.hadoop.fs.store.audit.AuditingFunctions.lambda$withinAuditSpan$0(AuditingFunctions.java:62)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)

{code}


> S3A: handle alternative forms of connection failure
> ---
>
> Key: HADOOP-19171
> URL: https://issues.apache.org/jira/browse/HADOOP-19171
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> We've had reports of network connection failures surfacing deeper in the 
> stack where we don't convert to AWSApiCallTimeoutException so they aren't 
> retried properly (retire connection and repeat)
> {code}
> Unable to execute HTTP request: Broken pipe (Write failed)
> {code}
> {code}
>  Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, 

[jira] [Created] (HADOOP-19171) AWS v2: handle alternative forms of connection failure

2024-05-13 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19171:
---

 Summary: AWS v2: handle alternative forms of connection failure
 Key: HADOOP-19171
 URL: https://issues.apache.org/jira/browse/HADOOP-19171
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran


We've had reports of network connection failures surfacing deeper in the stack 
where we don't convert to AWSApiCallTimeoutException so they aren't retried 
properly (retire connection and repeat)


{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}


{code}
 Your socket connection to the server was not read from or written to within 
the timeout period. Idle connections will be closed. (Service: Amazon S3; 
Status Code: 400; Error Code: RequestTimeout
{code}





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19171) S3A: handle alternative forms of connection failure

2024-05-13 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19171:

Summary: S3A: handle alternative forms of connection failure  (was: AWS v2: 
handle alternative forms of connection failure)

> S3A: handle alternative forms of connection failure
> ---
>
> Key: HADOOP-19171
> URL: https://issues.apache.org/jira/browse/HADOOP-19171
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> We've had reports of network connection failures surfacing deeper in the 
> stack where we don't convert to AWSApiCallTimeoutException so they aren't 
> retried properly (retire connection and repeat)
> {code}
> Unable to execute HTTP request: Broken pipe (Write failed)
> {code}
> {code}
>  Your socket connection to the server was not read from or written to within 
> the timeout period. Idle connections will be closed. (Service: Amazon S3; 
> Status Code: 400; Error Code: RequestTimeout
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19161) S3A: option "fs.s3a.performance.flags" to take list of performance flags

2024-05-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844398#comment-17844398
 ] 

Steve Loughran commented on HADOOP-19161:
-

HADOOP-18544 is the delete optimisation. We know this is very brittle, so we could 
maybe split out a

delete-no-parent-recreate

flag to distinguish it from any future delete optimisations, such as skipping a LIST 
for delete(recursive=false)

> S3A: option "fs.s3a.performance.flags" to take list of performance flags
> 
>
> Key: HADOOP-19161
> URL: https://issues.apache.org/jira/browse/HADOOP-19161
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> HADOOP-19072 shows we want to add more optimisations than that of 
> HADOOP-18930.
> * Extending the new optimisations to the existing option is brittle
> * Adding explicit options for each feature gets complex fast.
> Proposed
> * A new class S3APerformanceFlags keeps all the flags
> * it builds this from a string[] of values, which can be extracted from 
> getConf(),
> * and it can also support a "*" option to mean "everything"
> * this class can also be handed off to hasPathCapability() and do the right 
> thing.
> Proposed optimisations
> * create file (we will hook up HADOOP-18930)
> * mkdir (HADOOP-19072)
> * delete (probe for parent path)
> * rename (probe for source path)
> We could think of more, with different names, later.
> The goal is to make it possible to strip out every HTTP request we do for 
> safety/posix compliance, so applications have the option of turning off what 
> they don't need.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19163) Upgrade protobuf version to 3.24.4

2024-05-06 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843730#comment-17843730
 ] 

Steve Loughran commented on HADOOP-19163:
-

i'm on vacation this week.

> Upgrade protobuf version to 3.24.4
> --
>
> Key: HADOOP-19163
> URL: https://issues.apache.org/jira/browse/HADOOP-19163
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hadoop-thirdparty
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19164) Hadoop CLI MiniCluster is broken

2024-05-04 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17843489#comment-17843489
 ] 

Steve Loughran commented on HADOOP-19164:
-

* be nice to cut the need for mockito out
* and we should be able to add a test for this in 
https://github.com/apache/hadoop-release-support  ... something which just 
tries to issue the command in the unzipped distro dir (see the sketch below)

and yes "NOTE: You will need protoc 2.5.0 installed." is out. maybe we should 
grep the docs for "protobuf"
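
A hedged sketch of that smoke-test idea (the class name and the way the distro directory is supplied are assumptions; the real test wiring may differ):

{code:java}
import java.io.File;
import java.io.IOException;

// Hedged sketch of the suggested smoke test: run the CLI MiniCluster command
// from an unpacked distribution directory and fail on a non-zero exit code.
// The distro directory is supplied by the caller; this is not the real harness.
public class MiniClusterSmokeTestSketch {
  public static void main(String[] args) throws IOException, InterruptedException {
    File distroDir = new File(args[0]);   // e.g. the unzipped hadoop-x.y.z directory
    String mapredCli = new File(distroDir, "bin/mapred").getAbsolutePath();
    Process process = new ProcessBuilder(mapredCli, "minicluster", "-format")
        .directory(distroDir)
        .inheritIO()
        .start();
    int exitCode = process.waitFor();
    if (exitCode != 0) {
      throw new AssertionError("'mapred minicluster -format' failed with exit code " + exitCode);
    }
  }
}
{code}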

> Hadoop CLI MiniCluster is broken
> 
>
> Key: HADOOP-19164
> URL: https://issues.apache.org/jira/browse/HADOOP-19164
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: Ayush Saxena
>Priority: Major
>
> Documentation is also broken & it doesn't work either
> (https://apache.github.io/hadoop/hadoop-project-dist/hadoop-common/CLIMiniCluster.html)
> *Fails with:*
> {noformat}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/mockito/stubbing/Answer
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2666)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2680)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989)
>   at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:588)
>   at 
> org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132)
>   at 
> org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320)
> Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
>   ... 9 more{noformat}
> {*}Command executed:{*}
> {noformat}
> bin/mapred minicluster -format{noformat}
> *Documentation Issues:*
> {noformat}
> bin/mapred minicluster -rmport RM_PORT -jhsport JHS_PORT{noformat}
> Without the -format option it doesn't work the first time, complaining that the Namenode isn't 
> formatted, so this should be corrected.
> {noformat}
> 2024-05-04 00:35:52,933 WARN namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: NameNode is not formatted.
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:253)
> {noformat}
> This isn't required either:
> {noformat}
> NOTE: You will need protoc 2.5.0 installed.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19161) S3A: option "fs.s3a.performance.flags" to take list of performance flags

2024-05-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19161:

Summary: S3A: option "fs.s3a.performance.flags" to take list of performance 
flags  (was: S3A: support a comma separated list of performance flags)

> S3A: option "fs.s3a.performance.flags" to take list of performance flags
> 
>
> Key: HADOOP-19161
> URL: https://issues.apache.org/jira/browse/HADOOP-19161
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.1
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> HADOOP-19072 shows we want to add more optimisations than that of 
> HADOOP-18930.
> * Extending the new optimisations to the existing option is brittle
> * Adding explicit options for each feature gets complex fast.
> Proposed
> * A new class S3APerformanceFlags keeps all the flags
> * it builds this from a string[] of values, which can be extracted from 
> getConf(),
> * and it can also support a "*" option to mean "everything"
> * this class can also be handed off to hasPathCapability() and do the right 
> thing.
> Proposed optimisations
> * create file (we will hook up HADOOP-18930)
> * mkdir (HADOOP-19072)
> * delete (probe for parent path)
> * rename (probe for source path)
> We could think of more, with different names, later.
> The goal is to make it possible to strip out every HTTP request we do for 
> safety/posix compliance, so applications have the option of turning off what 
> they don't need.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19161) S3A: support a comma separated list of performance flags

2024-05-02 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19161:
---

 Summary: S3A: support a comma separated list of performance flags
 Key: HADOOP-19161
 URL: https://issues.apache.org/jira/browse/HADOOP-19161
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.1
Reporter: Steve Loughran
Assignee: Steve Loughran


HADOOP-19072 shows we want to add more optimisations than that of HADOOP-18930.

* Extending the new optimisations to the existing option is brittle
* Adding explicit options for each feature gets complex fast.

Proposed
* A new class S3APerformanceFlags keeps all the flags
* it builds this from a string[] of values, which can be extracted from 
getConf(),
* and it can also support a "*" option to mean "everything"
* this class can also be handed off to hasPathCapability() and do the right 
thing.

Proposed optimisations
* create file (we will hook up HADOOP-18930)
* mkdir (HADOOP-19072)
* delete (probe for parent path)
* rename (probe for source path)

We could think of more, with different names, later.
The goal is to make it possible to strip out every HTTP request we do for 
safety/posix compliance, so applications have the option of turning off what 
they don't need.
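
A hedged sketch of what parsing such a comma separated option could look like; the flag names and the class below are illustrative placeholders, not the final API.

{code:java}
import java.util.EnumSet;
import java.util.Locale;

// Illustrative sketch of parsing a comma separated list of performance flags,
// with "*" meaning "everything". Flag names are placeholders, not the final set.
public class PerformanceFlagsSketch {

  enum Flag { CREATE, MKDIR, DELETE, RENAME }

  static EnumSet<Flag> parse(String value) {
    EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
    for (String token : value.split(",")) {
      String t = token.trim();
      if (t.isEmpty()) {
        continue;
      }
      if ("*".equals(t)) {
        return EnumSet.allOf(Flag.class);   // wildcard: enable every optimisation
      }
      flags.add(Flag.valueOf(t.toUpperCase(Locale.ROOT)));
    }
    return flags;
  }

  public static void main(String[] args) {
    // e.g. fs.s3a.performance.flags = "create, mkdir"
    System.out.println(parse("create, mkdir"));   // [CREATE, MKDIR]
    System.out.println(parse("*"));               // [CREATE, MKDIR, DELETE, RENAME]
  }
}
{code}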



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19160) hadoop-auth should not depend on kerb-simplekdc

2024-05-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19160:

Affects Version/s: 3.4.0

> hadoop-auth should not depend on kerb-simplekdc
> ---
>
> Key: HADOOP-19160
> URL: https://issues.apache.org/jira/browse/HADOOP-19160
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: auth
>Affects Versions: 3.4.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> HADOOP-16179 attempted to remove dependency on {{kerb-simplekdc}} from 
> {{hadoop-common}}.  However, {{hadoop-auth}} still has a compile-scope 
> dependency on the same, and {{hadoop-common}} proper depends on 
> {{hadoop-auth}}.  So {{kerb-simplekdc}} is still a transitive dependency of 
> {{hadoop-common}}.
> {code}
> [INFO] --- maven-dependency-plugin:3.0.2:tree (default-cli) @ hadoop-common 
> ---
> [INFO] org.apache.hadoop:hadoop-common:jar:3.5.0-SNAPSHOT
> ...
> [INFO] +- org.apache.hadoop:hadoop-auth:jar:3.5.0-SNAPSHOT:compile
> ...
> [INFO] |  \- org.apache.kerby:kerb-simplekdc:jar:2.0.3:compile
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19146.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> All tests accessing noaa-cors-pds use us-east-1 region, as configured at 
> bucket level. If global endpoint is configured (e.g. us-west-2), they fail to 
> access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
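
One illustrative way to address the region mismatch described above (a sketch only, not necessarily the committed test change) is to pin the bucket's region with the standard s3a per-bucket override, so it wins over whatever global {{fs.s3a.endpoint.region}} is configured:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Sketch: per-bucket region override for the public noaa-cors-pds data,
// so tests keep working even when the global endpoint region differs.
public class NoaaCorsPdsRegionSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint.region", "us-west-2");                       // global default
    conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1");  // per-bucket override
    System.out.println(conf.get("fs.s3a.bucket.noaa-cors-pds.endpoint.region"));
  }
}
{code}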

[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19146:

Priority: Minor  (was: Major)

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> All tests accessing noaa-cors-pds use us-east-1 region, as configured at 
> bucket level. If global endpoint is configured (e.g. us-west-2), they fail to 
> access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
> 

[jira] [Commented] (HADOOP-19107) Drop support for HBase v1 & upgrade HBase v2

2024-04-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842024#comment-17842024
 ] 

Steve Loughran commented on HADOOP-19107:
-

yay!
# add in release notes?
# backport to 3.4.1?
# does this mean we can strip out protobuf 2.5 from our redistributed artifacts?

> Drop support for HBase v1 & upgrade HBase v2
> 
>
> Key: HADOOP-19107
> URL: https://issues.apache.org/jira/browse/HADOOP-19107
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Drop support for Hbase V1 and make building Hbase v2 default.
> Dev List:
> [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs]
> https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-19159:
---

Assignee: Xi Chen

> Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
> --
>
> Key: HADOOP-19159
> URL: https://issues.apache.org/jira/browse/HADOOP-19159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Xi Chen
>Assignee: Xi Chen
>Priority: Minor
>  Labels: pull-request-available
>
> The description about `fs.s3a.committer.abort.pending.uploads` in the 
> _Concurrent Jobs writing to the same destination_ section is not entirely correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17842017#comment-17842017
 ] 

Steve Loughran commented on HADOOP-19159:
-

thanks! merged to all the maintained branches

> Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
> --
>
> Key: HADOOP-19159
> URL: https://issues.apache.org/jira/browse/HADOOP-19159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.6
>Reporter: Xi Chen
>Assignee: Xi Chen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> The description about `fs.s3a.committer.abort.pending.uploads` in the 
> _Concurrent Jobs writing to the same destination_ section is not entirely correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19159:

Affects Version/s: 3.3.6

> Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
> --
>
> Key: HADOOP-19159
> URL: https://issues.apache.org/jira/browse/HADOOP-19159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.6
>Reporter: Xi Chen
>Assignee: Xi Chen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> The description about `fs.s3a.committer.abort.pending.uploads` in the 
> _Concurrent Jobs writing to the same destination_ section is not entirely correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads

2024-04-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19159.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
> --
>
> Key: HADOOP-19159
> URL: https://issues.apache.org/jira/browse/HADOOP-19159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Xi Chen
>Assignee: Xi Chen
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> The description about `fs.s3a.committer.abort.pending.uploads` in the 
> _Concurrent Jobs writing to the same destination_ section is not entirely correct.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19158) S3A: Support ByteBufferPositionedReadable through vector IO

2024-04-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19158:

Summary: S3A: Support ByteBufferPositionedReadable through vector IO  (was: 
S3A: Support delegating ByteBufferPositionedReadable through vector IO)

> S3A: Support ByteBufferPositionedReadable through vector IO
> ---
>
> Key: HADOOP-19158
> URL: https://issues.apache.org/jira/browse/HADOOP-19158
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> Make it easy for any stream with vector io to support 
> {{ByteBufferPositionedReadable}}
> Specifically, {{ByteBufferPositionedReadable.readFully()}}
> is exactly a single range read so is easy to read.
> the simpler read() call which can return less isn't part of the vector API.
> Proposed: invoke the readFully() but convert an EOFException to -1 
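A minimal sketch of that proposal, assuming the enclosing stream already implements readVectored(); the wiring shown here is illustrative rather than the final patch:

{code:java}
// Sketch: readFully() is exactly one vectored range read.
public void readFully(long position, ByteBuffer buf) throws IOException {
  FileRange range = FileRange.createFileRange(position, buf.remaining());
  // hand the caller's buffer to the vectored read and wait for completion
  readVectored(Collections.singletonList(range), len -> buf);
  FutureIO.awaitFuture(range.getData());
}

// Sketch: the plain read() delegates to readFully(), mapping EOF to -1.
public int read(long position, ByteBuffer buf) throws IOException {
  int len = buf.remaining();
  try {
    readFully(position, buf);
    return len;
  } catch (EOFException e) {
    return -1;   // proposed: reading past the end of the file maps to -1
  }
}
{code}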



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19158) S3A: Support delegating ByteBufferPositionedReadable through vector IO

2024-04-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19158:

Summary: S3A: Support delegating ByteBufferPositionedReadable through 
vector IO  (was: Support delegating ByteBufferPositionedReadable to vector 
reads)

> S3A: Support delegating ByteBufferPositionedReadable through vector IO
> --
>
> Key: HADOOP-19158
> URL: https://issues.apache.org/jira/browse/HADOOP-19158
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> Make it easy for any stream with vector io to support {{ByteBufferPositionedReadable}}
> Specifically, 
> ByteBufferPositionedReadable.readFully()
> is exactly a single range read so is easy to read.
> the simpler read() call which can return less isn't part of the vector API.
> Proposed: invoke the readFully() but convert an EOFException to -1 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19158) S3A: Support delegating ByteBufferPositionedReadable through vector IO

2024-04-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19158:

Description: 
Make it easy for any stream with vector io to support 
{{ByteBufferPositionedReadable}}

Specifically, {{ByteBufferPositionedReadable.readFully()}}

is exactly a single range read so is easy to read.

the simpler read() call which can return less isn't part of the vector API.
Proposed: invoke the readFully() but convert an EOFException to -1 

  was:
Make it easy for any stream with vector io to suppor

Specifically, 

ByteBufferPositionedReadable.readFully()

is exactly a single range read so is easy to read.

the simpler read() call which can return less isn't part of the vector API.
Proposed: invoke the readFully() but convert an EOFException to -1 


> S3A: Support delegating ByteBufferPositionedReadable through vector IO
> --
>
> Key: HADOOP-19158
> URL: https://issues.apache.org/jira/browse/HADOOP-19158
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> Make it easy for any stream with vector io to support 
> {{ByteBufferPositionedReadable}}
> Specifically, {{ByteBufferPositionedReadable.readFully()}}
> is exactly a single range read so is easy to read.
> the simpler read() call which can return less isn't part of the vector API.
> Proposed: invoke the readFully() but convert an EOFException to -1 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19158) Support delegating ByteBufferPositionedReadable to vector reads

2024-04-25 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19158:
---

 Summary: Support delegating ByteBufferPositionedReadable to vector 
reads
 Key: HADOOP-19158
 URL: https://issues.apache.org/jira/browse/HADOOP-19158
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


Make it easy for any stream with vector io to support {{ByteBufferPositionedReadable}}

Specifically, 

ByteBufferPositionedReadable.readFully()

is exactly a single range read so is easy to read.

the simpler read() call which can return less isn't part of the vector API.
Proposed: invoke the readFully() but convert an EOFException to -1 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19157) [ABFS] Filesystem contract tests to use methodPath for robust parallel test runs

2024-04-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840122#comment-17840122
 ] 

Steve Loughran commented on HADOOP-19157:
-

note: this is not a problem with abfs - it just has the most ambitious test 
runner.

{code}

[ERROR] 
testMkdirsPopulatingAllNonexistentAncestors(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractMkdir)
  Time elapsed: 0.475 s  <<< ERROR!
java.io.FileNotFoundException: 
abfs://stevel-test...@stevelukwest.dfs.core.windows.net/fork-0002/test/testMkdirsPopulatingAllNonexistentAncestors/a/b/c/d/e/f/g/h/i/j/k/L
 nested dir should exist: not found 
abfs://stevel-test...@stevelukwest.dfs.core.windows.net/fork-0002/test/testMkdirsPopulatingAllNonexistentAncestors/a/b/c/d/e/f/g/h/i/j/k/L
 in 
abfs://stevel-test...@stevelukwest.dfs.core.windows.net/fork-0002/test/testMkdirsPopulatingAllNonexistentAncestors/a/b/c/d/e/f/g/h/i/j/k
at 
org.apache.hadoop.fs.contract.ContractTestUtils.verifyPathExists(ContractTestUtils.java:985)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.assertPathExists(ContractTestUtils.java:963)
at 
org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertPathExists(AbstractFSContractTestBase.java:319)
at 
org.apache.hadoop.fs.contract.AbstractContractMkdirTest.testMkdirsPopulatingAllNonexistentAncestors(AbstractContractMkdirTest.java:150)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: Operation failed: "The specified path 
does not exist.", 404, HEAD, 
https://stevelukwest.dfs.core.windows.net/stevel-testing/fork-0002/test/testMkdirsPopulatingAllNonexistentAncestors/a/b/c/d/e/f/g/h/i/j/k/L?upn=false=getStatus=90s,
 rId: 50a0ad90-f01f-0065-688c-95083600
at 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.checkException(AzureBlobFileSystem.java:1503)
at 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:736)
at 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:724)
at 
org.apache.hadoop.fs.contract.ContractTestUtils.verifyPathExists(ContractTestUtils.java:979)
... 18 more
Caused by: Operation failed: "The specified path does not exist.", 404, HEAD, 
https://stevelukwest.dfs.core.windows.net/stevel-testing/fork-0002/test/testMkdirsPopulatingAllNonexistentAncestors/a/b/c/d/e/f/g/h/i/j/k/L?upn=false=getStatus=90s,
 rId: 50a0ad90-f01f-0065-688c-95083600
at 
org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:270)
at 
org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:216)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.measureDurationOfInvocation(IOStatisticsBinding.java:494)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:465)
at 
org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:214)
at 
org.apache.hadoop.fs.azurebfs.services.AbfsClient.getPathStatus(AbfsClient.java:1083)
at 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:1115)
at 
org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:734)
... 20 more

[ERROR] 
testNoMkdirOverFile(org.apache.hadoop.fs.azurebfs.contract.ITestAbfsFileSystemContractMkdir)
  Time elapsed: 0.437 s  <<< ERROR!
java.io.FileNotFoundException: Operation failed: "The specified path does not 
exist.", 404, HEAD, 

[jira] [Created] (HADOOP-19157) [ABFS] Filesystem contract tests to use methodPath for robust parallel test runs

2024-04-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19157:
---

 Summary: [ABFS] Filesystem contract tests to use methodPath for 
robust parallel test runs
 Key: HADOOP-19157
 URL: https://issues.apache.org/jira/browse/HADOOP-19157
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure, test
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


hadoop-azure supports parallel test runs, but unlike hadoop-aws, the azure ones 
are parallelised across methods in the same test suites.

this can fail badly where contract tests have hard-coded filenames and assume 
they can be reused across all test cases. This shows up when you are testing on 
a store with reduced IO capacity, triggering retries and making some test cases 
slower.

Fix: hadoop-common contract tests to use methodPath() names
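A small before/after sketch of the intended change, assuming a methodPath() helper that resolves a path unique to the running test method (as proposed here); the filename is arbitrary:

{code:java}
// Before: a hard-coded name shared by every test method in the suite;
// parallel method runs within the suite can collide on it.
Path shared = path("testfile");

// After: a path derived from the current test method's name, so parallel
// runs of different methods never clash.
Path perMethod = new Path(methodPath(), "testfile");
{code}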



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19102) [ABFS]: FooterReadBufferSize should not be greater than readBufferSize

2024-04-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19102.
-
Fix Version/s: 3.5.0
   3.4.1
   Resolution: Fixed

> [ABFS]: FooterReadBufferSize should not be greater than readBufferSize
> --
>
> Key: HADOOP-19102
> URL: https://issues.apache.org/jira/browse/HADOOP-19102
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.0
>Reporter: Pranav Saxena
>Assignee: Pranav Saxena
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> The method `optimisedRead` creates a buffer array of size `readBufferSize`. 
> If footerReadBufferSize is greater than readBufferSize, abfs will attempt to 
> read more data than the buffer array can hold, which causes an exception.
> Change: To avoid this, we will keep footerBufferSize = 
> min(readBufferSizeConfig, footerBufferSizeConfig)
>  
>  
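A one-line sketch of the clamp described above; the variable names are illustrative rather than the actual fields in the ABFS stream:

{code:java}
// Never let the footer read buffer exceed the main read buffer, otherwise
// optimisedRead would try to read more bytes than its array can hold.
int effectiveFooterReadBufferSize = Math.min(readBufferSize, footerReadBufferSize);
{code}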



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19085) Compatibility Benchmark over HCFS Implementations

2024-04-22 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17839704#comment-17839704
 ] 

Steve Loughran commented on HADOOP-19085:
-

that's really interesting. abfs has full filesystem semantics; s3 doesn't and 
we always trade off correctness for performance.

* can you attach the results?
* regarding other connectors, gcs is the obvious one

> Compatibility Benchmark over HCFS Implementations
> -
>
> Key: HADOOP-19085
> URL: https://issues.apache.org/jira/browse/HADOOP-19085
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: fs, test
>Affects Versions: 3.4.0
>Reporter: Han Liu
>Assignee: Han Liu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
> Attachments: HADOOP-19085.001.patch, HDFS Compatibility Benchmark 
> Design.pdf
>
>
> {*}Background:{*}Hadoop-Compatible File System (HCFS) is a core concept in the 
> big data storage ecosystem, providing unified interfaces and generally clear 
> semantics, and has become the de-facto standard for industry storage systems 
> to follow and conform with. There have been a series of HCFS implementations 
> in Hadoop, such as S3AFileSystem for Amazon's S3 Object Store, WASB for 
> Microsoft's Azure Blob Storage and OSS connector for Alibaba Cloud Object 
> Storage, and more from storage service's providers on their own.
> {*}Problems:{*}However, as indicated by introduction.md, there is no formal 
> suite to do compatibility assessment of a file system for all such HCFS 
> implementations. Thus, whether the functionality is well accomplished and 
> meets the core compatible expectations mainly relies on service provider's 
> own report. Meanwhile, Hadoop is also developing and new features are 
> continuously contributing to HCFS interfaces for existing implementations to 
> follow and update, in which case, Hadoop also needs a tool to quickly assess 
> if these features are supported or not for a specific HCFS implementation. 
> Besides, the known hadoop command line tool or hdfs shell is used to directly 
> interact with a HCFS storage system, where most commands correspond to 
> specific HCFS interfaces and work well. Still, there are cases that are 
> complicated and may not work, like expunge command. To check such commands 
> for an HCFS, we also need an approach to figure them out.
> {*}Proposal:{*}Accordingly, we propose to define a formal HCFS compatibility 
> benchmark and provide corresponding tool to do the compatibility assessment 
> for an HCFS storage system. The benchmark and tool should consider both HCFS 
> interfaces and hdfs shell commands. Different scenarios require different 
> kinds of compatibilities. For such consideration, we could define different 
> suites in the benchmark.
> *Benefits:* We intend the benchmark and tool to be useful for both storage 
> providers and storage users. For end users, it can be used to evaluate the 
> compatibility level and determine if the storage system in question is 
> suitable for the required scenarios. For storage providers, it helps to 
> quickly generate an objective and reliable report about core functions of 
> the storage service. As an instance, if the HCFS got a 100% on a suite named 
> 'tpcds', it is demonstrated that all functions needed by a tpcds program have 
> been well achieved. It is also a guide indicating how storage service 
> abilities can map to HCFS interfaces, such as storage class on S3.
> Any thoughts? Comments and feedback are mostly welcomed. Thanks in advance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19083) provide hadoop binary tarball without aws v2 sdk

2024-04-19 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19083:

Description: 
Have the default hadoop binary .tar.gz exclude the aws v2 sdk by default. 

This SDK brings the total size of the distribution to about 1 GB.

Proposed
* add a profile to include the aws sdk in the dist module
* document it for local building
* for release builds, we modify our release ant builds to generate modified x86 
and arm64 releases without the file.





  was:
Have the default hadoop binary .tar.gz exclude the aws v2 sdk by default. 

This SDK brings the total size of the distribution to about 1 GB.

Proposed
* add a profile to include the aws sdk in the dist module
* disable it by default

Instead we document which version is needed. 
The hadoop-aws and hadoop-cloud storage maven artifacts will declare their 
dependencies, so apps building with those get to do the download.




> provide hadoop binary tarball without aws v2 sdk
> 
>
> Key: HADOOP-19083
> URL: https://issues.apache.org/jira/browse/HADOOP-19083
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> Have the default hadoop binary .tar.gz exclude the aws v2 sdk by default. 
> This SDK brings the total size of the distribution to about 1 GB.
> Proposed
> * add a profile to include the aws sdk in the dist module
> * document it for local building
> * for release builds, we modify our release ant builds to generate modified 
> x86 and arm64 releases without the file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19154) upgrade bouncy castle to 1.78.1 due to CVEs

2024-04-19 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19154:

Affects Version/s: 3.3.6
   3.4.0

> upgrade bouncy castle to 1.78.1 due to CVEs
> ---
>
> Key: HADOOP-19154
> URL: https://issues.apache.org/jira/browse/HADOOP-19154
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.4.0, 3.3.6
>Reporter: PJ Fanning
>Priority: Major
>
> [https://www.bouncycastle.org/releasenotes.html#r1rv78]
> There is a v1.78.1 release but no notes for it yet.
> For v1.78
> h3. 2.1.5 Security Advisories.
> Release 1.78 deals with the following CVEs:
>  * CVE-2024-29857 - Importing an EC certificate with specially crafted F2m 
> parameters can cause high CPU usage during parameter evaluation.
>  * CVE-2024-30171 - Possible timing based leakage in RSA based handshakes due 
> to exception processing eliminated.
>  * CVE-2024-30172 - Crafted signature and public key can be used to trigger 
> an infinite loop in the Ed25519 verification code.
>  * CVE-2024-301XX - When endpoint identification is enabled and an SSL socket 
> is not created with an explicit hostname (as happens with 
> HttpsURLConnection), hostname verification could be performed against a 
> DNS-resolved IP address. This has been fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19153) hadoop-common still exports logback as a transitive dependency

2024-04-17 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19153:
---

 Summary: hadoop-common still exports logback as a transitive 
dependency
 Key: HADOOP-19153
 URL: https://issues.apache.org/jira/browse/HADOOP-19153
 Project: Hadoop Common
  Issue Type: Bug
  Components: build, common
Affects Versions: 3.4.0
Reporter: Steve Loughran


Even though HADOOP-19084 set out to stop it, somehow ZK's declaration of a 
logback dependency is still contaminating the hadoop-common dependency graph, 
so causing problems downstream.





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19084) prune dependency exports of hadoop-* modules

2024-04-17 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17838241#comment-17838241
 ] 

Steve Loughran commented on HADOOP-19084:
-

logback is still being exported by hadoop-common via zk. 

> prune dependency exports of hadoop-* modules
> 
>
> Key: HADOOP-19084
> URL: https://issues.apache.org/jira/browse/HADOOP-19084
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.5.0, 3.4.1
>
>
> this is probably caused by HADOOP-18613:
> ZK is pulling in some extra transitive stuff which surfaces in applications 
> which import hadoop-common into their poms. It doesn't seem to show up in our 
> distro, but downstream you get warnings about duplicate logging stuff
> {code}
> |  +- org.apache.zookeeper:zookeeper:jar:3.8.3:compile
> |  |  +- org.apache.zookeeper:zookeeper-jute:jar:3.8.3:compile
> |  |  |  \- (org.apache.yetus:audience-annotations:jar:0.12.0:compile - 
> omitted for duplicate)
> |  |  +- org.apache.yetus:audience-annotations:jar:0.12.0:compile
> |  |  +- (io.netty:netty-handler:jar:4.1.94.Final:compile - omitted for 
> conflict with 4.1.100.Final)
> |  |  +- (io.netty:netty-transport-native-epoll:jar:4.1.94.Final:compile - 
> omitted for conflict with 4.1.100.Final)
> |  |  +- (org.slf4j:slf4j-api:jar:1.7.30:compile - omitted for duplicate)
> |  |  +- ch.qos.logback:logback-core:jar:1.2.10:compile
> |  |  +- ch.qos.logback:logback-classic:jar:1.2.10:compile
> |  |  |  +- (ch.qos.logback:logback-core:jar:1.2.10:compile - omitted for 
> duplicate)
> |  |  |  \- (org.slf4j:slf4j-api:jar:1.7.32:compile - omitted for conflict 
> with 1.7.30)
> |  |  \- (commons-io:commons-io:jar:2.11.0:compile - omitted for conflict 
> with 2.14.0)
> {code}
> proposed: exclude the zk dependencies we either override outselves or don't 
> need. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19150) Test ITestAbfsRestOperationException#testAuthFailException is broken.

2024-04-16 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837883#comment-17837883
 ] 

Steve Loughran commented on HADOOP-19150:
-

actually, it should be 

{code}
AbfsRestOperationException e = intercept(AbfsRestOperationException.class,
    () -> fs.getFileStatus(new Path("/")));
// followed by all the asserts on the exception
{code}


> Test ITestAbfsRestOperationException#testAuthFailException is broken. 
> --
>
> Key: HADOOP-19150
> URL: https://issues.apache.org/jira/browse/HADOOP-19150
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Mukund Thakur
>Priority: Major
>
> {code:java}
> intercept(Exception.class,
> () -> {
>   fs.getFileStatus(new Path("/"));
> }); {code}
> Intercept shouldn't be used as there are assertions in catch statements. 
>  
> CC [~ste...@apache.org]  [~anujmodi2021] [~asrani_anmol] 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19025) Migrate abstract contract tests to AssertJ

2024-04-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19025:

Affects Version/s: 3.4.0

> Migrate abstract contract tests to AssertJ
> --
>
> Key: HADOOP-19025
> URL: https://issues.apache.org/jira/browse/HADOOP-19025
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 3.4.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available
>
> Replace JUnit4 assertions with equivalent functionality from AssertJ, to make 
> contract tests more independent of JUnit version.
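A small before/after illustration of the kind of migration involved; the message and values are arbitrary:

{code:java}
// JUnit 4 style assertion used widely in the contract tests today.
assertEquals("wrong file length", expectedLen, status.getLen());

// AssertJ equivalent, independent of the JUnit version on the classpath.
Assertions.assertThat(status.getLen())
    .describedAs("wrong file length")
    .isEqualTo(expectedLen);
{code}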



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18296) Memory fragmentation in ChecksumFileSystem Vectored IO implementation.

2024-04-15 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17837385#comment-17837385
 ] 

Steve Loughran commented on HADOOP-18296:
-

Mukund, do we actually need to coalesce ranges on local fs reads? because it is 
all local. we can just push out a list of independent regions.

we do still need to deal with failures by adding the ability to return buffers 
to any pool on failure.

> Memory fragmentation in ChecksumFileSystem Vectored IO implementation.
> --
>
> Key: HADOOP-18296
> URL: https://issues.apache.org/jira/browse/HADOOP-18296
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Mukund Thakur
>Priority: Minor
>  Labels: fs
>
> As we have implemented merging of ranges in the ChecksumFSInputChecker 
> implementation of vectored IO api, it can lead to memory fragmentation. Let 
> me explain by example.
>  
> Suppose client requests for 3 ranges. 
> 0-500, 700-1000 and 1200-1500.
> Now because of merging, all the above ranges will get merged into one and we 
> will allocate a big byte buffer of 0-1500 size but return sliced byte buffers 
> for the desired ranges.
> Now once the client is done reading all the ranges, it will only be able to 
> free the memory for requested ranges and memory of the gaps will never be 
> released for eg here (500-700 and 1000-1200).
>  
> Note this only happens for direct byte buffers.
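A rough sketch of why the gaps are never reclaimed in the example above; the allocation and slicing shown are simplified relative to the real ChecksumFileSystem code:

{code:java}
// One direct buffer covers the merged range 0-1500 ...
ByteBuffer merged = ByteBuffer.allocateDirect(1500);

// ... but the caller only receives slices for the ranges it asked for.
ByteBuffer view = merged.duplicate();
view.position(0).limit(500);
ByteBuffer first = view.slice();   // 0-500; similar slices cover 700-1000, 1200-1500

// Releasing the slices back to a pool cannot reclaim the gap bytes
// (500-700, 1000-1200): they belong to the same direct allocation, which is
// only freed once every slice referencing it becomes unreachable.
{code}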



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19082) S3A: Update AWS SDK V2 to 2.24.6

2024-04-12 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17836675#comment-17836675
 ] 

Steve Loughran commented on HADOOP-19082:
-

FYI this SDK has an unshaded copy of org.slf4j.LoggerFactory in it, which is 
not what anyone wants.

> S3A: Update AWS SDK V2 to 2.24.6
> 
>
> Key: HADOOP-19082
> URL: https://issues.apache.org/jira/browse/HADOOP-19082
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Harshit Gupta
>Assignee: Harshit Gupta
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> Update the AWS SDK to 2.24.6 from 2.23.5 for latest updates in packaging 
> w.r.t. imds module.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19079) HttpExceptionUtils to check that loaded class is really an exception before instantiation

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19079.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
   Resolution: Fixed

> HttpExceptionUtils to check that loaded class is really an exception before 
> instantiation
> -
>
> Key: HADOOP-19079
> URL: https://issues.apache.org/jira/browse/HADOOP-19079
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common, security
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> It can be dangerous taking class names as inputs from HTTP messages even if 
> we control the source. Issue is in HttpExceptionUtils in hadoop-common 
> (validateResponse method).
> I can provide a PR that will highlight the issue.
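A minimal sketch of the kind of guard being proposed, assuming the class name and message have already been parsed out of the HTTP error response (the variable names here are illustrative):

{code:java}
// Only instantiate the named class if it really is an Exception subclass;
// anything else in the response is reported as a plain error instead.
Class<?> klass = Class.forName(exceptionClassName, false, classLoader);
if (!Exception.class.isAssignableFrom(klass)) {
  throw new IOException("Expected an Exception class but got " + exceptionClassName);
}
Exception toThrow = (Exception) klass.getConstructor(String.class)
    .newInstance(exceptionMessage);
{code}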



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19079) HttpExceptionUtils to check that loaded class is really an exception before instantiation

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19079:

Summary: HttpExceptionUtils to check that loaded class is really an 
exception before instantiation  (was: check that class that is loaded is really 
an exception)

> HttpExceptionUtils to check that loaded class is really an exception before 
> instantiation
> -
>
> Key: HADOOP-19079
> URL: https://issues.apache.org/jira/browse/HADOOP-19079
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common, security
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
>
> It can be dangerous taking class names as inputs from HTTP messages even if 
> we control the source. Issue is in HttpExceptionUtils in hadoop-common 
> (validateResponse method).
> I can provide a PR that will highlight the issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19096) [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19096.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic
> --
>
> Key: HADOOP-19096
> URL: https://issues.apache.org/jira/browse/HADOOP-19096
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.4.1
>Reporter: Anuj Modi
>Assignee: Anuj Modi
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> ABFS has a client-side throttling mechanism which works on the metrics 
> collected from past requests. If requests fail due to throttling at the 
> server, we update our metrics and the client-side backoff is calculated 
> based on those metrics.
> This PR enhances the logic for deciding which requests should be considered 
> when computing the client-side backoff interval.
> For each request made by the ABFS driver, we determine whether it should 
> contribute to client-side throttling based on the status code and result, as 
> follows (see the sketch after this list):
>  # Status code in 2xx range: Successful Operations should contribute.
>  # Status code in 3xx range: Redirection Operations should not contribute.
>  # Status code in 4xx range: User Errors should not contribute.
>  # Status code is 503: Throttling Error should contribute only if they are 
> due to client limits breach as follows:
>  ## 503, Ingress Over Account Limit: Should Contribute
>  ## 503, Egress Over Account Limit: Should Contribute
>  ## 503, TPS Over Account Limit: Should Contribute
>  ## 503, Other Server Throttling: Should not Contribute.
>  # Status code in 5xx range other than 503: Should not Contribute.
>  # IOException and UnknownHostExceptions: Should not Contribute.
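A condensed sketch of the decision table above; the method shape and the 503 message matching are illustrative rather than the exact strings the service returns:

{code:java}
// Sketch: should this completed request feed the client-side throttling metrics?
static boolean shouldContribute(int statusCode, String serverErrorMessage) {
  if (statusCode >= 200 && statusCode < 300) {
    return true;                       // successful operations contribute
  }
  if (statusCode == 503 && serverErrorMessage != null) {
    // only throttling caused by breaching the account's own limits contributes;
    // the substrings below are illustrative, not the exact service messages
    return serverErrorMessage.contains("Ingress is over the account limit")
        || serverErrorMessage.contains("Egress is over the account limit")
        || serverErrorMessage.contains("Operations per second is over the account limit");
  }
  return false;                        // 3xx, 4xx, other 5xx, IOExceptions: ignored
}
{code}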



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19105) S3A: Recover from Vector IO read failures

2024-04-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19105:

Environment: 
s3a vector IO doesn't try to recover from read failures the way read() does.

Need to
* abort HTTP stream if considered needed
* retry active read which failed
* but not those which had succeeded

On a full failure we need to do something about any allocated buffer, which 
means we really need the buffer pool {{ByteBufferPool}} to return or also 
provide a "release" (Bytebuffer -> void) call which does the return.  we would 
need to
* add this as a new api with the implementations in s3a, local, rawlocal
* classic single allocator method remaps to the new one with (() -> null) as 
the response

This keeps the public API stable



  was:
s3a vector IO doesn't try to recover from read failures the way read() does.

Need to
* abort HTTP stream if considered needed
* retry active read which failed
* but not those which had succeeded




> S3A: Recover from Vector IO read failures
> -
>
> Key: HADOOP-19105
> URL: https://issues.apache.org/jira/browse/HADOOP-19105
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0, 3.3.6
> Environment: s3a vector IO doesn't try to recover from read failures 
> the way read() does.
> Need to
> * abort HTTP stream if considered needed
> * retry active read which failed
> * but not those which had succeeded
> On a full failure we need to do something about any allocated buffer, which 
> means we really need the buffer pool {{ByteBufferPool}} to return or also 
> provide a "release" (Bytebuffer -> void) call which does the return.  we 
> would need to
> * add this as a new api with the implementations in s3a, local, rawlocal
> * classic single allocator method remaps to the new one with (() -> null) as 
> the response
> This keeps the public API stable
>Reporter: Steve Loughran
>Priority: Major
>
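A rough sketch of how such a release hook could look; the interface name and adapter are assumptions about the shape of the proposal, not the final API:

{code:java}
// Sketch only: pair the existing allocate function with a release callback so
// a failed vectored read can hand already-allocated buffers back to the pool.
public interface VectoredBufferPool {              // hypothetical name
  ByteBuffer allocate(int capacity);
  void release(ByteBuffer buffer);                 // the proposed (ByteBuffer -> void) call

  // the classic single-allocator entry point remaps to the new one with a
  // no-op release, keeping the existing public readVectored() signature stable
  static VectoredBufferPool adapt(IntFunction<ByteBuffer> allocate) {
    return new VectoredBufferPool() {
      @Override
      public ByteBuffer allocate(int capacity) {
        return allocate.apply(capacity);
      }
      @Override
      public void release(ByteBuffer buffer) {
        // nothing to return for legacy callers
      }
    };
  }
}
{code}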




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19101) Vectored Read into off-heap buffer broken in fallback implementation

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19101:

Release Note: 
PositionedReadable.readVectored() will read incorrect data when reading from 
hdfs, azure abfs and other stores when given a direct buffer allocator. 

For cross-version compatibility, use on-heap buffer allocators only

> Vectored Read into off-heap buffer broken in fallback implementation
> 
>
> Key: HADOOP-19101
> URL: https://issues.apache.org/jira/browse/HADOOP-19101
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> {{VectoredReadUtils.readInDirectBuffer()}} always starts off reading at 
> position zero even when the range is at a different offset. As a result, you 
> can get incorrect data back.
> The fix is straightforward: we pass in a FileRange and use its offset 
> as the starting position.
> However, this does mean that all shipping releases 3.3.5-3.4.0 cannot safely 
> read vectorIO into direct buffers through HDFS, ABFS or GCS. Note that we 
> have never seen this in production because the parquet and ORC libraries both 
> read into on-heap storage.
> Those libraries need to be audited to make sure that they never attempt to 
> read into off-heap DirectBuffers. This is a bit trickier than you would think 
> because an allocator is passed in. For PARQUET-2171 we will 
> * only invoke the API on streams which explicitly declare their support for 
> the API (so fallback in parquet itself)
> * not invoke when direct buffer allocation is in use.
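A minimal sketch of the corrected fallback path described above; the method shape is illustrative, not the exact VectoredReadUtils signature:

{code:java}
// Sketch: fill a direct buffer for one FileRange via plain positioned reads.
static void readInDirectBuffer(PositionedReadable stream, FileRange range,
    ByteBuffer direct) throws IOException {
  byte[] chunk = new byte[Math.min(range.getLength(), 64 * 1024)];
  long position = range.getOffset();   // the bug: earlier releases started at 0
  int remaining = range.getLength();
  while (remaining > 0) {
    int toRead = Math.min(remaining, chunk.length);
    stream.readFully(position, chunk, 0, toRead);
    direct.put(chunk, 0, toRead);
    position += toRead;
    remaining -= toRead;
  }
  direct.flip();
}
{code}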



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19101) Vectored Read into off-heap buffer broken in fallback implementation

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19101.
-
Fix Version/s: 3.3.9
   3.4.1
   Resolution: Fixed

> Vectored Read into off-heap buffer broken in fallback implementation
> 
>
> Key: HADOOP-19101
> URL: https://issues.apache.org/jira/browse/HADOOP-19101
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> {{VectoredReadUtils.readInDirectBuffer()}} always starts off reading at 
> position zero even when the range is at a different offset. As a result, you 
> can get incorrect data back.
> The fix is straightforward: we pass in a FileRange and use its offset 
> as the starting position.
> However, this does mean that all shipping releases 3.3.5-3.4.0 cannot safely 
> read vectorIO into direct buffers through HDFS, ABFS or GCS. Note that we 
> have never seen this in production because the parquet and ORC libraries both 
> read into on-heap storage.
> Those libraries need to be audited to make sure that they never attempt to 
> read into off-heap DirectBuffers. This is a bit trickier than you would think 
> because an allocator is passed in. For PARQUET-2171 we will 
> * only invoke the API on streams which explicitly declare their support for 
> the API (so fallback in parquet itself)
> * not invoke when direct buffer allocation is in use.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19098.
-
Resolution: Fixed

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed
> FS spec to say 
> * "overlap triggers IllegalArgumentException". 
> * special case: 0 byte ranges may be short circuited to return empty buffer 
> even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases
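A small sketch of the validation the spec change calls for; the helper is illustrative of what the shared validation code does, not a copy of it:

{code:java}
// Reject overlapping ranges up front with IllegalArgumentException.
static void validateNonOverlapping(List<? extends FileRange> ranges) {
  List<FileRange> sorted = new ArrayList<>(ranges);
  sorted.sort(Comparator.comparingLong(FileRange::getOffset));
  for (int i = 1; i < sorted.size(); i++) {
    FileRange prev = sorted.get(i - 1);
    FileRange next = sorted.get(i);
    if (prev.getOffset() + prev.getLength() > next.getOffset()) {
      throw new IllegalArgumentException(
          "Overlapping ranges: " + prev + " and " + next);
    }
  }
}
{code}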



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19098:

Fix Version/s: 3.3.9

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed
> FS spec to say 
> * "overlap triggers IllegalArgumentException". 
> * special case: 0 byte ranges may be short circuited to return empty buffer 
> even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16822) Provide source artifacts for hadoop-client-api

2024-04-10 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-16822:

Description: 
h5. Improvement request
The third-party libraries shading hadoop-client-api (& hadoop-client-runtime) 
artifacts are super useful.
 
Having uber source jar for hadoop-client-api (maybe even hadoop-client-runtime) 
would be great for downstream development & debugging purposes.

Are there any obstacles or objections against providing fat jar with all the 
hadoop client api as well ?

h5. Dev links
- *maven-shaded-plugin* and its *shadeSourcesContent* attribute
- 
https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#shadeSourcesContent

h2. Update April 2024: this has been reverted.

It turns out that it complicates debugging. If you want the source when 
debugging, the best way is just to check out the hadoop release you are working 
with and point your IDE at it.

  was:
h5. Improvement request
The third-party libraries shading hadoop-client-api (& hadoop-client-runtime) 
artifacts are super useful.
 
Having uber source jar for hadoop-client-api (maybe even hadoop-client-runtime) 
would be great for downstream development & debugging purposes.

Are there any obstacles or objections against providing fat jar with all the 
hadoop client api as well ?

h5. Dev links
- *maven-shaded-plugin* and its *shadeSourcesContent* attribute
- 
https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#shadeSourcesContent


> Provide source artifacts for hadoop-client-api
> --
>
> Key: HADOOP-16822
> URL: https://issues.apache.org/jira/browse/HADOOP-16822
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.3.1, 3.4.0, 3.2.3
>Reporter: Karel Kolman
>Assignee: Karel Kolman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HADOOP-16822-hadoop-client-api-source-jar.patch
>
>
> h5. Improvement request
> The third-party libraries shading hadoop-client-api (& hadoop-client-runtime) 
> artifacts are super useful.
>  
> Having uber source jar for hadoop-client-api (maybe even 
> hadoop-client-runtime) would be great for downstream development & debugging 
> purposes.
> Are there any obstacles or objections against providing fat jar with all the 
> hadoop client api as well ?
> h5. Dev links
> - *maven-shaded-plugin* and its *shadeSourcesContent* attribute
> - 
> https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#shadeSourcesContent
> h2. Update April 2024: this has been reverted.
> It turns out that it complicates debugging. If you want the source when 
> debugging, the best way is just to check out the hadoop release you are 
> working with and point your IDE at it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19119) spotbugs complaining about possible NPE in org.apache.hadoop.crypto.key.kms.ValueQueue.getSize()

2024-04-05 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19119:

Fix Version/s: 3.3.9

> spotbugs complaining about possible NPE in 
> org.apache.hadoop.crypto.key.kms.ValueQueue.getSize()
> 
>
> Key: HADOOP-19119
> URL: https://issues.apache.org/jira/browse/HADOOP-19119
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: crypto
>Affects Versions: 3.5.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> PRs against hadoop-common are reporting spotbugs problems
> {code}
> Dodgy code Warnings
> Code  Warning
> NPPossible null pointer dereference in 
> org.apache.hadoop.crypto.key.kms.ValueQueue.getSize(String) due to return 
> value of called method
> Bug type NP_NULL_ON_SOME_PATH_FROM_RETURN_VALUE (click for details)
> In class org.apache.hadoop.crypto.key.kms.ValueQueue
> In method org.apache.hadoop.crypto.key.kms.ValueQueue.getSize(String)
> Local variable stored in JVM register ?
> Dereferenced at ValueQueue.java:[line 332]
> Known null at ValueQueue.java:[line 332]
> {code}
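The usual shape of the fix for this spotbugs pattern, sketched with hypothetical field and type names since the real getSize() body isn't shown above:

{code:java}
// Sketch: never dereference a cache lookup that may return null.
public int getSize(String keyName) {
  Queue<?> queue = keyQueues.getIfPresent(keyName);   // hypothetical Guava cache field
  return queue == null ? 0 : queue.size();
}
{code}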



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19119) spotbugs complaining about possible NPE in org.apache.hadoop.crypto.key.kms.ValueQueue.getSize()

2024-04-05 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19119:

Affects Version/s: 3.4.0
   3.3.9

> spotbugs complaining about possible NPE in 
> org.apache.hadoop.crypto.key.kms.ValueQueue.getSize()
> 
>
> Key: HADOOP-19119
> URL: https://issues.apache.org/jira/browse/HADOOP-19119
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: crypto
>Affects Versions: 3.4.0, 3.3.9, 3.5.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> PRs against hadoop-common are reporting spotbugs problems
> {code}
> Dodgy code Warnings
> Code  Warning
> NPPossible null pointer dereference in 
> org.apache.hadoop.crypto.key.kms.ValueQueue.getSize(String) due to return 
> value of called method
> Bug type NP_NULL_ON_SOME_PATH_FROM_RETURN_VALUE (click for details)
> In class org.apache.hadoop.crypto.key.kms.ValueQueue
> In method org.apache.hadoop.crypto.key.kms.ValueQueue.getSize(String)
> Local variable stored in JVM register ?
> Dereferenced at ValueQueue.java:[line 332]
> Known null at ValueQueue.java:[line 332]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-05 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19098:

Fix Version/s: 3.4.1

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed
> FS spec to say 
> * "overlap triggers IllegalArgumentException". 
> * special case: 0 byte ranges may be short circuited to return empty buffer 
> even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18656) ABFS: Support for Pagination in Recursive Directory Delete

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18656:

Fix Version/s: 3.5.0

> ABFS: Support for Pagination in Recursive Directory Delete 
> ---
>
> Key: HADOOP-18656
> URL: https://issues.apache.org/jira/browse/HADOOP-18656
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.5
>Reporter: Sree Bhattacharyya
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Today, when a recursive delete is issued for a large directory in an ADLS Gen2 
> (HNS) account, the directory deletion itself happens in O(1), but in the backend 
> ACL checks are done recursively for each object inside that directory, which for 
> a large directory can lead to a request timeout. Pagination has been 
> introduced in the Azure Storage backend for these ACL checks.
> More information on how pagination works can be found on public documentation 
> of [Azure Delete Path 
> API|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/delete?view=rest-storageservices-datalakestoragegen2-2019-12-12].
> This PR contains changes to support this from client side. To trigger 
> pagination, client needs to add a new query parameter "paginated" and set it 
> to true along with recursive set to true. In return if the directory is 
> large, server might return a continuation token back to the caller. If caller 
> gets back a continuation token, it has to call the delete API again with 
> continuation token along with recursive and pagination set to true. This is 
> similar to directory delete of FNS account.
> Pagination is available only in versions "2023-08-03" onwards.
> PR also contains functional tests to verify driver works well with different 
> combinations of recursive and pagination features for HNS.
> Full E2E testing of pagination requires large dataset to be created and hence 
> not added as part of driver test suite. But extensive E2E testing has been 
> performed.
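A rough sketch of the client-side loop described above; the delete helper is hypothetical and only the continuation-token flow is the point:

{code:java}
// Sketch: keep re-issuing the paginated recursive delete until the service
// stops returning a continuation token.
String continuation = null;
do {
  // hypothetical helper: sends DELETE with recursive=true and paginated=true,
  // passing the previous round's continuation token when there is one
  AbfsRestOperation op = deletePathPaginated(path, continuation);
  continuation = op.getResult().getResponseHeader("x-ms-continuation");
} while (continuation != null && !continuation.isEmpty());
{code}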



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18656) ABFS: Support for Pagination in Recursive Directory Delete

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18656:

Target Version/s: 3.3.9, 3.5.0, 3.4.1  (was: 3.3.9, 3.5.0)

> ABFS: Support for Pagination in Recursive Directory Delete 
> ---
>
> Key: HADOOP-18656
> URL: https://issues.apache.org/jira/browse/HADOOP-18656
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.3.5
>Reporter: Sree Bhattacharyya
>Assignee: Anuj Modi
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Today, when a recursive delete is issued for a large directory in an ADLS Gen2 
> (HNS) account, the directory deletion itself happens in O(1), but in the backend 
> ACL checks are done recursively for each object inside that directory, which for 
> a large directory can lead to a request timeout. Pagination has been 
> introduced in the Azure Storage backend for these ACL checks.
> More information on how pagination works can be found on public documentation 
> of [Azure Delete Path 
> API|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/delete?view=rest-storageservices-datalakestoragegen2-2019-12-12].
> This PR contains changes to support this from client side. To trigger 
> pagination, client needs to add a new query parameter "paginated" and set it 
> to true along with recursive set to true. In return if the directory is 
> large, server might return a continuation token back to the caller. If caller 
> gets back a continuation token, it has to call the delete API again with 
> continuation token along with recursive and pagination set to true. This is 
> similar to directory delete of FNS account.
> Pagination is available only in versions "2023-08-03" onwards.
> PR also contains functional tests to verify driver works well with different 
> combinations of recursive and pagination features for HNS.
> Full E2E testing of pagination requires large dataset to be created and hence 
> not added as part of driver test suite. But extensive E2E testing has been 
> performed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19141) Update VectorIO default values consistently

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19141:

Fix Version/s: 3.5.0

> Update VectorIO default values consistently
> ---
>
> Key: HADOOP-19141
> URL: https://issues.apache.org/jira/browse/HADOOP-19141
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.4.1
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.7, 3.5.0, 3.4.1
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18855) VectorIO API tuning/stabilization

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18855:

Description: 
Changes needed to get the Vector IO code stable.

Specifically
* consistent behaviour across implementations
* broader testing
* resilience

+Ideally, abfs support. (s3a prefetching needs this too; see HADOOP-19144)

This work will be shaped by the experience of merging support into libraries 
and identifying issues/improvement opportunities

  was:
Changes needed to get the Vector IO code stable.

Specifically
* consistent behaviour across implementations
* broader testing
* resilience

+Ideally, abfs support. (s3a prefetching needs this too; see HADOOP-19144)


> VectorIO API tuning/stabilization
> -
>
> Key: HADOOP-18855
> URL: https://issues.apache.org/jira/browse/HADOOP-18855
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3
>Affects Versions: 3.4.0, 3.3.9
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> Changes needed to get the Vector IO code stable.
> Specifically
> * consistent behaviour across implementations
> * broader testing
> * resilience
> +Ideally, abfs support. (s3a prefetching needs this too; see HADOOP-19144)
> This work will be shaped by the experience of merging support into libraries 
> and identifying issues/improvement opportunities



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18855) VectorIO API tuning/stabilization

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18855:

Description: 
Changes needed to get the Vector IO code stable.

Specifically
* consistent behaviour across implementations
* broader testing
* resilience

+Ideally, abfs support. (s3a prefetching needs this too; see HADOOP-19144)

  was:
What do we need to do to improve the vector IO experience, based on 
integration and use?

Obviously, we cannot change anything incompatibly, but we may find bugs to fix 
and other possible enhancements.


> VectorIO API tuning/stabilization
> -
>
> Key: HADOOP-18855
> URL: https://issues.apache.org/jira/browse/HADOOP-18855
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs, fs/s3
>Affects Versions: 3.4.0, 3.3.9
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> Changes needed to get the Vector IO code stable.
> Specifically
> * consistent behaviour across implementations
> * broader testing
> * resilience
> +Ideally, abfs support. (s3a prefetching needs this too; see HADOOP-19144)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19144) S3A prefetching to support Vector IO

2024-04-04 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19144:
---

 Summary: S3A prefetching to support Vector IO
 Key: HADOOP-19144
 URL: https://issues.apache.org/jira/browse/HADOOP-19144
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


Add explicit support for vector IO in the s3a prefetching stream.

* if a range is within 1+ cached blocks, it SHALL be read from cache and returned
* if a range is not in cache: TBD
* if a range is partially in cache: TBD

These are the same decisions that abfs has to make: should the client 
fetch/cache the block, or just issue one or more GET requests?

A big issue is: does caching of data fetched in a range request make any sense 
at all? Or, more specifically: does fetching the blocks in which range requests 
fall make sense?

Simply going to the store is a lot simpler.
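
A rough sketch of that decision logic, with the TBD branches left as placeholders; {{blockCache}}, {{containsFully}} and {{readDirect}} are illustrative names, not the prefetching stream's actual internals:

{code:java}
// Serve fully cached ranges from the block cache; everything else is TBD.
public void readVectored(List<? extends FileRange> ranges,
    IntFunction<ByteBuffer> allocate) {
  for (FileRange range : ranges) {
    if (blockCache.containsFully(range.getOffset(), range.getLength())) {
      // range lies entirely within one or more cached blocks: read from cache
      range.setData(CompletableFuture.completedFuture(
          blockCache.read(range.getOffset(), range.getLength(), allocate)));
    } else {
      // TBD: not cached, or only partially cached.
      // Either fetch and cache the enclosing blocks, or issue direct GETs.
      range.setData(readDirect(range, allocate));
    }
  }
}
{code}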



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19101) Vectored Read into off-heap buffer broken in fallback implementation

2024-04-04 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19101:

Fix Version/s: 3.5.0

> Vectored Read into off-heap buffer broken in fallback implementation
> 
>
> Key: HADOOP-19101
> URL: https://issues.apache.org/jira/browse/HADOOP-19101
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 3.5.0
>
>
> {{VectoredReadUtils.readInDirectBuffer()}} always starts off reading at 
> position zero, even when the range is at a different offset. As a result, you 
> can get incorrect data.
> The fix for this is straightforward: we pass in a FileRange and use its offset 
> as the starting position.
> However, this does mean that all shipping releases 3.3.5-3.4.0 cannot safely 
> read vector IO into direct buffers through HDFS, ABFS or GCS. Note that we 
> have never seen this in production because the parquet and ORC libraries both 
> read into on-heap storage.
> Those libraries need to be audited to make sure that they never attempt to 
> read into off-heap DirectBuffers. This is a bit trickier than you would think 
> because an allocator is passed in. For PARQUET-2171 we will 
> * only invoke the API on streams which explicitly declare their support for 
> the API (so fall back within parquet itself)
> * not invoke it when direct buffer allocation is in use.
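
A minimal sketch of the shape of the fix in the fallback path, not the actual {{VectoredReadUtils}} code: the read starts at the range's own offset rather than position zero, then the bytes are copied into the caller's direct buffer.

{code:java}
// Sketch only: read one FileRange into a direct buffer via an on-heap array.
static void readRangeIntoDirectBuffer(PositionedReadable stream,
    FileRange range, ByteBuffer direct) throws IOException {
  byte[] tmp = new byte[range.getLength()];
  // the essential fix: start reading at range.getOffset(), not at zero
  stream.readFully(range.getOffset(), tmp, 0, tmp.length);
  direct.put(tmp);
  direct.flip();
}
{code}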



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19140) [ABFS, S3A] Add IORateLimiter api to hadoop common

2024-04-03 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19140:
---

 Summary: [ABFS, S3A] Add IORateLimiter api to hadoop common
 Key: HADOOP-19140
 URL: https://issues.apache.org/jira/browse/HADOOP-19140
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/azure, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Steve Loughran


Create a rate limiter API in hadoop common from which code (initially the 
manifest committer and bulk delete) can request IO capacity for a specific 
operation.

This can be exported by filesystems to support shared rate limiting across all 
threads.

Pulled from the HADOOP-19093 PR.
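
As an illustration only (the real interface is whatever the HADOOP-19093 PR defines), the idea is roughly:

{code:java}
import java.time.Duration;

// Hypothetical sketch of the idea, not the committed API.
public interface IORateLimiter {
  /**
   * Request IO capacity for a named operation (e.g. "delete", "rename").
   * Blocks until capacity is available.
   * @return how long the caller waited for capacity.
   */
  Duration acquireIOCapacity(String operation, int requestedCapacity);
}
{code}

A filesystem instance could then hand the same limiter to every worker thread, so a bulk delete and the manifest committer share one capacity budget rather than each throttling independently.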



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19124) Update org.ehcache from 3.3.1 to 3.8.2.

2024-04-03 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833640#comment-17833640
 ] 

Steve Loughran commented on HADOOP-19124:
-

should we include this in branch-3.4? I'd like to bring things up to date there

> Update org.ehcache from 3.3.1 to 3.8.2.
> ---
>
> Key: HADOOP-19124
> URL: https://issues.apache.org/jira/browse/HADOOP-19124
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Affects Versions: 3.4.1
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> We need to enhance the caching functionality in Yarn Federation by adding a 
> limit on the number of cached entries. I noticed that the version of 
> org.ehcache is relatively old and requires an upgrade.
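
For reference, a hedged example of capping entry counts with the ehcache 3.x builder API; the cache name and key/value types here are placeholders, not the actual Yarn Federation cache:

{code:java}
import org.ehcache.Cache;
import org.ehcache.CacheManager;
import org.ehcache.config.builders.CacheConfigurationBuilder;
import org.ehcache.config.builders.CacheManagerBuilder;
import org.ehcache.config.builders.ResourcePoolsBuilder;

// Cap the cache at 1000 on-heap entries.
CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
    .withCache("federationCache",
        CacheConfigurationBuilder.newCacheConfigurationBuilder(
            String.class, String.class, ResourcePoolsBuilder.heap(1000)))
    .build(true);
Cache<String, String> cache =
    cacheManager.getCache("federationCache", String.class, String.class);
{code}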



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19114) upgrade to commons-compress 1.26.1 due to cves

2024-04-03 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19114:

Description: 
2 recent CVEs fixed - 
https://mvnrepository.com/artifact/org.apache.commons/commons-compress


Important: Denial of Service CVE-2024-25710
Moderate: Denial of Service CVE-2024-26308



  was:2 recent CVEs fixed - 
https://mvnrepository.com/artifact/org.apache.commons/commons-compress


> upgrade to commons-compress 1.26.1 due to cves
> --
>
> Key: HADOOP-19114
> URL: https://issues.apache.org/jira/browse/HADOOP-19114
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, CVE
>Affects Versions: 3.4.0
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
>
> 2 recent CVEs fixed - 
> https://mvnrepository.com/artifact/org.apache.commons/commons-compress
> Important: Denial of Service CVE-2024-25710
> Moderate: Denial of Service CVE-2024-26308



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-02 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17833295#comment-17833295
 ] 

Steve Loughran commented on HADOOP-19098:
-

fixed in 3.5; will backport to 3.4 and 3.3

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed
> FS spec to say 
> * "overlap triggers IllegalArgumentException". 
> * special case: 0 byte ranges may be short circuited to return empty buffer 
> even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases
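
A small sketch of what the common helper could look like, assuming ranges are compared by offset and length; this is not the actual helper, just the proposed semantics:

{code:java}
// Sort ranges by offset and reject any overlap with IllegalArgumentException.
// Zero-byte ranges could be short-circuited separately to return empty buffers.
static List<FileRange> validateNonOverlapping(List<? extends FileRange> input) {
  List<FileRange> sorted = new ArrayList<>(input);
  sorted.sort(Comparator.comparingLong(FileRange::getOffset));
  for (int i = 1; i < sorted.size(); i++) {
    FileRange prev = sorted.get(i - 1);
    FileRange next = sorted.get(i);
    if (next.getOffset() < prev.getOffset() + prev.getLength()) {
      throw new IllegalArgumentException(
          "Overlapping ranges: " + prev + " and " + next);
    }
  }
  return sorted;
}
{code}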



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges

2024-04-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19098:

Fix Version/s: 3.5.0

> Vector IO: consistent specified rejection of overlapping ranges
> ---
>
> Key: HADOOP-19098
> URL: https://issues.apache.org/jira/browse/HADOOP-19098
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> Related to PARQUET-2171 q: "how do you deal with overlapping ranges?"
> I believe s3a rejects this, but the other impls may not.
> Proposed
> FS spec to say 
> * "overlap triggers IllegalArgumentException". 
> * special case: 0 byte ranges may be short circuited to return empty buffer 
> even without checking file length etc.
> Contract tests to validate this
> (+ common helper code to do this).
> I'll copy the validation stuff into the parquet PR for consistency with older 
> releases



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-19115) upgrade to nimbus-jose-jwt 9.37.2 due to CVE

2024-04-02 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-19115.
-
Fix Version/s: 3.3.9
   3.5.0
   3.4.1
 Assignee: PJ Fanning
   Resolution: Fixed

> upgrade to nimbus-jose-jwt 9.37.2 due to CVE
> 
>
> Key: HADOOP-19115
> URL: https://issues.apache.org/jira/browse/HADOOP-19115
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, CVE
>Affects Versions: 3.4.0, 3.5.0
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.9, 3.5.0, 3.4.1
>
>
> https://github.com/advisories/GHSA-gvpg-vgmx-xg6w



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-19133) "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI property

2024-04-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832940#comment-17832940
 ] 

Steve Loughran commented on HADOOP-19133:
-

thanks. Looks like "removeBaseAndBucketOverrides" should be cleverer about 
handling an undefined bucket binding, and/or the test should set things up better

> "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI 
> property
> ---
>
> Key: HADOOP-19133
> URL: https://issues.apache.org/jira/browse/HADOOP-19133
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3, test
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Attila Doroszlai
>Priority: Minor
>
> ITestS3AContractVectoredRead fails with {{NullPointerException: No test 
> bucket}} if the test bucket is defined as {{-Dtest.fs.s3a.name=...}} via the 
> CLI, not in {{auth-keys.xml}}. The same setup works for other S3A contract 
> tests. Tested on 3.3.6.
> {code:title=src/test/resources/auth-keys.xml}
> <configuration>
>   <property>
>     <name>fs.s3a.endpoint</name>
>     <value>${test.fs.s3a.endpoint}</value>
>   </property>
>   <property>
>     <name>fs.contract.test.fs.s3a</name>
>     <value>${test.fs.s3a.name}</value>
>   </property>
> </configuration>
> {code}
> {code}
> export AWS_ACCESS_KEY_ID=''
> export AWS_SECRET_KEY=''
> mvn -Dtest=ITestS3AContractVectoredRead -Dtest.fs.s3a.name="s3a://mybucket" 
> -Dtest.fs.s3a.endpoint="http://localhost:9878/" clean test
> {code}
> {code:title=test results}
> Tests run: 46, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 7.879 s <<< 
> FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead
> testMinSeekAndMaxSizeDefaultValues[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 1.95 s  <<< ERROR!
> java.lang.NullPointerException: No test bucket
>   at org.apache.hadoop.util.Preconditions.checkNotNull(Preconditions.java:88)
>   at 
> org.apache.hadoop.fs.s3a.S3ATestUtils.getTestBucketName(S3ATestUtils.java:714)
>   at 
> org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides(S3ATestUtils.java:775)
>   at 
> org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead.testMinSeekAndMaxSizeDefaultValues(ITestS3AContractVectoredRead.java:104)
>   ...
> testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.176 s  <<< ERROR!
> testMultiVectoredReadStatsCollection[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.179 s  <<< ERROR!
> testNormalReadVsVectoredReadStatsCollection[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.155 s  <<< ERROR!
> testMinSeekAndMaxSizeDefaultValues[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.116 s  <<< ERROR!
> testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.102 s  <<< ERROR!
> testMultiVectoredReadStatsCollection[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.105 s  <<< ERROR!
> testNormalReadVsVectoredReadStatsCollection[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.107 s  <<< ERROR!
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19133) "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI property

2024-04-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19133:

Affects Version/s: 3.3.6
   3.4.0

> "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI 
> property
> ---
>
> Key: HADOOP-19133
> URL: https://issues.apache.org/jira/browse/HADOOP-19133
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3, test
>Affects Versions: 3.4.0, 3.3.6
>Reporter: Attila Doroszlai
>Priority: Minor
>
> ITestS3AContractVectoredRead fails with {{NullPointerException: No test 
> bucket}} if the test bucket is defined as {{-Dtest.fs.s3a.name=...}} via the 
> CLI, not in {{auth-keys.xml}}. The same setup works for other S3A contract 
> tests. Tested on 3.3.6.
> {code:title=src/test/resources/auth-keys.xml}
> <configuration>
>   <property>
>     <name>fs.s3a.endpoint</name>
>     <value>${test.fs.s3a.endpoint}</value>
>   </property>
>   <property>
>     <name>fs.contract.test.fs.s3a</name>
>     <value>${test.fs.s3a.name}</value>
>   </property>
> </configuration>
> {code}
> {code}
> export AWS_ACCESS_KEY_ID=''
> export AWS_SECRET_KEY=''
> mvn -Dtest=ITestS3AContractVectoredRead -Dtest.fs.s3a.name="s3a://mybucket" 
> -Dtest.fs.s3a.endpoint="http://localhost:9878/" clean test
> {code}
> {code:title=test results}
> Tests run: 46, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 7.879 s <<< 
> FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead
> testMinSeekAndMaxSizeDefaultValues[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 1.95 s  <<< ERROR!
> java.lang.NullPointerException: No test bucket
>   at org.apache.hadoop.util.Preconditions.checkNotNull(Preconditions.java:88)
>   at 
> org.apache.hadoop.fs.s3a.S3ATestUtils.getTestBucketName(S3ATestUtils.java:714)
>   at 
> org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides(S3ATestUtils.java:775)
>   at 
> org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead.testMinSeekAndMaxSizeDefaultValues(ITestS3AContractVectoredRead.java:104)
>   ...
> testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.176 s  <<< ERROR!
> testMultiVectoredReadStatsCollection[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.179 s  <<< ERROR!
> testNormalReadVsVectoredReadStatsCollection[Buffer type : 
> direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.155 s  <<< ERROR!
> testMinSeekAndMaxSizeDefaultValues[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.116 s  <<< ERROR!
> testMinSeekAndMaxSizeConfigsPropagation[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.102 s  <<< ERROR!
> testMultiVectoredReadStatsCollection[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.105 s  <<< ERROR!
> testNormalReadVsVectoredReadStatsCollection[Buffer type : 
> array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead)  Time 
> elapsed: 0.107 s  <<< ERROR!
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19131) Assist reflection IO with WrappedOperations class

2024-03-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19131:

Summary: Assist reflection IO with WrappedOperations class  (was: Assist 
reflection iO with WrappedOperations class)

> Assist reflection IO with WrappedOperations class
> -
>
> Key: HADOOP-19131
> URL: https://issues.apache.org/jira/browse/HADOOP-19131
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> parquet, avro etc are still stuck building with older hadoop releases. 
> This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 
> years old (!) such as HADOOP-15229 just aren't picked up.
> This lack of openFile() adoption hurts working with files in cloud storage as
> * extra HEAD requests are made
> * read policies can't be explicitly set
> * split start/end can't be passed down
> Proposed
> # create class org.apache.hadoop.io.WrappedOperations
> # add methods to wrap the APIs
> # test in contract tests via reflection loading, to verify we have done it 
> properly.
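
For context, a hedged example of the kind of {{openFile()}} call such a wrapper would let Parquet/Avro reach via reflection (option keys are from the openFile specification; the wrapper class itself does not exist yet):

{code:java}
// Declare read policy, file length and split range up front so the store can
// skip the extra HEAD request and choose a suitable read strategy.
CompletableFuture<FSDataInputStream> future = fs.openFile(path)
    .opt("fs.option.openfile.read.policy", "random")
    .opt("fs.option.openfile.length", Long.toString(fileLength))
    .opt("fs.option.openfile.split.start", Long.toString(splitStart))
    .opt("fs.option.openfile.split.end", Long.toString(splitEnd))
    .build();
FSDataInputStream in = future.get();
{code}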



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19131) Assist reflection IO with WrappedOperations class

2024-03-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-19131:

Description: 
parquet, avro etc are still stuck building with older hadoop releases. 

This makes using new APIs hard (PARQUET-2171) and means that APIs which are 5 
years old such as HADOOP-15229 just aren't picked up.

This lack of openFile() adoption hurts working with files in cloud storage as
* extra HEAD requests are made
* read policies can't be explicitly set
* split start/end can't be passed down

Proposed
# create class org.apache.hadoop.io.WrappedOperations
# add methods to wrap the APIs
# test in contract tests via reflection loading, to verify we have done it 
properly.

  was:
parquet, avro etc are still stuck building with older hadoop releases. 

This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 
years old (!) such as HADOOP-15229 just aren't picked up.

This lack of openFile() adoption hurts working with files in cloud storage as
* extra HEAD requests are made
* read policies can't be explicitly set
* split start/end can't be passed down

Proposed
# create class org.apache.hadoop.io.WrappedOperations
# add methods to wrap the APIs
# test in contract tests via reflection loading, to verify we have done it 
properly.


> Assist reflection IO with WrappedOperations class
> -
>
> Key: HADOOP-19131
> URL: https://issues.apache.org/jira/browse/HADOOP-19131
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> parquet, avro etc are still stuck building with older hadoop releases. 
> This makes using new APIs hard (PARQUET-2171) and means that APIs which are 5 
> years old such as HADOOP-15229 just aren't picked up.
> This lack of openFile() adoption hurts working with files in cloud storage as
> * extra HEAD requests are made
> * read policies can't be explicitly set
> * split start/end can't be passed down
> Proposed
> # create class org.apache.hadoop.io.WrappedOperations
> # add methods to wrap the APIs
> # test in contract tests via reflection loading, to verify we have done it 
> properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19131) Assist reflection iO with WrappedOperations class

2024-03-28 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-19131:
---

 Summary: Assist reflection iO with WrappedOperations class
 Key: HADOOP-19131
 URL: https://issues.apache.org/jira/browse/HADOOP-19131
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs, fs/azure, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


parquet, avro etc are still stuck building with older hadoop releases. 

This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 
years old (!) such as HADOOP-15229 just aren't picked up.

This lack of openFile() adoption hurts working with files in cloud storage as
* extra HEAD requests are made
* read policies can't be explicitly set
* split start/end can't be passed down

Proposed
# create class org.apache.hadoop.io.WrappedOperations
# add methods to wrap the APIs
# test in contract tests via reflection loading, to verify we have done it 
properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-19131) Assist reflection iO with WrappedOperations class

2024-03-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-19131:
---

Assignee: Steve Loughran

> Assist reflection iO with WrappedOperations class
> -
>
> Key: HADOOP-19131
> URL: https://issues.apache.org/jira/browse/HADOOP-19131
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> parquet, avro etc are still stuck building with older hadoop releases. 
> This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 
> years old (!) such as HADOOP-15229 just aren't picked up.
> This lack of openFile() adoption hurts working with files in cloud storage as
> * extra HEAD requests are made
> * read policies can't be explicitly set
> * split start/end can't be passed down
> Proposed
> # create class org.apache.hadoop.io.WrappedOperations
> # add methods to wrap the APIs
> # test in contract tests via reflection loading, to verify we have done it 
> properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


