[jira] [Resolved] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-09-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18888.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager
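A minimal sketch of the direction described above, assuming the v2 SDK's S3AsyncClientBuilder.multipartEnabled flag; "multipartAllowed" is a hypothetical value derived from the s3a configuration, and this is not the committed fix:

{code:java}
// hedged sketch: propagate the s3a multipart setting to the builder
// instead of always enabling it. "multipartAllowed" is hypothetical.
S3AsyncClient client = S3AsyncClient.builder()
    .multipartEnabled(multipartAllowed)
    .build();
{code}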



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18901) [Java17] Create maven profile for running unit tests

2023-09-15 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765659#comment-17765659
 ] 

Steve Loughran commented on HADOOP-18901:
-

+whatever options are needed for forked failsafe/surefire processes to work

note also that forked terasort test runs play up too. presumably the fixes to 
the MR job launcher will get that

> [Java17] Create maven profile for running unit tests
> 
>
> Key: HADOOP-18901
> URL: https://issues.apache.org/jira/browse/HADOOP-18901
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Bruno Ramirez
>Priority: Major
>
> Added build plugin to use JAVA17_HOME for testing while using JDK 17 profile



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18901) [Java17] Create maven profile for running unit tests

2023-09-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18901:

Summary: [Java17] Create maven profile for running unit tests  (was: Create 
maven profile for running unit tests)

> [Java17] Create maven profile for running unit tests
> 
>
> Key: HADOOP-18901
> URL: https://issues.apache.org/jira/browse/HADOOP-18901
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Bruno Ramirez
>Priority: Major
>
> Added build plugin to use JAVA17_HOME for testing while using JDK 17 profile



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18904) get local file system fails with a casting error

2023-09-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18904:

Affects Version/s: 3.3.6

> get local file system fails with a casting error
> 
>
> Key: HADOOP-18904
> URL: https://issues.apache.org/jira/browse/HADOOP-18904
> Project: Hadoop Common
>  Issue Type: Bug
>Affects Versions: 3.3.6
>Reporter: ConfX
>Priority: Trivial
>  Labels: pull-request-available
>
> h2. What happened
> After setting {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}, 
> trying to acquire the local file system using {{getLocal}} in 
> {{org.apache.hadoop.fs.FileSystem}} fails with a {{java.lang.ClassCastException}}.
> h2. Where's the bug
> In the function {{getLocal}} of {{FileSystem}} in HCommon:
> {code:java}
>   public static LocalFileSystem getLocal(Configuration conf)
>     throws IOException {
>     return (LocalFileSystem)get(LocalFileSystem.NAME, conf);
>   } {code}
> the returned file system is directly cast to LocalFileSystem without any 
> check. If the user sets the implementation of the local filesystem to be 
> Raw rather than Checksum, this type cast fails.
> h2. How to reproduce
>  # Set {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}
>  # Run the following test in HBase: 
> {{org.apache.hadoop.hbase.TestHBaseTestingUtility#testMiniDFSCluster}}
> and the following exception should be observed:
> {code:java}
> java.lang.ClassCastException: class org.apache.hadoop.fs.RawLocalFileSystem 
> cannot be cast to class org.apache.hadoop.fs.LocalFileSystem 
> (org.apache.hadoop.fs.RawLocalFileSystem and 
> org.apache.hadoop.fs.LocalFileSystem are in unnamed module of loader 'app')
>   at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:441)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.getNewDataTestDirOnTestFS(HBaseTestingUtility.java:550)
> ...{code}
> Alternatively, simply set the configuration parameter and call the method 
> with a {{Configuration}} object; the exception will be triggered.
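One possible mitigation, sketched here as a hypothetical variant of the method quoted above (this is not the committed fix):

{code:java}
// hypothetical defensive variant: check the runtime type before casting,
// so a RawLocalFileSystem configured via fs.file.impl fails with a clear
// message rather than a ClassCastException.
public static LocalFileSystem getLocal(Configuration conf)
    throws IOException {
  FileSystem fs = get(LocalFileSystem.NAME, conf);
  if (!(fs instanceof LocalFileSystem)) {
    throw new IOException("fs.file.impl is set to "
        + fs.getClass().getName() + ", which is not a LocalFileSystem");
  }
  return (LocalFileSystem) fs;
}
{code}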



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18904) get local file system fails with a casting error

2023-09-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18904:

Component/s: fs

> get local file system fails with a casting error
> 
>
> Key: HADOOP-18904
> URL: https://issues.apache.org/jira/browse/HADOOP-18904
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: ConfX
>Priority: Trivial
>  Labels: pull-request-available
>
> h2. What happened
> After setting {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}, 
> trying to acquire the local file system using {{getLocal}} in 
> {{org.apache.hadoop.fs.FileSystem}} fails with a {{java.lang.ClassCastException}}.
> h2. Where's the bug
> In the function {{getLocal}} of {{FileSystem}} in HCommon:
> {code:java}
>   public static LocalFileSystem getLocal(Configuration conf)
>     throws IOException {
>     return (LocalFileSystem)get(LocalFileSystem.NAME, conf);
>   } {code}
> the returned file system is directly cast to LocalFileSystem without any 
> check. If the user sets the implementation of the local filesystem to be 
> Raw rather than Checksum, this type cast fails.
> h2. How to reproduce
>  # Set {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}
>  # Run the following test in HBase: 
> {{org.apache.hadoop.hbase.TestHBaseTestingUtility#testMiniDFSCluster}}
> and the following exception should be observed:
> {code:java}
> java.lang.ClassCastException: class org.apache.hadoop.fs.RawLocalFileSystem 
> cannot be cast to class org.apache.hadoop.fs.LocalFileSystem 
> (org.apache.hadoop.fs.RawLocalFileSystem and 
> org.apache.hadoop.fs.LocalFileSystem are in unnamed module of loader 'app')
>   at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:441)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.getNewDataTestDirOnTestFS(HBaseTestingUtility.java:550)
> ...{code}
> Alternatively, simply set the configuration parameter and call the method 
> with a {{Configuration}} object; the exception will be triggered.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18904) get local file system fails with a casting error

2023-09-15 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18904:

Priority: Trivial  (was: Major)

> get local file system fails with a casting error
> 
>
> Key: HADOOP-18904
> URL: https://issues.apache.org/jira/browse/HADOOP-18904
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: ConfX
>Priority: Trivial
>  Labels: pull-request-available
>
> h2. What happened
> After setting {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}, 
> trying to acquire the local file system using {{getLocal}} in 
> {{org.apache.hadoop.fs.FileSystem}} fails with a {{java.lang.ClassCastException}}.
> h2. Where's the bug
> In the function {{getLocal}} of {{FileSystem}} in HCommon:
> {code:java}
>   public static LocalFileSystem getLocal(Configuration conf)
>     throws IOException {
>     return (LocalFileSystem)get(LocalFileSystem.NAME, conf);
>   } {code}
> the returned file system is directly cast to LocalFileSystem without any 
> check. If the user sets the implementation of the local filesystem to be 
> Raw rather than Checksum, this type cast fails.
> h2. How to reproduce
>  # Set {{fs.file.impl=org.apache.hadoop.fs.RawLocalFileSystem}}
>  # Run the following test in HBase: 
> {{org.apache.hadoop.hbase.TestHBaseTestingUtility#testMiniDFSCluster}}
> and the following exception should be observed:
> {code:java}
> java.lang.ClassCastException: class org.apache.hadoop.fs.RawLocalFileSystem 
> cannot be cast to class org.apache.hadoop.fs.LocalFileSystem 
> (org.apache.hadoop.fs.RawLocalFileSystem and 
> org.apache.hadoop.fs.LocalFileSystem are in unnamed module of loader 'app')
>   at org.apache.hadoop.fs.FileSystem.getLocal(FileSystem.java:441)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.getNewDataTestDirOnTestFS(HBaseTestingUtility.java:550)
> ...{code}
> Alternatively, simply set the configuration parameter and call the method 
> with a {{Configuration}} object; the exception will be triggered.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18883) Expect-100 JDK bug resolution: prevent multiple server calls

2023-09-14 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765286#comment-17765286
 ] 

Steve Loughran commented on HADOOP-18883:
-

thanks. 

> Expect-100 JDK bug resolution: prevent multiple server calls
> 
>
> Key: HADOOP-18883
> URL: https://issues.apache.org/jira/browse/HADOOP-18883
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Assignee: Pranav Saxena
>Priority: Major
> Fix For: 3.4.0
>
>
> This relates to the JDK bug [https://bugs.openjdk.org/browse/JDK-8314978].
>  
> With the current implementation of HttpURLConnection, if the server rejects 
> the “Expect: 100-continue” header, a ‘java.net.ProtocolException’ is thrown 
> from the 'expect100Continue()' method.
> After that exception, if we call any other method on the same instance 
> (e.g. getHeaderField() or getHeaderFields()), it will internally call 
> getOutputStream(), which invokes writeRequests(), which makes the actual 
> server call.
> In AbfsHttpOperation, after sendRequest() we call the processResponse() 
> method from AbfsRestOperation. Even if conn.getOutputStream() fails due to 
> an expect-100 error, we consume the exception and let the code go ahead. So 
> getHeaderField() / getHeaderFields() / getHeaderFieldLong() can be triggered 
> after getOutputStream() has failed, and these invocations will lead to 
> further server calls.
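To make the sequence concrete, here is a hedged sketch of the guard the fix implies; variable and field names are illustrative, not the ABFS code:

{code:java}
// hedged sketch: remember that getOutputStream() threw on the rejected
// Expect: 100-continue, and skip header reads that would otherwise
// re-invoke getOutputStream() -> writeRequests() -> a second server call.
boolean expect100Failed = false;
try {
  conn.getOutputStream().write(payload);
} catch (java.net.ProtocolException e) {
  expect100Failed = true;  // server rejected the 100-continue
}
if (!expect100Failed) {
  String etag = conn.getHeaderField("ETag");  // safe only on success
}
{code}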



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-09-14 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765285#comment-17765285
 ] 

Steve Loughran commented on HADOOP-18860:
-

no, i mean it failed in the hadoop client jar as the rules for doing the 
shading have now changed. they'll need to be updated (not anything I've ever 
done) as part of another iteration on this.

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18895) upgrade to commons-compress 1.24.0 due to CVE

2023-09-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18895:

Fix Version/s: 3.4.0

> upgrade to commons-compress 1.24.0 due to CVE
> -
>
> Key: HADOOP-18895
> URL: https://issues.apache.org/jira/browse/HADOOP-18895
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Includes some important bug fixes including 
> https://lists.apache.org/thread/g9lrsz8j9nrgltcoc7v6cpkopg07czc9 - 
> CVE-2023-42503



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18895) upgrade to commons-compress 1.24.0 due to CVE

2023-09-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18895:
---

Assignee: PJ Fanning

> upgrade to commons-compress 1.24.0 due to CVE
> -
>
> Key: HADOOP-18895
> URL: https://issues.apache.org/jira/browse/HADOOP-18895
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: PJ Fanning
>Assignee: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Includes some important bug fixes including 
> https://lists.apache.org/thread/g9lrsz8j9nrgltcoc7v6cpkopg07czc9 - 
> CVE-2023-42503



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18895) upgrade to commons-compress 1.24.0 due to CVE

2023-09-14 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18895:

Component/s: build

> upgrade to commons-compress 1.24.0 due to CVE
> -
>
> Key: HADOOP-18895
> URL: https://issues.apache.org/jira/browse/HADOOP-18895
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: PJ Fanning
>Priority: Major
>  Labels: pull-request-available
>
> Includes some important bug fixes including 
> https://lists.apache.org/thread/g9lrsz8j9nrgltcoc7v6cpkopg07czc9 - 
> CVE-2023-42503



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18877) AWS SDK V2 - Move to S3 Java async client

2023-09-14 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17765234#comment-17765234
 ] 

Steve Loughran commented on HADOOP-18877:
-

the interface here needs to be pluggable so that hboss and others trying to 
mock s3 have a simple point to mock. see HBASE-28056

this means that we should make it another configuration option, just one for 
wiring up in tests. maybe: add a way to set it before initialization, and in 
init, if it has been set, don't create a new one? we could provide both getter 
and setter in the S3AInternals interface; a sketch follows.
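A hedged sketch of what that getter/setter pair might look like; the method names are illustrative, not a committed API:

{code:java}
// hedged sketch of the proposal above; not a committed S3AInternals API.
public interface S3AInternals {
  /** the async client in use, e.g. for assertions in tests. */
  S3AsyncClient getAsyncClient();
  /** inject a (mock) client before initialize(); ignored afterwards. */
  void setAsyncClient(S3AsyncClient client);
}
{code}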

> AWS SDK V2 - Move to S3 Java async client
> -
>
> Key: HADOOP-18877
> URL: https://issues.apache.org/jira/browse/HADOOP-18877
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Ahmar Suhail
>Priority: Major
>
> With the upgrade, S3A now has two S3 clients: the Java async client and the 
> Java sync client.
> Java async is required for the transfer manager.
> Java sync is used for everything else. 
>  
> * Move all operations to use the Java async client and remove the sync 
> client. 
> * Provide option to configure java async client with the CRT HTTP client. 
> * Create a new interface for S3Client operations and move them out of S3AFS. 
> The interface will take a request and a span, and return a response.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18887) Java 17 Runtime Support

2023-09-13 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764788#comment-17764788
 ] 

Steve Loughran commented on HADOOP-18887:
-

# happy to watch this, but I've not got time to review anything
# changes like jar updates are best isolated into their own jiras - maybe 
create them under the big "java17" jira but declare them as requirements for 
this one. just knowing what needs fixing is a good first step; so is launching 
mr jobs with the right --add-opens options for every process launched.
# some of my colleagues have been busy with java17; they may have some input 
here. 

> Java 17 Runtime Support
> ---
>
> Key: HADOOP-18887
> URL: https://issues.apache.org/jira/browse/HADOOP-18887
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: build
>Reporter: Bruno Ramirez
>Priority: Major
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> This JIRA feature aims to extend the Java runtime support for Hadoop to 
> include Java 17 in addition to Java 8 and Java 11. Currently Hadoop version 
> 3.3.6 supports Java 8 compile and Java 8/11 runtime. The goal is to make 
> Hadoop compatible with Java 17 runtime as well.
> The plan for this release is to allow Hadoop to default to Java 11/Java 17, 
> while still providing the flexibility for customers to configure Hadoop to 
> use Java 8, Java 11, or Java 17 based on their specific needs. This project's 
> objectives include:
>  # Certifying that Hadoop works seamlessly on Java 8/11/17 for common use 
> cases.
>  # Ensuring that running Hadoop on Java 11/17 does not disrupt other 
> applications and libraries such as Spark, Hive, Flink, Presto/Trino, and 
> HBase.
> The decision to support Java 17 runtime is motivated by customer requests and 
> significant performance improvements observed in downstream applications like 
> Apache Hive and Apache Spark. The testing process encompasses unit tests, 
> integration tests, and performance tests, as well as verifying the proper 
> functioning of all Hadoop daemons with Java 17.
> The project will address compile time issues across various Hadoop 
> components, ensuring that Hadoop remains compatible with Java 17 throughout 
> the entire codebase.
> This ticket serves as a vital step in enhancing Hadoop's capabilities, 
> providing customers with more choices and improved performance for their big 
> data processing needs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-09-13 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764785#comment-17764785
 ] 

Steve Loughran commented on HADOOP-18860:
-

[~asrani_anmol] have you got another iteration of this PR which includes the 
client lib changes needed?

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-09-12 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18888:
---

Assignee: Steve Loughran

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18890) remove okhttp usage

2023-09-12 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764304#comment-17764304
 ] 

Steve Loughran commented on HADOOP-18890:
-

where's it being used? i guess i can work it out by commenting out the dependency

> remove okhttp usage
> ---
>
> Key: HADOOP-18890
> URL: https://issues.apache.org/jira/browse/HADOOP-18890
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Reporter: PJ Fanning
>Priority: Major
>
> * relates to HADOOP-18496
> * simplifies the dependencies if hadoop doesn't use multiple 3rd party libs 
> to make http calls
> * okhttp brings in other dependencies like the kotlin runtime
> * hadoop already uses apache httpclient in some places



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-09-12 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764271#comment-17764271
 ] 

Steve Loughran commented on HADOOP-18888:
-

for the xfer manager i'm going to fix this by just doing a direct copy call 
whenever the size is below the multipart threshold. needed for HBoss mock tests 
to work without stubbing all of the async client

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18889) S3A: V2 SDK client does not work with third-party store

2023-09-12 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18889:
---

 Summary: S3A: V2 SDK client does not work with third-party store
 Key: HADOOP-18889
 URL: https://issues.apache.org/jira/browse/HADOOP-18889
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran



testing against an external store without specifying a region now blows up 
because the region is queried off eu-west-1.

What are we to do here? require the region setting, which wasn't needed before? 
what region do we even provide for third-party stores?




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18889) S3A: V2 SDK client does not work with third-party store

2023-09-12 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764238#comment-17764238
 ] 

Steve Loughran commented on HADOOP-18889:
-



{code}

2023-09-12 15:51:20,237 [main] INFO  diag.StoreDiag 
(StoreDurationInfo.java:close(137)) - Duration of Creating filesystem for 
s3a://external-store/: 0:01:172
java.nio.file.AccessDeniedException: external-store: getS3Region on 
external-store: software.amazon.awssdk.services.s3.model.S3Exception: null 
(Service: S3, Status Code: 403, Request ID: B4X3E3K7JJMW31HT, Extended Request 
ID: 
11O4OGPp95JlbmEszl7NiiMBBL73AVpgO1XdjkSvZoyjslpWj8nATQJ/5SkzXw8W1Puz/bPZ0fg=):null
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:235)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getS3Region$4(S3AFileSystem.java:1039)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2631)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getS3Region(S3AFileSystem.java:1038)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:982)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:622)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3452)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:162)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3557)
at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:3510)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:575)
at 
org.apache.hadoop.fs.store.diag.StoreDiag.executeFileSystemOperations(StoreDiag.java:755)
at org.apache.hadoop.fs.store.diag.StoreDiag.run(StoreDiag.java:241)
at org.apache.hadoop.fs.store.diag.StoreDiag.run(StoreDiag.java:176)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:81)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:95)
at org.apache.hadoop.fs.store.diag.StoreDiag.exec(StoreDiag.java:1171)
at org.apache.hadoop.fs.store.diag.StoreDiag.main(StoreDiag.java:1180)
at storediag.main(storediag.java:25)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 403, Request ID: B4X3E3K7JJMW31HT, Extended Request ID: 
11O4OGPp95JlbmEszl7NiiMBBL73AVpgO1XdjkSvZoyjslpWj8nATQJ/5SkzXw8W1Puz/bPZ0fg=)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:95)
at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:270)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
{code}

[jira] [Commented] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-09-12 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764212#comment-17764212
 ] 

Steve Loughran commented on HADOOP-18888:
-

+transfer manager is always constructed with its own thread pool. prefer to use 
the s3a one.

> S3A. createS3AsyncClient() always enables multipart
> ---
>
> Key: HADOOP-18888
> URL: https://issues.apache.org/jira/browse/HADOOP-18888
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
> multipart enabled; if it is disabled in s3a config it should be disabled here 
> and in the transfer manager



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18888) S3A. createS3AsyncClient() always enables multipart

2023-09-12 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18888:
---

 Summary: S3A. createS3AsyncClient() always enables multipart
 Key: HADOOP-18888
 URL: https://issues.apache.org/jira/browse/HADOOP-18888
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran



DefaultS3ClientFactory.createS3AsyncClient() always creates clients with 
multipart enabled; if it is disabled in s3a config it should be disabled here 
and in the transfer manager



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18673) AWS SDK V2 - Refactor getS3Region & other follow up items

2023-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18673:

Description: 
* Factor getS3Region into its own ExecutingStoreOperation;
 * Fix issue with getXAttr(/)
 * Look at adding flexible checksum support

  was:
* Factor getS3Region into its own ExecutingStoreOperation;
 * Remove InconsistentS3ClientFactory.
 * Fix issue with getXAttr(/)
 * Look at adding flexible checksum support


> AWS SDK V2 - Refactor getS3Region & other follow up items 
> --
>
> Key: HADOOP-18673
> URL: https://issues.apache.org/jira/browse/HADOOP-18673
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> * Factor getS3Region into its own ExecutingStoreOperation;
>  * Fix issue with getXAttr(/)
>  * Look at adding flexible checksum support



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18073) S3A: Upgrade AWS SDK to V2

2023-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18073.
-
Fix Version/s: 3.4.0
 Release Note: 
The S3A connector now uses the V2 AWS SDK. 
This is a significant change at the source code level.
Any applications using the internal extension/override points in
the filesystem connector are likely to break.
Consult the document aws_sdk_upgrade for the full details.
   Resolution: Fixed

Merged the feature branch into trunk as one squashed patch; HADOOP-18886 
created for all the outstanding issues

> S3A: Upgrade AWS SDK to V2
> --
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18886) S3A: AWS SDK V2 Migration: stabilization

2023-09-11 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18886:
---

 Summary: S3A: AWS SDK V2 Migration: stabilization
 Key: HADOOP-18886
 URL: https://issues.apache.org/jira/browse/HADOOP-18886
 Project: Hadoop Common
  Issue Type: New Feature
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran
Assignee: Ahmar Suhail


The final stabilisation changes to the V2 SDK Migration; these moved off the 
HADOOP-18073 JIRA so we can close that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18073) S3A: Upgrade AWS SDK to V2

2023-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18073:

Summary: S3A: Upgrade AWS SDK to V2  (was: Upgrade AWS SDK to v2)

> S3A: Upgrade AWS SDK to V2
> --
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when setting 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18571) AWS SDK V2 - Qualify the upgrade.

2023-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18571.
-
Fix Version/s: 3.4.0
   Resolution: Done

> AWS SDK V2 - Qualify the upgrade. 
> --
>
> Key: HADOOP-18571
> URL: https://issues.apache.org/jira/browse/HADOOP-18571
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
> Fix For: 3.4.0
>
>
> Run tests as per [qualifying an aws sdk 
> update|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md#-qualifying-an-aws-sdk-update]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18818) Merge aws v2 upgrade feature branch into trunk

2023-09-11 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18818.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> Merge aws v2 upgrade feature branch into trunk
> --
>
> Key: HADOOP-18818
> URL: https://issues.apache.org/jira/browse/HADOOP-18818
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> do the merge, with everything we need for it marked as a blocker of this 
> task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18722) Optimise S3A delete objects when multiObjectDelete is disabled

2023-09-11 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763693#comment-17763693
 ] 

Steve Loughran commented on HADOOP-18722:
-

my proposed bulk delete API should work with this parallelisation too

> Optimise S3A delete objects when multiObjectDelete is disabled
> --
>
> Key: HADOOP-18722
> URL: https://issues.apache.org/jira/browse/HADOOP-18722
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Mehakmeet Singh
>Assignee: Mehakmeet Singh
>Priority: Major
>
> Currently, for doing a bulk delete in S3A, we rely on the multiObjectDelete 
> call, but when this property is disabled we delete one key at a time. We can 
> optimize this scenario by adding parallelism.
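A hedged sketch of that parallelism idea using the v2 SDK, not S3A code; "s3", "bucket", "keys" and "pool" are assumed to already exist:

{code:java}
// hedged sketch: issue one deleteObject per key on a bounded executor
// and wait for all of them to complete.
List<CompletableFuture<Void>> deletes = keys.stream()
    .map(key -> CompletableFuture.runAsync(
        () -> s3.deleteObject(b -> b.bucket(bucket).key(key)), pool))
    .collect(Collectors.toList());
CompletableFuture.allOf(deletes.toArray(new CompletableFuture[0])).join();
{code}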



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18883) Expect-100 JDK bug resolution: prevent multiple server calls

2023-09-08 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763062#comment-17763062
 ] 

Steve Loughran commented on HADOOP-18883:
-

ok, so what does this mean? there's been ongoing work with 100-continue in 
HADOOP-18865. is this now broken? 

(wonder if that jdk bug is from the same andrew wang who used to work at 
cloudera and is now at Airtable) 

> Expect-100 JDK bug resolution: prevent multiple server calls
> 
>
> Key: HADOOP-18883
> URL: https://issues.apache.org/jira/browse/HADOOP-18883
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Reporter: Pranav Saxena
>Assignee: Pranav Saxena
>Priority: Major
> Fix For: 3.4.0
>
>
> This relates to the JDK bug [https://bugs.openjdk.org/browse/JDK-8314978].
>  
> With the current implementation of HttpURLConnection, if the server rejects 
> the “Expect: 100-continue” header, a ‘java.net.ProtocolException’ is thrown 
> from the 'expect100Continue()' method.
> After that exception, if we call any other method on the same instance 
> (e.g. getHeaderField() or getHeaderFields()), it will internally call 
> getOutputStream(), which invokes writeRequests(), which makes the actual 
> server call.
> In AbfsHttpOperation, after sendRequest() we call the processResponse() 
> method from AbfsRestOperation. Even if conn.getOutputStream() fails due to 
> an expect-100 error, we consume the exception and let the code go ahead. So 
> getHeaderField() / getHeaderFields() / getHeaderFieldLong() can be triggered 
> after getOutputStream() has failed, and these invocations will lead to 
> further server calls.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18884) [ABFS] Support VectorIO in ABFS Input Stream

2023-09-07 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18884:
---

 Summary: [ABFS] Support VectorIO in ABFS Input Stream
 Key: HADOOP-18884
 URL: https://issues.apache.org/jira/browse/HADOOP-18884
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: 3.3.9
Reporter: Steve Loughran


the hadoop vector IO APIs are supported in file:// and s3a://; there's a hive 
ORC patch for this and PARQUET-2171 adds it for parquet - after which all apps 
using the library with a matching hadoop version and the feature enabled will 
get a significant speedup.

abfs needs to support it too, which needs support for parallel GET requests for 
different ranges
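For reference, a hedged sketch of how the vector IO API is consumed; ABFS would supply the parallel GETs behind this call ("fs" and "path" are assumed to exist, error handling omitted):

{code:java}
// hedged usage sketch of PositionedReadable.readVectored (hadoop 3.3.5+).
List<FileRange> ranges = Arrays.asList(
    FileRange.createFileRange(0, 4096),
    FileRange.createFileRange(1 << 20, 8192));
try (FSDataInputStream in = fs.open(path)) {
  in.readVectored(ranges, ByteBuffer::allocate);
  for (FileRange r : ranges) {
    ByteBuffer data = r.getData().join();  // each range completes asynchronously
  }
}
{code}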



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18447) Vectored IO: Threadpool should be closed on interrupts or during close calls

2023-09-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18447.
-
Resolution: Duplicate

HADOOP-18347 uses a bounded pool, so it is shut down in fs.close()

> Vectored IO: Threadpool should be closed on interrupts or during close calls
> 
>
> Key: HADOOP-18447
> URL: https://issues.apache.org/jira/browse/HADOOP-18447
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: common, fs, fs/adl, fs/s3
>Affects Versions: 3.3.5
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance, stability
> Attachments: Screenshot 2022-09-08 at 9.22.07 AM.png
>
>
> The vectored IO threadpool should be closed on any interrupt or during 
> S3AFileSystem/S3AInputStream close() calls.
> E.g. a query which got cancelled in the middle of the run: however, in the 
> background (e.g. LLAP) the vectored IO threads continued to run.
>  
> !Screenshot 2022-09-08 at 9.22.07 AM.png|width=537,height=164!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18447) Vectored IO: Threadpool should be closed on interrupts or during close calls

2023-09-07 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18447:

Affects Version/s: 3.3.5

> Vectored IO: Threadpool should be closed on interrupts or during close calls
> 
>
> Key: HADOOP-18447
> URL: https://issues.apache.org/jira/browse/HADOOP-18447
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: common, fs, fs/adl, fs/s3
>Affects Versions: 3.3.5
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance, stability
> Attachments: Screenshot 2022-09-08 at 9.22.07 AM.png
>
>
> The vectored IO threadpool should be closed on any interrupt or during 
> S3AFileSystem/S3AInputStream close() calls.
> E.g. a query which got cancelled in the middle of the run: however, in the 
> background (e.g. LLAP) the vectored IO threads continued to run.
>  
> !Screenshot 2022-09-08 at 9.22.07 AM.png|width=537,height=164!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18691) Add a CallerContext getter on the Schedulable interface

2023-09-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762791#comment-17762791
 ] 

Steve Loughran commented on HADOOP-18691:
-

thanks, just wanted to cross-link so that the cross-project dependencies were 
known.

> Add a CallerContext getter on the Schedulable interface
> ---
>
> Key: HADOOP-18691
> URL: https://issues.apache.org/jira/browse/HADOOP-18691
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Christos Bisias
>Assignee: Christos Bisias
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> We would like to add a default *{color:#00875a}CallerContext{color}* getter 
> on the *{color:#00875a}Schedulable{color}* interface
> {code:java}
> default public CallerContext getCallerContext() {
>   return null;  
> } {code}
> and then override it on the 
> *{color:#00875a}ipc/Server.Call{color}* class
> {code:java}
> @Override
> public CallerContext getCallerContext() {  
>   return this.callerContext;
> } {code}
> to expose the already existing *{color:#00875a}callerContext{color}* field.
>  
> This change will help us access the *{color:#00875a}CallerContext{color}* on 
> an Apache Ozone *{color:#00875a}IdentityProvider{color}* implementation.
> On Ozone side the *{color:#00875a}FairCallQueue{color}* doesn't work with the 
> Ozone S3G, because all users are masked under a special S3G user and there is 
> no impersonation. Therefore, the FCQ reads only 1 user and becomes 
> ineffective. We can use the *{color:#00875a}CallerContext{color}* field to 
> store the current user and access it on the Ozone 
> {*}{color:#00875a}IdentityProvider{color}{*}.
>  
> This is a presentation with the proposed approach.
> [https://docs.google.com/presentation/d/1iChpCz_qf-LXiPyvotpOGiZ31yEUyxAdU4RhWMKo0c0/edit#slide=id.p]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18879) Recommended Docker config file missing environment variable

2023-09-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762743#comment-17762743
 ] 

Steve Loughran commented on HADOOP-18879:
-

you got a PR here?

> Recommended Docker config file missing environment variable
> ---
>
> Key: HADOOP-18879
> URL: https://issues.apache.org/jira/browse/HADOOP-18879
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common, scripts
>Affects Versions: 3.3.6
> Environment: config 
> docker-compose.yaml 
>Reporter: Konstantin Doulepov
>Priority: Major
>
> The Docker config is missing {*}HADOOP_HOME=/opt/hadoop{*}; the docker 
> environment references this variable, so the docker container can't run the 
> examples and behaves erratically.
> The variable is currently not set in the environment, yet the config uses 
> it, e.g.
> MAPRED-SITE.XML_yarn.app.mapreduce.am.env=HADOOP_MAPRED_HOME=$HADOOP_HOME
> but it's not set in the docker container.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18691) Add a CallerContext getter on the Schedulable interface

2023-09-07 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762740#comment-17762740
 ] 

Steve Loughran commented on HADOOP-18691:
-

[~xBis] could you add a cross reference to the ozone change which needed this? 
thanks

> Add a CallerContext getter on the Schedulable interface
> ---
>
> Key: HADOOP-18691
> URL: https://issues.apache.org/jira/browse/HADOOP-18691
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Christos Bisias
>Assignee: Christos Bisias
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> We would like to add a default *{color:#00875a}CallerContext{color}* getter 
> on the *{color:#00875a}Schedulable{color}* interface
> {code:java}
> default public CallerContext getCallerContext() {
>   return null;  
> } {code}
> and then override it on the 
> *{color:#00875a}ipc/Server.Call{color}* class
> {code:java}
> @Override
> public CallerContext getCallerContext() {  
>   return this.callerContext;
> } {code}
> to expose the already existing *{color:#00875a}callerContext{color}* field.
>  
> This change will help us access the *{color:#00875a}CallerContext{color}* on 
> an Apache Ozone *{color:#00875a}IdentityProvider{color}* implementation.
> On Ozone side the *{color:#00875a}FairCallQueue{color}* doesn't work with the 
> Ozone S3G, because all users are masked under a special S3G user and there is 
> no impersonation. Therefore, the FCQ reads only 1 user and becomes 
> ineffective. We can use the *{color:#00875a}CallerContext{color}* field to 
> store the current user and access it on the Ozone 
> {*}{color:#00875a}IdentityProvider{color}{*}.
>  
> This is a presentation with the proposed approach.
> [https://docs.google.com/presentation/d/1iChpCz_qf-LXiPyvotpOGiZ31yEUyxAdU4RhWMKo0c0/edit#slide=id.p]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18876) ABFS: Change default from disk to bytebuffer for fs.azure.data.blocks.buffer

2023-09-04 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761843#comment-17761843
 ] 

Steve Loughran commented on HADOOP-18876:
-

seen more with the s3a client, which is what the buffering was taken from 
(HADOOP-13560 and links); then there was HADOOP-17195. 

If we can be confident that it behaves even on a system with 64 spark threads 
then I'll be happy, but we need to make sure that the max # of queued requests 
is bounded.

This would be a really good time to add support for the IOStatisticsContext 
into the abfs input and output streams: the s3a manifest committers will 
collect some of this (HADOOP-17461), though i've never quite been successful 
wiring it all the way through spark. now that spark is on hadoop 3.3.5+ we can 
actually do this without playing reflection games. 

> ABFS: Change default from disk to bytebuffer for fs.azure.data.blocks.buffer
> 
>
> Key: HADOOP-18876
> URL: https://issues.apache.org/jira/browse/HADOOP-18876
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.6
>
>
> Change default from disk to bytebuffer for fs.azure.data.blocks.buffer.
> Gathered from multiple workload runs, the presented data underscores a 
> noteworthy enhancement in performance. The adoption of ByteBuffer for 
> *reading operations* exhibited a remarkable improvement of approximately 
> *64.83%* when compared to traditional disk-based reading. Similarly, the 
> implementation of ByteBuffer for *write operations* yielded a substantial 
> efficiency gain of about {*}60.75%{*}. These findings underscore the 
> consistent and substantial advantages of integrating ByteBuffer across a 
> range of workload scenarios.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-09-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Fix Version/s: (was: 3.4.0)

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-09-01 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761354#comment-17761354
 ] 

Steve Loughran commented on HADOOP-18860:
-

yeah, sorry...we both got that wrong. will try harder.

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18878) remove/deprecate fs.s3a.multipart.purge

2023-09-01 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18878:
---

 Summary: remove/deprecate fs.s3a.multipart.purge
 Key: HADOOP-18878
 URL: https://issues.apache.org/jira/browse/HADOOP-18878
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.3.6
Reporter: Steve Loughran


the fs.s3a.multipart.purge option has been in for a long time, to help avoid 
running up costs from incomplete uploads, especially during testing.

but
* adds overhead on startup (list + delete)
* dangerous if one client has a very short lifespan; all active s3a committer 
jobs will be broken
* obsoleted by s3 lifecycle rules (see the sketch below)
* and "hadoop s3guard uploads" cli.

proposed: deprecate for the next release, delete after.
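A hedged sketch of the lifecycle-rule alternative, using the v2 SDK ("s3" and "bucket" are assumed to exist; the 7-day window is illustrative):

{code:java}
// hedged sketch: have the store itself abort incomplete multipart
// uploads, removing the need for a purge at client startup.
s3.putBucketLifecycleConfiguration(r -> r
    .bucket(bucket)
    .lifecycleConfiguration(c -> c.rules(
        LifecycleRule.builder()
            .id("abort-stale-multipart-uploads")
            .status(ExpirationStatus.ENABLED)
            .filter(f -> f.prefix(""))
            .abortIncompleteMultipartUpload(a -> a.daysAfterInitiation(7))
            .build())));
{code}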



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-17943) Add s3a tool to convert S3 server logs to avro/csv files

2023-09-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-17943:
---

Assignee: Mehakmeet Singh

> Add s3a tool to convert S3 server logs to avro/csv files
> 
>
> Key: HADOOP-17943
> URL: https://issues.apache.org/jira/browse/HADOOP-17943
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.2
>Reporter: Steve Loughran
>Assignee: Mehakmeet Singh
>Priority: Major
>
> Add s3a tool to convert S3 server logs to avro/csv files
> With S3A Auditing, we have code in hadoop-aws to parse s3 log entries, 
> including splitting up the referrer into its fields.
> But we don't have an easy way of using it. I've done some early work in spark 
> but as well as that code not working 
> ([https://github.com/hortonworks-spark/cloud-integration/blob/master/spark-cloud-integration/src/main/scala/com/cloudera/spark/cloud/s3/S3LogRecordParser.scala]),
>  it doesn't do the audit splitting.
>  And, given that the S3 audit logs can be small on a lightly loaded store, 
> not always justified.
> Proposed
> we add
>  # utility parser class to take a row and split it into a record
>  # which can be saved to avro through a schema we define
>  # or exported to CSV with/without headers. (with: easy to understand, 
> without: can cat files)
>  # add a mapper so this can be used in MR jobs (could even make it committer 
> test ..)
>  # and a "hadoop s3guard/hadoop s3" entry point so you can do it on the cli
> {code:java}
> hadoop s3 parselogs -format avro -out s3a://dest/path -recursive 
> s3a://stevel-london/logs/bucket1/*
> {code}
> would take all files under the path, load, parse and emit the output.
> design issues
>  * would you combine all files, or emit a new .avro or .csv file for each one?
>  * what's a good avro schema to cope with new context attributes
>  * CSV nuances: tabs vs spaces, use opencsv or implement the (escaping?) 
> writer ourselves.
>  me: TSV and do a minimal escaping and quoting emitter. Can use opencsv in 
> the test suite.
>  * would you want an initial filter during processing? especially for exit 
> codes?
>  me: no, though I could see the benefit for 503s. Best to let you load it 
> into a notebook or spreadsheet and go from there.
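
to make the proposal concrete, a minimal sketch of the row-splitting utility (illustrative only; no schema or escaping decisions made here). It splits a server log line into fields, honouring "quoted" and [bracketed] values:

{code:java}
import java.util.ArrayList;
import java.util.List;

// hedged sketch: no handling of escaped quotes inside fields.
class LogRowSplitter {
  static List<String> splitLogRow(String row) {
    List<String> fields = new ArrayList<>();
    int i = 0;
    while (i < row.length()) {
      char c = row.charAt(i);
      if (c == ' ') { i++; continue; }              // skip field separators
      char close = c == '"' ? '"' : c == '[' ? ']' : ' ';
      int start = (close == ' ') ? i : i + 1;       // drop opening quote/bracket
      int end = row.indexOf(close, start);
      if (end < 0) { end = row.length(); }
      fields.add(row.substring(start, end));
      i = end + 1;
    }
    return fields;
  }
}
{code}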



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18865) ABFS: Adding 100 continue in userAgent String and dynamically removing it if retry is without the header enabled.

2023-08-31 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18865:

Fix Version/s: 3.4.0
   (was: 3.3.6)

> ABFS: Adding 100 continue in userAgent String and dynamically removing it if 
> retry is without the header enabled.
> -
>
> Key: HADOOP-18865
> URL: https://issues.apache.org/jira/browse/HADOOP-18865
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Adding 100-continue in the userAgent string if enabled in AbfsConfiguration, and 
> dynamically removing it if the retry is performed without the header enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18328) S3A supports S3 on Outposts

2023-08-31 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18328.
-
Fix Version/s: 3.3.9
   Resolution: Fixed

> S3A supports S3 on Outposts
> ---
>
> Key: HADOOP-18328
> URL: https://issues.apache.org/jira/browse/HADOOP-18328
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Sotetsu Suzugamine
>Assignee: Sotetsu Suzugamine
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the endpoint for using S3 accesspoint is set as 
> "s3-accesspoint.%s.amazonaws.com" as follows.
> [https://github.com/apache/hadoop/blob/3ec4b932c179d9ec6c4e465f25e35b3d7eded08b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/ArnResource.java#L29]
> However, "s3-outposts.%s.amazonaws.com" is the preferred endpoint when 
> accessing S3 on Outposts bucket by accesspoint.
> This ticket improves them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18073) Upgrade AWS SDK to v2

2023-08-31 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760880#comment-17760880
 ] 

Steve Loughran commented on HADOOP-18073:
-

HBASE-28056 is needed to fix hboss fs test code breakage

> Upgrade AWS SDK to v2
> -
>
> Key: HADOOP-18073
> URL: https://issues.apache.org/jira/browse/HADOOP-18073
> Project: Hadoop Common
>  Issue Type: Task
>  Components: auth, fs/s3
>Affects Versions: 3.3.1
>Reporter: xiaowei sun
>Assignee: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Attachments: Upgrading S3A to SDKV2.pdf
>
>
> This task tracks upgrading Hadoop's AWS connector S3A from AWS SDK for Java 
> V1 to AWS SDK for Java V2.
> Original use case:
> {quote}We would like to access s3 with AWS SSO, which is supported in 
> software.amazon.awssdk:sdk-core:2.*.
> In particular, from 
> [https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html],
>  when to set 'fs.s3a.aws.credentials.provider', it must be 
> "com.amazonaws.auth.AWSCredentialsProvider". We would like to support 
> "software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider" which 
> supports AWS SSO, so users only need to authenticate once.
> {quote}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18752) Change fs.s3a.directory.marker.retention to "keep"

2023-08-31 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18752:

Release Note: The s3a connector no longer deletes directory markers by 
default, which speeds up write operations, reduces IO throttling and saves 
money. This can cause problems with older Hadoop releases trying to write to 
the same bucket (Hadoop 3.3.0; Hadoop 3.2.x before 3.2.2, and all previous 
releases). Set "fs.s3a.directory.marker.retention" to "delete" for backwards 
compatibility.  (was: The s3a connector no longer deletes directory markers by 
default; this can cause problems with older hadoop releases trying to write to 
the same bucket. Set fs.s3a.directory.marker.retention to delete for backwards 
compatibility)

> Change fs.s3a.directory.marker.retention to "keep"
> --
>
> Key: HADOOP-18752
> URL: https://issues.apache.org/jira/browse/HADOOP-18752
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Change the default value of "fs.s3a.directory.marker.retention" to keep; 
> update docs to match.
> maybe include with HADOOP-17802 so we don't blow up with fewer markers being 
> created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18856) Spark insertInto with location GCS bucket root not supported

2023-08-30 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760551#comment-17760551
 ] 

Steve Loughran commented on HADOOP-18856:
-

i'm declaring the hadoop part of this done. raising it to critical doesn't make 
it so in the codebase; that priority is reserved for loss-of-data problems
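
for anyone hitting this in their own code, a hedged workaround sketch: Path.suffix() NPEs on a root path because the parent is null, so build the child path directly instead:

{code:java}
import org.apache.hadoop.fs.Path;

Path root = new Path("gs://test_dd123/");
// root.suffix("/num=123") -> NullPointerException on a root path
Path child = new Path(root, "num=123");   // gs://test_dd123/num=123
{code}

(the spark-internal call site still needs an engine-side fix; this only covers user code.)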

> Spark insertInto with location GCS bucket root not supported
> 
>
> Key: HADOOP-18856
> URL: https://issues.apache.org/jira/browse/HADOOP-18856
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.3
>Reporter: Dipayan Dev
>Priority: Minor
>
>  
> {noformat}
> scala> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.fs.Path
> scala> val path: Path = new Path("gs://test_dd123/")
> path: org.apache.hadoop.fs.Path = gs://test_dd123/
> scala> path.suffix("/num=123")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.(Path.java:150)
>   at org.apache.hadoop.fs.Path.(Path.java:129)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:450){noformat}
>  
> Path.suffix throws an NPE when writing into a GCS bucket root. 
>  
> In our organisation, we are using a GCS bucket root location to point to our 
> Hive table. Dataproc's latest 2.1 uses *Hadoop* *3.3.3* and this needs to be 
> fixed in 3.3.3.
> Spark Scala code to reproduce this issue
> {noformat}
> val DF = Seq(("test1", 123)).toDF("name", "num")
> DF.write.option("path", 
> "gs://test_dd123/").mode(SaveMode.Overwrite).partitionBy("num").format("orc").saveAsTable("schema_name.table_name")
> val DF1 = Seq(("test2", 125)).toDF("name", "num")
> DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("schema_name.table_name")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.(Path.java:141)
>   at org.apache.hadoop.fs.Path.(Path.java:120)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:441)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:254)
>  {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18856) Spark insertInto with location GCS bucket root not supported

2023-08-30 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18856:

Priority: Minor  (was: Critical)

> Spark insertInto with location GCS bucket root not supported
> 
>
> Key: HADOOP-18856
> URL: https://issues.apache.org/jira/browse/HADOOP-18856
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.3.3
>Reporter: Dipayan Dev
>Priority: Minor
>
>  
> {noformat}
> scala> import org.apache.hadoop.fs.Path
> import org.apache.hadoop.fs.Path
> scala> val path: Path = new Path("gs://test_dd123/")
> path: org.apache.hadoop.fs.Path = gs://test_dd123/
> scala> path.suffix("/num=123")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.(Path.java:150)
>   at org.apache.hadoop.fs.Path.(Path.java:129)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:450){noformat}
>  
> Path.suffix throws an NPE when writing into a GCS bucket root. 
>  
> In our organisation, we are using a GCS bucket root location to point to our 
> Hive table. Dataproc's latest 2.1 uses *Hadoop* *3.3.3* and this needs to be 
> fixed in 3.3.3.
> Spark Scala code to reproduce this issue
> {noformat}
> val DF = Seq(("test1", 123)).toDF("name", "num")
> DF.write.option("path", 
> "gs://test_dd123/").mode(SaveMode.Overwrite).partitionBy("num").format("orc").saveAsTable("schema_name.table_name")
> val DF1 = Seq(("test2", 125)).toDF("name", "num")
> DF1.write.mode(SaveMode.Overwrite).format("orc").insertInto("schema_name.table_name")
> java.lang.NullPointerException
>   at org.apache.hadoop.fs.Path.(Path.java:141)
>   at org.apache.hadoop.fs.Path.(Path.java:120)
>   at org.apache.hadoop.fs.Path.suffix(Path.java:441)
>   at 
> org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.$anonfun$getCustomPartitionLocations$1(InsertIntoHadoopFsRelationCommand.scala:254)
>  {noformat}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18873) ABFS: AbfsOutputStream doesn't close DataBlocks object.

2023-08-30 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760540#comment-17760540
 ] 

Steve Loughran commented on HADOOP-18873:
-

yeah, this seems serious

+[~mehakmeet]

> ABFS: AbfsOutputStream doesn't close DataBlocks object.
> --
>
> Key: HADOOP-18873
> URL: https://issues.apache.org/jira/browse/HADOOP-18873
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.3.4
>Reporter: Pranav Saxena
>Assignee: Pranav Saxena
>Priority: Major
> Fix For: 3.3.4
>
>
> AbfsOutputStream doesn't close the dataBlock object created for the upload.
> What is the implication of not doing that?
> DataBlocks has three implementations:
>  # ByteArrayBlock
>  ## This creates an object of DataBlockByteArrayOutputStream (child of 
> ByteArrayOutputStream): a wrapper around a byte array for populating and 
> reading it.
>  ## This gets GCed.
>  # ByteBufferBlock:
>  ## There is a defined *DirectBufferPool* from which it tries to request a 
> directBuffer.
>  ## If nothing is in the pool, a new directBuffer is created.
>  ## The `close` method on this object has the responsibility of returning 
> the buffer to the pool so it can be reused.
>  ## Since we are not calling `close`:
>  ### The pool is rendered of little use, since each request creates a new 
> directBuffer from memory.
>  ### The objects can be GCed and the allocated direct memory may be 
> returned on GC; but if the process crashes, the memory never goes back, 
> causing memory issues on the machine.
>  # DiskBlock:
>  ## This creates a file on disk to which the data-to-upload is written. This 
> file gets deleted in startUpload().close().
>  
> startUpload() gives an object of BlockUploadData, which provides the method 
> `toByteArray()` used in abfsOutputStream to get the byteArray in the 
> dataBlock.
>  
> Method which uses the DataBlock object: 
> https://github.com/apache/hadoop/blob/fac7d26c5d7f791565cc3ab45d079e2cca725f95/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsOutputStream.java#L298
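
a minimal sketch of the missing cleanup (illustrative names, not the actual ABFS DataBlocks API): close() is what hands a direct buffer back to the pool instead of leaving it to GC.

{code:java}
import java.nio.ByteBuffer;
import org.apache.hadoop.util.DirectBufferPool;

class PooledBlock implements AutoCloseable {
  private final DirectBufferPool pool;
  private ByteBuffer buffer;

  PooledBlock(DirectBufferPool pool, int size) {
    this.pool = pool;
    this.buffer = pool.getBuffer(size);   // reuse a pooled direct buffer if one exists
  }

  @Override
  public void close() {
    if (buffer != null) {
      pool.returnBuffer(buffer);          // the step skipped when close() is never called
      buffer = null;
    }
  }
}
{code}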



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18871) S3ARetryPolicy to use sdk exception retryable() if it is valid

2023-08-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760019#comment-17760019
 ] 

Steve Loughran commented on HADOOP-18871:
-

+review S3AUtils.isMessageTranslatableToEOF() to see if that is still valid

> S3ARetryPolicy to use sdk exception retryable() if it is valid
> --
>
> Key: HADOOP-18871
> URL: https://issues.apache.org/jira/browse/HADOOP-18871
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> S3ARetryPolicy to use sdk exception retryable() if it is appropriate
> An initial {{RetryFromAWSClientIOException}} policy has been written, but not 
> turned on as there's too much risk of suddenly not retrying where we did 
> today.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18871) S3ARetryPolicy to use sdk exception retryable() if it is valid

2023-08-29 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18871:
---

 Summary: S3ARetryPolicy to use sdk exception retryable() if it is 
valid
 Key: HADOOP-18871
 URL: https://issues.apache.org/jira/browse/HADOOP-18871
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


S3ARetryPolicy to use sdk exception retryable() if it is appropriate

An initial {{RetryFromAWSClientIOException}} policy has been written, but not 
turned on as there's too much risk of suddenly not retrying where we did 
today.
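
a hedged sketch of the idea (not the shipped policy): consult the SDK's own retryable() hint before falling back to the existing classification.

{code:java}
import org.apache.hadoop.io.retry.RetryPolicy;
import software.amazon.awssdk.core.exception.SdkException;

// illustrative only; the real S3ARetryPolicy is far more nuanced.
class RetrySketch {
  static RetryPolicy.RetryAction classify(Exception e, RetryPolicy fallback,
      int retries, int failovers, boolean idempotent) throws Exception {
    if (e instanceof SdkException && ((SdkException) e).retryable()) {
      return RetryPolicy.RetryAction.RETRY;
    }
    return fallback.shouldRetry(e, retries, failovers, idempotent);
  }
}
{code}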



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18862) [JDK17] MiniYarnClusters don't launch in hadoop-aws integration tests

2023-08-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18862:

Parent Issue: HADOOP-17177  (was: HADOOP-16795)

> [JDK17] MiniYarnClusters don't launch in hadoop-aws integration tests
> -
>
> Key: HADOOP-18862
> URL: https://issues.apache.org/jira/browse/HADOOP-18862
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> I've tried running hadoop-aws tests under java17; everything which tries to 
> launch a MiniYarnCluster fails because google guice is trying to inject stuff 
> into the java.lang module, which is now forbidden
> {code}
> Caused by: java.lang.ExceptionInInitializerError: Exception 
> com.google.inject.internal.cglib.core.$CodeGenerationException: 
> java.lang.reflect.InaccessibleObjectException-->Unable to make protected 
> final java.lang.Class 
> java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
>  throws java.lang.ClassFormatError accessible: module java.base does not 
> "opens java.lang" to unnamed module @7ee7980d [in thread "Thread-109"]
> {code}
> short term fix is to add the params to the surefire and failsafe jvm launcher 
> to allow access
> {code}
> --add-opens java.base/java.lang=ALL-UNNAMED
> {code}
> I don't know if updating guice will make it go away completely. if it doesn't, 
> then the history server itself needs to be launched with this.
> rather than just adding an option for hadoop-aws, we ought to consider a general 
> cross-module variable for junit.jvm.options which is set everywhere; the base 
> impl is "" and a java profile could add the new stuff
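
until such a variable exists, a hedged local workaround is to pass the flag to the forked JVMs by hand, assuming the module's surefire/failsafe configuration honours the standard argLine property:

{noformat}
mvn verify -DargLine="--add-opens java.base/java.lang=ALL-UNNAMED"
{noformat}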



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18747) AWS SDK V2 - sigv2 support

2023-08-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760008#comment-17760008
 ] 

Steve Loughran commented on HADOOP-18747:
-

pending/without this, HADOOP-18857 proposes failing better

> AWS SDK V2 - sigv2 support
> --
>
> Key: HADOOP-18747
> URL: https://issues.apache.org/jira/browse/HADOOP-18747
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> AWS SDK V2 does not support sigV2 signing. However, the S3 client supports 
> configurable signers so a custom sigV2 signer can be implemented and 
> configured. 
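
a sketch of the plumbing only, hedged: the v2 SDK's Signer interface is what a custom implementation would plug into; the SigV2 algorithm itself is omitted and the helper name is hypothetical.

{code:java}
import software.amazon.awssdk.core.interceptor.ExecutionAttributes;
import software.amazon.awssdk.core.signer.Signer;
import software.amazon.awssdk.http.SdkHttpFullRequest;

public class SigV2Signer implements Signer {
  @Override
  public SdkHttpFullRequest sign(SdkHttpFullRequest request,
      ExecutionAttributes executionAttributes) {
    // computeSigV2Header is a hypothetical helper: the actual V2
    // string-to-sign / HMAC-SHA1 logic would live there.
    return request.toBuilder()
        .putHeader("Authorization", computeSigV2Header(request))
        .build();
  }

  private String computeSigV2Header(SdkHttpFullRequest request) {
    throw new UnsupportedOperationException("SigV2 signing not shown");
  }
}
{code}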



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18749) AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure

2023-08-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17760006#comment-17760006
 ] 

Steve Loughran commented on HADOOP-18749:
-

this is from HADOOP-18863; fix that and this test passes

> AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure
> -
>
> Key: HADOOP-18749
> URL: https://issues.apache.org/jira/browse/HADOOP-18749
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.4.0
>
>
> ITestS3AHugeFilesNoMultipart fails with
> java.lang.AssertionError: Expected a 
> org.apache.hadoop.fs.s3a.api.UnsupportedRequestException to be thrown, but 
> got the result: : true
> Happens because the transfer manager currently does not do any MPU when used 
> with the Java async client, so the UnsupportedRequestException never gets 
> thrown. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18749) AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure

2023-08-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18749.
-
Fix Version/s: 3.4.0
   Resolution: Duplicate

> AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure
> -
>
> Key: HADOOP-18749
> URL: https://issues.apache.org/jira/browse/HADOOP-18749
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.4.0
>
>
> ITestS3AHugeFilesNoMultipart fails with
> java.lang.AssertionError: Expected a 
> org.apache.hadoop.fs.s3a.api.UnsupportedRequestException to be thrown, but 
> got the result: : true
> Happens because the transfer manager currently does not do any MPU when used 
> with the Java async client, so the UnsupportedRequestException never gets 
> thrown. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18749) AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure

2023-08-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18749:
---

Assignee: Steve Loughran

> AWS SDK V2 - ITestS3AHugeFilesNoMultipart failure
> -
>
> Key: HADOOP-18749
> URL: https://issues.apache.org/jira/browse/HADOOP-18749
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Steve Loughran
>Priority: Major
>
> ITestS3AHugeFilesNoMultipart fails with
> java.lang.AssertionError: Expected a 
> org.apache.hadoop.fs.s3a.api.UnsupportedRequestException to be thrown, but 
> got the result: : true
> Happens because the transfer manager currently does not do any MPU when used 
> with the Java async client, so the UnsupportedRequestException never gets 
> thrown. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759919#comment-17759919
 ] 

Steve Loughran commented on HADOOP-18860:
-

merged to trunk, will backport. apologies to everyone who is going to have 
problems cherrypicking mockito changes back to older branches, but that goes 
with every mockito update

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Component/s: test

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-29 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Fix Version/s: 3.4.0

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build, test
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-29 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759904#comment-17759904
 ] 

Steve Loughran commented on HADOOP-18797:
-

oh, except one complication. currently the "__magic" path is the one which 
indicates "all writes of files don't get manifest here but have destination 
files created relative to the parent using the dirs under __base to indicate 
where the base dir is"

this'd need to be expanded to support more prefixes, e.g. __magic_job_${jobid}

one concern i had when doing the committer was "what if anyone used __magic in 
their code?". nobody has ever complained, but it's why we added an off switch. I 
think if someone tried to do a recursive dir copy *or distcp from hdfs to s3* 
bad things would happen, which is why there's that switch to turn it off if 
ever someone complained. to date: nobody
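
back to the prefix idea: a hedged sketch of path helpers for the __magic_job_${jobid} scheme (names illustrative, not committed code):

{code:java}
// a path element is "magic" if it starts with the marker; the owning
// job can then be recovered from that element.
class MagicNames {
  static final String MAGIC_PREFIX = "__magic_job_";

  static boolean isMagicElement(String pathElement) {
    return pathElement.startsWith(MAGIC_PREFIX);
  }

  static String jobOfMagicElement(String pathElement) {
    return pathElement.substring(MAGIC_PREFIX.length());  // e.g. the job UUID
  }
}
{code}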


> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to an 
> S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory prior to completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-28 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759656#comment-17759656
 ] 

Steve Loughran commented on HADOOP-18797:
-

#3 looks good. 
this would be a good simple change for the committers to get the full process of 
getting a pr through, if you want to have a go at it.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to an 
> S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory prior to completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18863) AWS SDK V2 - AuditFailureExceptions aren't being translated properly

2023-08-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18863:
---

Assignee: Steve Loughran

> AWS SDK V2 - AuditFailureExceptions aren't being translated properly
> 
>
> Key: HADOOP-18863
> URL: https://issues.apache.org/jira/browse/HADOOP-18863
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> {{ITestS3AHugeFilesNoMultipart}} is failing because the 
> {{AuditFailureException}} variant raised in the sdk handler is being wrapped 
> as it makes its way back to the s3a code - but S3AUtils.translateException() 
> isn't looking at the inner cause.
> looks like aws v2 sdk class {{.GenericMultipartHelper.handleException}} is 
> wrapping an SdkException with a SdkClientException even though it is not 
> needed.
> we probably have to start looking at the inner cause of any exception during 
> translation to see if that is also an AuditFailureException.
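
the translation-side fix could be a small cause-chain walk, sketched here (hedged; not the actual S3AUtils change):

{code:java}
// walk the cause chain so a wrapped exception of the wanted type is
// still recognised during exception translation.
class CauseWalk {
  static <T extends Throwable> T findCause(Throwable thrown, Class<T> clazz) {
    for (Throwable t = thrown; t != null; t = t.getCause()) {
      if (clazz.isInstance(t)) {
        return clazz.cast(t);
      }
    }
    return null;
  }
}
{code}

translateException() could then probe findCause(ex, AuditFailureException.class) before falling back to its existing mapping.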



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18863) AWS SDK V2 - AuditFailureExceptions aren't being translated properly

2023-08-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18863:

Description: 
{{ITestS3AHugeFilesNoMultipart}} is failing because the 
{{AuditFailureException}} variant raised in the sdk handler is being wrapped as 
it makes its way back to the s3a code - but S3AUtils.translateException() isn't 
looking at the inner cause.

looks like aws v2 sdk class {{.GenericMultipartHelper.handleException}} is 
wrapping an SdkException with a SdkClientException even though it is not needed.

we probably have to start looking at the inner cause of any exception during 
translation to see if that is also an AuditFailureException.

Filed https://github.com/aws/aws-sdk-java-v2/issues/4356

  was:
{{ITestS3AHugeFilesNoMultipart}} is failing because the 
{{AuditFailureException}} variant raised in the sdk handler is being wrapped as 
it makes its way back to the s3a code - but S3AUtils.translateException() isn't 
looking at the inner cause.

looks like aws v2 sdk class {{.GenericMultipartHelper.handleException}} is 
wrapping an SdkException with a SdkClientException even though it is not needed.

we probably have to start looking at the inner cause of any exception during 
translation to see if that is also an AuditFailureException.



> AWS SDK V2 - AuditFailureExceptions aren't being translated properly
> 
>
> Key: HADOOP-18863
> URL: https://issues.apache.org/jira/browse/HADOOP-18863
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> {{ITestS3AHugeFilesNoMultipart}} is failing because the 
> {{AuditFailureException}} variant raised in the sdk handler is being wrapped 
> as it makes its way back to the s3a code - but S3AUtils.translateException() 
> isn't looking at the inner cause.
> looks like aws v2 sdk class {{.GenericMultipartHelper.handleException}} is 
> wrapping an SdkException with a SdkClientException even though it is not 
> needed.
> we probably have to start looking at the inner cause of any exception during 
> translation to see if that is also an AuditFailureException.
> Filed https://github.com/aws/aws-sdk-java-v2/issues/4356



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18867) Upgrade ZooKeeper to 3.6.4

2023-08-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18867:

Affects Version/s: 3.3.6

> Upgrade ZooKeeper to 3.6.4
> --
>
> Key: HADOOP-18867
> URL: https://issues.apache.org/jira/browse/HADOOP-18867
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>  Labels: pull-request-available
>
> While ZooKeeper 3.6 is already EOL, we can upgrade to the final release of 
> the ZooKeeper 3.6 line as a short-term fix until bumping to ZooKeeper 3.7 or 
> later. Dependency convergence error must be addressed on {{-Dhbase.profile=2.0}}.
> {noformat}
> $ mvn clean install -Dzookeeper.version=3.6.4 -Dhbase.profile=2.0 -DskipTests 
> clean install
> Dependency convergence error for 
> org.apache.yetus:audience-annotations:jar:0.13.0:compile paths to dependency 
> are:
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-common:test-jar:tests:3.4.0-SNAPSHOT:test
> +-org.apache.zookeeper:zookeeper:jar:3.6.4:compile
>   +-org.apache.zookeeper:zookeeper-jute:jar:3.6.4:compile
> +-org.apache.yetus:audience-annotations:jar:0.13.0:compile
> and
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-common:test-jar:tests:3.4.0-SNAPSHOT:test
> +-org.apache.zookeeper:zookeeper:jar:3.6.4:compile
>   +-org.apache.yetus:audience-annotations:jar:0.13.0:compile
> and
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hbase:hbase-common:jar:2.2.4:compile
> +-org.apache.yetus:audience-annotations:jar:0.5.0:compile
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18867) Upgrade ZooKeeper to 3.6.4

2023-08-28 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18867:

Component/s: build

> Upgrade ZooKeeper to 3.6.4
> --
>
> Key: HADOOP-18867
> URL: https://issues.apache.org/jira/browse/HADOOP-18867
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Minor
>  Labels: pull-request-available
>
> While ZooKeeper 3.6 is already EOL, we can upgrade to the final release of 
> the ZooKeeper 3.6 line as a short-term fix until bumping to ZooKeeper 3.7 or 
> later. Dependency convergence error must be addressed on {{-Dhbase.profile=2.0}}.
> {noformat}
> $ mvn clean install -Dzookeeper.version=3.6.4 -Dhbase.profile=2.0 -DskipTests 
> clean install
> Dependency convergence error for 
> org.apache.yetus:audience-annotations:jar:0.13.0:compile paths to dependency 
> are:
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-common:test-jar:tests:3.4.0-SNAPSHOT:test
> +-org.apache.zookeeper:zookeeper:jar:3.6.4:compile
>   +-org.apache.zookeeper:zookeeper-jute:jar:3.6.4:compile
> +-org.apache.yetus:audience-annotations:jar:0.13.0:compile
> and
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hadoop:hadoop-common:test-jar:tests:3.4.0-SNAPSHOT:test
> +-org.apache.zookeeper:zookeeper:jar:3.6.4:compile
>   +-org.apache.yetus:audience-annotations:jar:0.13.0:compile
> and
> +-org.apache.hadoop:hadoop-yarn-server-timelineservice-hbase-common:jar:3.4.0-SNAPSHOT
>   +-org.apache.hbase:hbase-common:jar:2.2.4:compile
> +-org.apache.yetus:audience-annotations:jar:0.5.0:compile
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-27 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759383#comment-17759383
 ] 

Steve Loughran commented on HADOOP-18842:
-


Update: no, trying to be dynamic is just overcomplicated.

* always marshall the commit info to Writable.
* default in memory saves in ByteArrayOutputStream
* to disk saves to a file output stream.

The decision to use disk is made by a config option, and would only need 
enabling if scale problems were encountered. Use of the same marshalled format 
in both forms of storage ensures consistent code coverage and gives us efficient 
storage.
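
roughly (a hedged sketch, names illustrative - not the committer code):

{code:java}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

class CommitDataSketch {
  // default: marshal to memory; same record format as the disk path.
  static byte[] marshal(Writable record) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      record.write(out);
    }
    return buffer.toByteArray();
  }

  // when the config option enables spilling: same write(), different stream.
  static void spill(Writable record, File file) throws IOException {
    try (DataOutputStream out =
        new DataOutputStream(new FileOutputStream(file))) {
      record.write(out);
    }
  }
}
{code}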


> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before the execution, 
> and hence it becomes a challenge to clear off the destination directory.
>  
> One approach to handle this is that the underlying engines/client will clear off 
> all the destination directories before calling the commitJob operation, but 
> the issue with this approach is that, in case of failures while committing 
> the files, we might end up with the whole of the previous data being deleted, 
> making the recovery process difficult or time consuming.
>  
> *Solution*
> Based on the mode of commit operation, either *INSERT* or *OVERWRITE*, during 
> commitJob operations the committer will map each destination directory to 
> the commits which need to be added in the directory, and if the mode is 
> *OVERWRITE* the committer will delete the directory recursively and then 
> commit each of the files in the directory. So in case of failures (worst 
> case) the number of destination directories which will be deleted will be equal 
> to the number of threads if we do it in a multi-threaded way, as compared to the 
> whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-27 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759382#comment-17759382
 ] 

Steve Loughran commented on HADOOP-18797:
-

+[~srahman] -thoughts?

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to an 
> S3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related to HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory prior to completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18679) Add API for bulk/paged object deletion

2023-08-26 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759269#comment-17759269
 ] 

Steve Loughran commented on HADOOP-18679:
-

Done a first pass at an API in a PR: no actual time allocated to implement it 
- others very welcome to!

Minimal API of (basepath, RemoteIterator) for enumerating files; the caller 
gets to implement the iterator of their choice.

progress report callbacks allow for the operation to be aborted.

final outcome report lists files not deleted (would that scale? I've left out 
the list of deleted files for that reason), exception to raise, some numbers 
and any IOStats to return. 

https://github.com/steveloughran/hadoop/blob/s3/HADOOP-18679-bulk-delete-api/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BulkDelete.java
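
the shape of it, as a hedged sketch (see the linked file for the real thing; names here are illustrative):

{code:java}
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

interface BulkDeleteSketch {
  Outcome bulkDelete(Path base, RemoteIterator<Path> files, Progress progress)
      throws IOException;

  interface Progress {
    boolean deleted(Path path);        // return false to abort the operation
  }

  interface Outcome {
    long deleteCount();                // counters scale; full lists may not
    List<Path> undeleted();            // files not deleted, for the caller
  }
}
{code}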

> Add API for bulk/paged object deletion
> --
>
> Key: HADOOP-18679
> URL: https://issues.apache.org/jira/browse/HADOOP-18679
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> iceberg and hbase could benefit from being able to give a list of individual 
> files to delete - files which may be scattered round the bucket for better 
> read performance. 
> Add some new optional interface for an object store which allows a caller to 
> submit a list of paths to files to delete, where
> the expectation is
> * if a path is a file: delete
> * if a path is a dir, outcome undefined
> For s3 that'd let us build these into DeleteRequest objects, and submit, 
> without any probes first.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-25 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759123#comment-17759123
 ] 

Steve Loughran commented on HADOOP-18842:
-

ok, so you are proposing we split the output files by dest directory, for 
parallelised reading and better scale there.

good
* you can switch from memory storage to disk storage once some threshold is 
reached.
* many readers can read files independently
* if a job commit fails, more partitions are likely to be preserved or updated
* bad: lots of files to create and open
* bad: complexity when reading in the manifest of a task to determine which file 
to update.

I suppose a tactic would be to generate a map of (dir -> accumulator), and the 
accumulator is updated with the list of files from that TA. if the accumulator 
gets above a certain size, then the switch to saving to files kicks in. You 
could probably avoid the need for the cross-thread queue /async record write by 
just having whichever thread is trying to update the accumulator acquire a lock 
to it, then do the create (if needed), plus the record writes. 

Another thing to consider is: how efficient is the current SinglePendingCommit 
structure; we do use the file format as the record format, don't we? a more 
efficient design for any accumulator would be possible, wouldn't it? something 
like (path, uploadID, array[part-info]). 

in the manifest committer I hadn't worried about the preservation of dirs until 
commit; having a single file listing all commits was just a way to avoid 
running out of memory, relying on file buffering/caching to keep the cost of 
building the file low.

we did hit memory problems without it though. the big issue is on a spark 
driver with many active jobs: the memory requirement of multiple job commits 
going on at the same time was causing oom failures not seen with the older 
committer, even though the entry size for each file to commit was much smaller 
(src, dest path, etag).
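
sketching that tactic (hedged, names illustrative): threads lock only the accumulator for their destination directory, and an accumulator spills to a file once it crosses a size threshold.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CommitAccumulators {
  private final Map<String, Accumulator> byDir = new ConcurrentHashMap<>();
  private final int spillThreshold;

  CommitAccumulators(int spillThreshold) {
    this.spillThreshold = spillThreshold;
  }

  void add(String destDir, String commitRecord) {
    Accumulator acc = byDir.computeIfAbsent(destDir, d -> new Accumulator());
    synchronized (acc) {              // per-directory lock; no cross-thread queue
      acc.records.add(commitRecord);
      if (acc.records.size() >= spillThreshold) {
        acc.spillToFile();            // switch from memory to a file
      }
    }
  }

  static class Accumulator {
    final List<String> records = new ArrayList<>();
    void spillToFile() { /* write records out and clear the in-memory list */ }
  }
}
{code}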

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before the execution, 
> and hence it becomes a challenge to clear off the destination directory.
>  
> One approach to handle this is that the underlying engines/client will clear off 
> all the destination directories before calling the commitJob operation, but 
> the issue with this approach is that, in case of failures while committing 
> the files, we might end up with the whole of the previous data being deleted, 
> making the recovery process difficult or time consuming.
>  
> *Solution*
> Based on the mode of commit operation, either *INSERT* or *OVERWRITE*, during 
> commitJob operations the committer will map each destination directory to 
> the commits which need to be added in the directory, and if the mode is 
> *OVERWRITE* the committer will delete the directory recursively and then 
> commit each of the files in the directory. So in case of failures (worst 
> case) the number of destination directories which will be deleted will be equal 
> to the number of threads if we do it in a multi-threaded way, as compared to the 
> whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18797:

Affects Version/s: 3.3.6

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to an 
> s3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory prior to completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18797) S3A committer fix lost data on concurrent jobs

2023-08-25 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759118#comment-17759118
 ] 

Steve Loughran commented on HADOOP-18797:
-

bq. running multiple jobs writing into the same dir is always pretty risky

If they are generating new files with uuids in their names, and you want all 
jobs to add to the existing dataset, it should be safe.

> S3A committer fix lost data on concurrent jobs
> --
>
> Key: HADOOP-18797
> URL: https://issues.apache.org/jira/browse/HADOOP-18797
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Reporter: Emanuel Velzi
>Priority: Major
>
> There is a failure in the commit process when multiple jobs are writing to an 
> s3 directory *concurrently* using {*}magic committers{*}.
> This issue is closely related HADOOP-17318.
> When multiple Spark jobs write to the same S3A directory, they upload files 
> simultaneously using "__magic" as the base directory for staging. Inside this 
> directory, there are multiple "/job-some-uuid" directories, each representing 
> a concurrently running job.
> To fix some problems related to concurrency, a property was introduced in 
> the previous fix: "spark.hadoop.fs.s3a.committer.abort.pending.uploads". When 
> set to false, it ensures that during the cleanup stage, finalizing jobs do 
> not abort pending uploads from other jobs. So we see in logs this line: 
> {code:java}
> DEBUG [main] o.a.h.fs.s3a.commit.AbstractS3ACommitter (819): Not cleanup up 
> pending uploads to s3a ...{code}
> (from 
> [AbstractS3ACommitter.java#L952|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L952])
> However, in the next step, the {*}"__magic" directory is recursively 
> deleted{*}:
> {code:java}
> INFO  [main] o.a.h.fs.s3a.commit.magic.MagicS3GuardCommitter (98): Deleting 
> magic directory s3a://my-bucket/my-table/__magic: duration 0:00.560s {code}
> (from [AbstractS3ACommitter.java#L1112 
> |https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/AbstractS3ACommitter.java#L1112]and
>  
> [MagicS3GuardCommitter.java#L137)|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L137)]
> This deletion operation *affects the second job* that is still running 
> because it loses pending uploads (i.e., ".pendingset" and ".pending" files).
> The consequences can range from an exception in the best case to a silent 
> loss of data in the worst case. The latter occurs when Job_1 deletes files 
> just before Job_2 executes "listPendingUploadsToCommit" to list ".pendingset" 
> files in the job attempt directory prior to completing the uploads with POST 
> requests.
> To resolve this issue, it's important {*}to ensure that only the prefix 
> associated with the job currently finalizing is cleaned{*}.
> Here's a possible solution:
> {code:java}
> /**
>  * Delete the magic directory.
>  */
> public void cleanupStagingDirs() {
>   final Path out = getOutputPath();
>  //Path path = magicSubdir(getOutputPath());
>   Path path = new Path(magicSubdir(out), formatJobDir(getUUID()));
>   try(DurationInfo ignored = new DurationInfo(LOG, true,
>   "Deleting magic directory %s", path)) {
> Invoker.ignoreIOExceptions(LOG, "cleanup magic directory", 
> path.toString(),
> () -> deleteWithWarning(getDestFS(), path, true));
>   }
> } {code}
>  
> The side effect of this issue is that the "__magic" directory is never 
> cleaned up. However, I believe this is a minor concern, even considering that 
> other folders such as "_SUCCESS" also persist after jobs end.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18866) Refactor @Test(expected) with assertThrows

2023-08-25 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17759117#comment-17759117
 ] 

Steve Loughran commented on HADOOP-18866:
-

Because they are existing tests, they find regressions, and rewriting test code 
just because the existing style is out of fashion is hard to justify.

Why bother? It doesn't improve test coverage or diagnostics; get it wrong and 
either you have a false positive (test failure) or a false negative (missed 
regressions). It is stable code.

# New tests: no; there we'd want intercept() and assertj (the three styles are 
sketched below). Using assertj over junit5 asserts helps us to backport things 
to older branches without reworking the tests.
# As part of ongoing changes to existing tests: yes.
# A bulk replace of (expected = with intercept()? Well, we are always scared of 
big changes. Look at the commit history of moving to junit5.
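For illustration only, here is a minimal comparison of the three styles. It 
assumes a JUnit 4.13+ test class with an initialized {{fs}} field; none of 
this is code from the Hadoop test suite.

{code:java}
import static org.junit.Assert.assertThrows;

import java.io.FileNotFoundException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.test.LambdaTestUtils;
import org.junit.Test;

public class TestExpectedExceptionStyles {
  private FileSystem fs;  // assumed to be initialized in test setup

  // annotation style: passes if *any* line in the method throws FNFE,
  // so a failure earlier in the method can mask a missing regression
  @Test(expected = FileNotFoundException.class)
  public void testAnnotationStyle() throws Exception {
    fs.getFileStatus(new Path("/no-such-file"));
  }

  // assertThrows: scopes the expectation to a single lambda
  @Test
  public void testAssertThrows() {
    assertThrows(FileNotFoundException.class,
        () -> fs.getFileStatus(new Path("/no-such-file")));
  }

  // intercept(): scopes the expectation and returns the caught
  // exception so further assertions can be made on it
  @Test
  public void testIntercept() throws Exception {
    FileNotFoundException e = LambdaTestUtils.intercept(
        FileNotFoundException.class,
        () -> fs.getFileStatus(new Path("/no-such-file")));
    // e.getMessage() etc. can be asserted on here
  }
}
{code}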

> Refactor @Test(expected) with assertThrows
> --
>
> Key: HADOOP-18866
> URL: https://issues.apache.org/jira/browse/HADOOP-18866
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Taher Ghaleb
>Priority: Minor
>  Labels: pull-request-available
>
> I am working on research that investigates test smell refactoring in which we 
> identify alternative implementations of test cases, study how commonly used 
> these refactorings are, and assess how acceptable they are in practice.
> The smell occurs when exception handling can alternatively be implemented 
> using assertion rather than annotation: using {{assertThrows(Exception.class, 
> () -> \{...});}} instead of {{{}@Test(expected = Exception.class){}}}.
> While there are many cases like this, we aim in this pull request to get your 
> feedback on this particular test smell and its refactoring. Thanks in advance 
> for your input.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18842:
---

Assignee: Syed Shameerur Rahman

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, 
> and hence it becomes a challenge to clear off the destination directory.
>  
> One approach to handle this is for the underlying engines/client to clear off 
> all the destination directories before calling the commitJob operation, but 
> the issue with this approach is that, in case of failures while committing 
> the files, we might end up with the whole of the previous data being deleted, 
> making the recovery process difficult or time consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*, 
> during commitJob operations the committer will map each destination directory 
> to the commits which need to be added to the directory, and if the mode is 
> *OVERWRITE*, the committer will delete the directory recursively and then 
> commit each of the files in the directory. So in case of failures (worst 
> case) the number of destination directories which will be deleted will be 
> equal to the number of threads if we do it in a multi-threaded way, as 
> compared to the whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18842) Support Overwrite Directory On Commit For S3A Committers

2023-08-25 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18842:

Parent: HADOOP-18477
Issue Type: Sub-task  (was: New Feature)

> Support Overwrite Directory On Commit For S3A Committers
> 
>
> Key: HADOOP-18842
> URL: https://issues.apache.org/jira/browse/HADOOP-18842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to add a new kind of commit mechanism in which the destination 
> directory is cleared off before committing the file.
> *Use Case*
> In case of dynamicPartition insert overwrite queries, the destination 
> directories which need to be overwritten are not known before execution, 
> and hence it becomes a challenge to clear off the destination directory.
>  
> One approach to handle this is for the underlying engines/client to clear off 
> all the destination directories before calling the commitJob operation, but 
> the issue with this approach is that, in case of failures while committing 
> the files, we might end up with the whole of the previous data being deleted, 
> making the recovery process difficult or time consuming.
>  
> *Solution*
> Based on the mode of the commit operation, either *INSERT* or *OVERWRITE*, 
> during commitJob operations the committer will map each destination directory 
> to the commits which need to be added to the directory, and if the mode is 
> *OVERWRITE*, the committer will delete the directory recursively and then 
> commit each of the files in the directory. So in case of failures (worst 
> case) the number of destination directories which will be deleted will be 
> equal to the number of threads if we do it in a multi-threaded way, as 
> compared to the whole data if it was done on the engine side.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18818) Merge aws v2 upgrade feature branch into trunk

2023-08-24 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18818:
---

Assignee: Steve Loughran

> Merge aws v2 upgrade feature branch into trunk
> --
>
> Key: HADOOP-18818
> URL: https://issues.apache.org/jira/browse/HADOOP-18818
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>
> do the merge, with everything we need as a blocker for that marked as a 
> blocker of this task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18742) AWS v2 SDK: stabilise dependencies with rest of hadoop libraries

2023-08-24 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18742.
-
Resolution: Not A Problem

> AWS v2 SDK: stabilise dependencies with rest of hadoop libraries
> 
>
> Key: HADOOP-18742
> URL: https://issues.apache.org/jira/browse/HADOOP-18742
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
>  Labels: pull-request-available
>
>  aws v2 sdk dependencies need to 
> # be in sync with rest of hadoop
> # not include anything coming in from hadoop common
> # not export, in hadoop-cloud-storage, stuff from hadoop-common.
> Currently it is pulling in a version of jackson-cbor whose consistency with 
> hadoop's import is simply luck; joda-time is also there.
> In an ideal world all this should be shaded: we cannot have the AWS sdk 
> dictate what jackson version we ship with, given the history of downstream 
> problems there.
> {code}
> [INFO] +- com.amazonaws:aws-java-sdk-core:jar:1.12.316:compile
> [INFO] |  +- software.amazon.ion:ion-java:jar:1.0.2:compile
> [INFO] |  +- 
> com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:jar:2.12.7:compile
> [INFO] |  \- joda-time:joda-time:jar:2.8.1:compile
> [INFO] +- software.amazon.awssdk:bundle:jar:2.19.12:compile
> [INFO] |  \- software.amazon.eventstream:eventstream:jar:1.0.1:compile
> [INFO] +- software.amazon.awssdk.crt:aws-crt:jar:0.21.0:compile
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16842) abfs can't access storage account if soft delete is enabled

2023-08-24 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758456#comment-17758456
 ] 

Steve Loughran commented on HADOOP-16842:
-

You are asking questions about Azure Storage itself; the abfs client is simply 
the client to it. We only know what is announced. Some of the developers (not 
me) work at Microsoft and have a better idea of what's coming, but even there 
they are unlikely to preannounce anything.



> abfs can't access storage account if soft delete is enabled
> ---
>
> Key: HADOOP-16842
> URL: https://issues.apache.org/jira/browse/HADOOP-16842
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/azure
>Affects Versions: 3.1.3
>Reporter: Raghvendra Singh
>Priority: Minor
>  Labels: abfsactive
>
> Facing the issue in which if soft delete is enabled on storage account.
> Hadoop fs -ls command fails with 
> {noformat}
>  Operation failed: "This endpoint does not support BlobStorageEvents or 
> SoftDelete. Please disable these account features if you would like to use 
> this endpoint.", 409, HEAD, 
> https://.[dfs.core.windows.net/test-container-1//?upn=false=getAccessControl=90|http://dfs.core.windows.net/test-container-1//?upn=false=getAccessControl=90]
> {noformat}
> Trying to access storage account by issuing below command :
> {noformat}
>  hadoop fs 
> -Dfs.azure.account.auth.type..[dfs.core.windows.net|http://dfs.core.windows.net/]=OAuth
>  
> -Dfs.azure.account.oauth.provider.type..[dfs.core.windows.net|http://dfs.core.windows.net/]=org.apache.hadoop.fs.azurebfs.oauth2.MsiTokenProvider
>  -ls 
> [abfs://test-container-1]@.[dfs.core.windows.net/|http://dfs.core.windows.net/]
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758168#comment-17758168
 ] 

Steve Loughran commented on HADOOP-18860:
-

This is needed for HADOOP-17377; the initial PR there tried to upgrade mockito 
in hadoop-azure only, but this does it everywhere, as trying to mix dependency 
versions, even test ones, is doomed. And it is probably time for a mockito 
upgrade anyway. Lovely how method names have changed *for no obvious reason at 
all*.

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Issue Type: Improvement  (was: Task)

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Affects Version/s: 3.3.6

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Minor
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Component/s: build

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Task
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Minor
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18860) Upgrade mockito to 4.11.0

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18860:

Priority: Major  (was: Minor)

> Upgrade mockito to 4.11.0
> -
>
> Key: HADOOP-18860
> URL: https://issues.apache.org/jira/browse/HADOOP-18860
> Project: Hadoop Common
>  Issue Type: Task
>  Components: build
>Affects Versions: 3.3.6
>Reporter: Anmol Asrani
>Assignee: Anmol Asrani
>Priority: Major
>  Labels: pull-request-available
>
> Upgrading mockito in hadoop-project



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18863) AWS SDK V2 - AuditFailureExceptions aren't being translated properly

2023-08-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758094#comment-17758094
 ] 

Steve Loughran commented on HADOOP-18863:
-

Run with args -Dtest=moo -Dprefetch -Dscale 
-Dit.test=ITestS3AHugeFilesNoMultipart:
{code}
[ERROR] 
test_030_postCreationAssertions(org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesNoMultipart)
  Time elapsed: 139.8 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSClientIOException: 
copyFile(tests3ascale/disk/hugefile, tests3ascale/disk/hugefileRenamed) on 
tests3ascale/disk/hugefile: 
software.amazon.awssdk.core.exception.SdkClientException: Failed to initiate 
multipart upload: Failed to initiate multipart upload
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:194)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.copyFile(S3AFileSystem.java:4439)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.access$2300(S3AFileSystem.java:283)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem$OperationCallbacksImpl.copyFile(S3AFileSystem.java:2432)
at 
org.apache.hadoop.fs.s3a.impl.RenameOperation.copySource(RenameOperation.java:561)
at 
org.apache.hadoop.fs.s3a.impl.RenameOperation.renameFileToDest(RenameOperation.java:312)
at 
org.apache.hadoop.fs.s3a.impl.RenameOperation.execute(RenameOperation.java:266)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerRename(S3AFileSystem.java:2351)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$rename$8(S3AFileSystem.java:2202)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2623)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.rename(S3AFileSystem.java:2200)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesNoMultipart.lambda$test_030_postCreationAssertions$0(ITestS3AHugeFilesNoMultipart.java:108)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:498)
at 
org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AHugeFilesNoMultipart.test_030_postCreationAssertions(ITestS3AHugeFilesNoMultipart.java:107)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:568)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Failed to 
initiate multipart upload
at 
software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
at 
software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
at 
software.amazon.awssdk.services.s3.internal.multipart.GenericMultipartHelper.handleException(GenericMultipartHelper.java:65)
at 
software.amazon.awssdk.services.s3.internal.multipart.CopyObjectHelper.lambda$copyInParts$6(CopyObjectHelper.java:115)
at 

[jira] [Resolved] (HADOOP-18853) AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18853.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

> AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy
> ---
>
> Key: HADOOP-18853
> URL: https://issues.apache.org/jira/browse/HADOOP-18853
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> With 2.20.121, the TM has MPU functionality. Upgrading to the latest version 
> (2.20.28) will also solve the issue with needing to include the CRT 
> dependency. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18859) AWS SDK v2 typo in aws.evenstream.version

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18859.
-
Fix Version/s: 3.4.0
 Assignee: Ahmar Suhail
   Resolution: Duplicate

> AWS SDK v2 typo in aws.evenstream.version
> -
>
> Key: HADOOP-18859
> URL: https://issues.apache.org/jira/browse/HADOOP-18859
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Ahmar Suhail
>Priority: Trivial
> Fix For: 3.4.0
>
>
> the pom version property aws.evenstream.version should be 
> aws.eventstream.version



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18853) AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy

2023-08-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17758093#comment-17758093
 ] 

Steve Loughran commented on HADOOP-18853:
-

Scale tests show that the multipart copy exception wrapping isn't the same as 
before: HADOOP-18863.

Not blocking me merging to trunk, but it is a problem. And a reminder: do test 
with -Dscale.


> AWS SDK V2 - Upgrade SDK to 2.20.28 and restores multipart copy
> ---
>
> Key: HADOOP-18853
> URL: https://issues.apache.org/jira/browse/HADOOP-18853
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>  Labels: pull-request-available
>
> With 2.20.121, the TM has MPU functionality. Upgrading to the latest version 
> (2.20.28) will also solve the issue with needing to include the CRT 
> dependency. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18863) AWS SDK V2 - AuditFailureExceptions aren't being translated properly

2023-08-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18863:
---

 Summary: AWS SDK V2 - AuditFailureExceptions aren't being 
translated properly
 Key: HADOOP-18863
 URL: https://issues.apache.org/jira/browse/HADOOP-18863
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


{{ITestS3AHugeFilesNoMultipart}} is failing because the 
{{AuditFailureException}} variant raised in the sdk handler is being wrapped as 
it makes its way back to the s3a code, but S3AUtils.translateException() isn't 
looking at the inner cause.

It looks like the aws v2 sdk class {{.GenericMultipartHelper.handleException}} 
is wrapping an SdkException with an SdkClientException even though it is not 
needed.

We probably have to start looking at the inner cause of any exception during 
translation to see if that is also an AuditFailureException.
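For illustration, a minimal sketch of that cause-chain walk; the method name 
is made up, not actual S3AUtils code:

{code:java}
static <T extends Throwable> T findInCauseChain(
    Throwable thrown, Class<T> target) {
  // walk the chain of causes until the target type is found or the
  // chain runs out
  for (Throwable t = thrown; t != null; t = t.getCause()) {
    if (target.isInstance(t)) {
      return target.cast(t);
    }
  }
  return null;
}
{code}

Translation could then check {{findInCauseChain(e, 
AuditFailureException.class)}} before falling back to the generic wrapping.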



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18328) S3A supports S3 on Outposts

2023-08-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757964#comment-17757964
 ] 

Steve Loughran commented on HADOOP-18328:
-

The code fix is in trunk; once the doc PR is in I will cherry-pick both as a 
single patch into 3.3.

> S3A supports S3 on Outposts
> ---
>
> Key: HADOOP-18328
> URL: https://issues.apache.org/jira/browse/HADOOP-18328
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Sotetsu Suzugamine
>Assignee: Sotetsu Suzugamine
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the endpoint for using S3 accesspoint is set as 
> "s3-accesspoint.%s.amazonaws.com" as follows.
> [https://github.com/apache/hadoop/blob/3ec4b932c179d9ec6c4e465f25e35b3d7eded08b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/ArnResource.java#L29]
> However, "s3-outposts.%s.amazonaws.com" is the preferred endpoint when 
> accessing S3 on Outposts bucket by accesspoint.
> This ticket improves them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18328) S3A supports S3 on Outposts

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18328:

Fix Version/s: 3.4.0

> S3A supports S3 on Outposts
> ---
>
> Key: HADOOP-18328
> URL: https://issues.apache.org/jira/browse/HADOOP-18328
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Sotetsu Suzugamine
>Assignee: Sotetsu Suzugamine
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the endpoint for using S3 accesspoint is set as 
> "s3-accesspoint.%s.amazonaws.com" as follows.
> [https://github.com/apache/hadoop/blob/3ec4b932c179d9ec6c4e465f25e35b3d7eded08b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/ArnResource.java#L29]
> However, "s3-outposts.%s.amazonaws.com" is the preferred endpoint when 
> accessing S3 on Outposts bucket by accesspoint.
> This ticket improves them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18328) S3A supports S3 on Outposts

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HADOOP-18328:
---

Assignee: Sotetsu Suzugamine

> S3A supports S3 on Outposts
> ---
>
> Key: HADOOP-18328
> URL: https://issues.apache.org/jira/browse/HADOOP-18328
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Reporter: Sotetsu Suzugamine
>Assignee: Sotetsu Suzugamine
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the endpoint for using S3 accesspoint is set as 
> "s3-accesspoint.%s.amazonaws.com" as follows.
> [https://github.com/apache/hadoop/blob/3ec4b932c179d9ec6c4e465f25e35b3d7eded08b/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/ArnResource.java#L29]
> However, "s3-outposts.%s.amazonaws.com" is the preferred endpoint when 
> accessing S3 on Outposts bucket by accesspoint.
> This ticket improves them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14269) [JDK11] Create module-info.java for each module

2023-08-23 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757959#comment-17757959
 ] 

Steve Loughran commented on HADOOP-14269:
-

Going to add that multiversion jars are inevitably a nightmare for fielding 
support calls related to stack traces: you really don't want to do them.

> [JDK11] Create module-info.java for each module
> ---
>
> Key: HADOOP-14269
> URL: https://issues.apache.org/jira/browse/HADOOP-14269
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Akira Ajisaka
>Priority: Major
>
> module-info.java is required for the JDK9 Jigsaw feature.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-14269) [JDK11] Create module-info.java for each module

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-14269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-14269:

Summary: [JDK11] Create module-info.java for each module  (was: Create 
module-info.java for each module)

> [JDK11] Create module-info.java for each module
> ---
>
> Key: HADOOP-14269
> URL: https://issues.apache.org/jira/browse/HADOOP-14269
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Akira Ajisaka
>Priority: Major
>
> module-info.java is required for the JDK9 Jigsaw feature.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18862) [JDK17] MiniYarnClusters don't launch in hadoop-aws integration tests

2023-08-23 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18862:
---

 Summary: [JDK17] MiniYarnClusters don't launch in hadoop-aws 
integration tests
 Key: HADOOP-18862
 URL: https://issues.apache.org/jira/browse/HADOOP-18862
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


I've tried running the hadoop-aws tests under Java 17; everything which tries 
to launch a MiniYarnCluster fails because Google Guice is trying to reflect 
into the java.lang package of the java.base module, which is now forbidden:
{code}
Caused by: java.lang.ExceptionInInitializerError: Exception 
com.google.inject.internal.cglib.core.$CodeGenerationException: 
java.lang.reflect.InaccessibleObjectException-->Unable to make protected final 
java.lang.Class 
java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int,java.security.ProtectionDomain)
 throws java.lang.ClassFormatError accessible: module java.base does not "opens 
java.lang" to unnamed module @7ee7980d [in thread "Thread-109"]

{code}

The short-term fix is to add the params to the surefire and failsafe JVM 
launchers to allow access:

{code}
--add-opens java.base/java.lang=ALL-UNNAMED

{code}

I don't know if updating Guice will make it go away completely; if it doesn't, 
then the history server itself needs to be launched with this.

Rather than just add an option for hadoop-aws, we ought to consider a general 
cross-module variable for junit.jvm.options which is set everywhere; the base 
impl is "" and a Java profile could add the new stuff.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18741) AWS SDK v2 code tuning

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18741.
-
  Assignee: Steve Loughran
Resolution: Duplicate

> AWS SDK v2  code tuning
> ---
>
> Key: HADOOP-18741
> URL: https://issues.apache.org/jira/browse/HADOOP-18741
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> tuning of the v2 sdk code prior to merge;
> {code}
> * 
> hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/AwsCredentialListProvider.java
> L184 access denied exception. add test for this?
> AWSClientConfig
> TODO: Don't think you can set a socket factory for the netty client.
> cloudstore: add the new paths
> import software.amazon.awssdk.http.apache.ApacheHttpClient;
> import 
> software.amazon.awssdk.thirdparty.org.apache.http.conn.ssl.SSLConnectionSocketFactory;
>   
> oftware.amazon.awssdk.services.s3.model.HeadBucketResponse;
> hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/HeaderProcessing.java
> +add test for getHeaders(/) to see what comes back
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ABucketExistence.java
> L128 use explicit region constant rather than inline string
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AConfiguration.java
> L552: use intercept()
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java
> L75: just throw the exception again
> L87, L90, use constants
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AAWSCredentialsProvider.java
> L44 move o.a.h. imports into "real" hadoop block; include the sets one too
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AProxy.java
> is new ssl.proxy  setting consistent with what this pr does
> hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/auth/delegation/ITestSessionDelegationInFileystem.java
> L335 TODO open, getObjectMetadata("/")
> +cut 
> hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/InconsistentS3ClientFactory.java
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18819) AWS SDK v2 build complaints

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18819.
-
  Assignee: Steve Loughran
Resolution: Duplicate

> AWS SDK v2 build complaints
> ---
>
> Key: HADOOP-18819
> URL: https://issues.apache.org/jira/browse/HADOOP-18819
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> Rebase branches highlight spotbugs and javadoc issues, plus style.
> Nothing major, but it should be addressed before the merge, especially the 
> spotbugs one.
> {code}
> hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/ProgressableProgressListener.java:80:
>  warning: no @param for upload
> {code}
> and something that needs review, probably a spotbugs disable if we are happy 
> it's a false alarm
> {code}
> Code  Warning
> ISInconsistent synchronization of 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient; locked 60% of time
> Bug type IS2_INCONSISTENT_SYNC (click for details)
> In class org.apache.hadoop.fs.s3a.S3AFileSystem
> Field org.apache.hadoop.fs.s3a.S3AFileSystem.s3AsyncClient
> Synchronized 60% of the time
> Unsynchronized access at S3AFileSystem.java:[line 1764]
> Unsynchronized access at S3AFileSystem.java:[line 989]
> Synchronized access at S3AFileSystem.java:[line 4179]
> Synchronized access at S3AFileSystem.java:[line 4184]
> Synchronized access at S3AFileSystem.java:[line 1002]
> {code}
> {code}
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/impl/ActiveAuditManagerS3A.java:413:
> //  
> https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/core/interceptor/ExecutionInterceptor.html:
>  Line is longer than 100 characters (found 115). [LineLength]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:128:
>   private , ClientT> 
> BuilderT configureClientBuilder(: Line is longer than 100 characters (found 
> 109). [LineLength]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:24:public
>  interface AWSHeaders {: interfaces should describe a type and hence have 
> methods. [InterfaceIsType]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:46:
>   /** S3's version ID header */: First sentence should end with a period. 
> [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:49:
>   /** Header describing what class of storage a user wants */: First sentence 
> should end with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:52:
>   /** Header describing what archive tier the object is in, if any */: First 
> sentence should end with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:55:
>   /** Header for optional server-side encryption algorithm */: First sentence 
> should end with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:58:
>   /** Range header for the get object request */: First sentence should end 
> with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:68:
>   /** JSON-encoded description of encryption materials used during encryption 
> */: First sentence should end with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSHeaders.java:71:
>   /** Header for the optional restore information of an object */: First 
> sentence should end with a period. [JavadocStyle]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/InconsistentS3ClientFactory.java:68:
>FailureInjectionInterceptor(FailureInjectionPolicy policy) {: 'ctor def 
> modifier' has incorrect indentation level 3, expected level should be 4. 
> [Indentation]
> ./hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/select/BlockingEnumeration.java:57:
>   private final Signal END_SIGNAL = new Signal<>((Throwable)null);:27: 
> Name 'END_SIGNAL' must match pattern '^[a-z][a-zA-Z0-9]*$'. [MemberName]
> ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/AbstractS3AMockTest.java:57:
>   protected S3Client s3;:22: Variable 's3' must be private and have accessor 
> methods. [VisibilityModifier]
> ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/AbstractAuditingTest.java:28:import
>  java.util.function.Consumer;:8: Unused import - java.util.function.Consumer. 
> [UnusedImports]
> ./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/select/StreamPublisher.java:38:
>   public 

[jira] [Resolved] (HADOOP-18812) list AWS SDK v2 libraries in LICENSE-binary

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18812.
-
Fix Version/s: 3.4.0
   Resolution: Duplicate

> list AWS SDK v2 libraries in LICENSE-binary
> ---
>
> Key: HADOOP-18812
> URL: https://issues.apache.org/jira/browse/HADOOP-18812
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Blocker
> Fix For: 3.4.0
>
>
> LICENSE-binary needs to be updated to list all new jars, and remove all that 
> are gone.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18820) AWS SDK v2: make the v1 bridging support optional

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-18820:

Hadoop Flags: Incompatible change

> AWS SDK v2: make the v1 bridging support optional
> -
>
> Key: HADOOP-18820
> URL: https://issues.apache.org/jira/browse/HADOOP-18820
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The AWS SDK v2 code includes the v1 sdk core for plugin support of
> * existing credential providers
> * delegation token binding
> I propose we break #2 and rely on those who have implemented it to upgrade. 
> Apart from all the needless changes the v2 SDK did to the api (why?), this is 
> fairly straightforward.
> For #1: fix through reflection, retaining a v1 sdk dependency at test time so 
> we can verify that the binder works.
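For illustration, a hypothetical sketch of such a reflection-based bridge; 
the method name and constructor preference are made up:

{code:java}
import org.apache.hadoop.conf.Configuration;

static Object instantiateV1Provider(String className, Configuration conf)
    throws ReflectiveOperationException {
  Class<?> cl = Class.forName(className);
  try {
    // prefer a (Configuration) constructor when the provider has one
    return cl.getConstructor(Configuration.class).newInstance(conf);
  } catch (NoSuchMethodException e) {
    // fall back to the no-arg constructor
    return cl.getConstructor().newInstance();
  }
}
{code}

A v1 sdk test-time dependency then lets the test suite verify that a real v1 
provider still loads through this path.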



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18820) AWS SDK v2: make the v1 bridging support optional

2023-08-23 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18820.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged to both feature branches.

> AWS SDK v2: make the v1 bridging support optional
> -
>
> Key: HADOOP-18820
> URL: https://issues.apache.org/jira/browse/HADOOP-18820
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> The AWS SDK v2 code includes the v1 sdk core for plugin support of
> * existing credential providers
> * delegation token binding
> I propose we break #2 and rely on those who have implemented it to upgrade. 
> Apart from all the needless changes the v2 SDK did to the api (why?), this is 
> fairly straightforward.
> For #1: fix through reflection, retaining a v1 sdk dependency at test time so 
> we can verify that the binder works.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18859) AWS SDK v2 typo in aws.evenstream.version

2023-08-22 Thread Steve Loughran (Jira)
Steve Loughran created HADOOP-18859:
---

 Summary: AWS SDK v2 typo in aws.evenstream.version
 Key: HADOOP-18859
 URL: https://issues.apache.org/jira/browse/HADOOP-18859
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran


the pom version property aws.evenstream.version should be 
aws.eventstream.version



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO

2023-08-22 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757319#comment-17757319
 ] 

Steve Loughran commented on HADOOP-18852:
-

My unbuffer PR will pass down some of this, and the split start/end. We 
shouldn't bother prefetching past the end of a file split, should we?

> S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look 
> like random IO
> --
>
> Key: HADOOP-18852
> URL: https://issues.apache.org/jira/browse/HADOOP-18852
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with 
> separately.
> # all seeks are lazy; no fetching is kicked off after an open
> # the first read is treated as an out-of-order read, so it cancels any active 
> reads (I don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any readFully we should prefetch all blocks in the range requested
> * for other reads, we may want a bigger prefetch count than 1, depending on: 
> split start/end, file read policy (random, sequential, whole-file)
> * also, if a read is in a block other than the current one, but which is 
> already being fetched or cached, is this really an OOO read to the extent 
> that outstanding fetches should be cancelled?
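For illustration, a hypothetical prefetch-count policy covering those points; 
the numbers and policy names are made up:

{code:java}
static int prefetchCountFor(boolean readFully, String readPolicy,
    int blocksInRequestedRange) {
  if (readFully) {
    // a readFully declares its range up front: fetch all of it
    return blocksInRequestedRange;
  }
  switch (readPolicy) {
    case "random":
      return 1;  // minimise wasted bytes on scattered reads
    case "sequential":
    case "whole-file":
      return 4;  // read ahead more aggressively
    default:
      return 1;
  }
}
{code}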



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18858) Upgrade guava due to CVE

2023-08-22 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HADOOP-18858.
-
Resolution: Duplicate

Duplicate of HADOOP-18843. 

[~andrewkstory] anything you can do to help get that PR in would be wonderful. 

> Upgrade guava due to CVE
> 
>
> Key: HADOOP-18858
> URL: https://issues.apache.org/jira/browse/HADOOP-18858
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hadoop-thirdparty
>Affects Versions: 3.3.6
>Reporter: Andrew Story
>Priority: Major
>
> Update guava to 32.0.1 or higher due to CVE: 
> [https://nvd.nist.gov/vuln/detail/CVE-2023-2976]
> hadoop-shaded-guava 1.1.1 is currently using 30.1.1-jre per security scanning 
> tools



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18839) s3a client SSLException is raised after very long timeout "Unsupported or unrecognized SSL message"

2023-08-22 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757302#comment-17757302
 ] 

Steve Loughran commented on HADOOP-18839:
-

Also seen with a different third-party store.

Hypothesis: the s3a retry policy retries on unrecoverable SSL handshake 
problems. These don't surface against AWS S3 (except maybe with proxy 
problems), which is why it's rare. But it should be tractable.
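A minimal sketch of the fail-fast idea, assuming a boolean retry hook of this 
shape; this is not the actual S3ARetryPolicy code:

{code:java}
import javax.net.ssl.SSLHandshakeException;

static boolean retryable(Exception e) {
  // an SSL handshake failure will not fix itself on retry, so walk
  // the cause chain and give up immediately if one is found
  for (Throwable t = e; t != null; t = t.getCause()) {
    if (t instanceof SSLHandshakeException) {
      return false;
    }
  }
  return true;  // defer to the normal retry policy otherwise
}
{code}

The (truncated) stack trace below shows the handshake failure surfacing 
through the retry path.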

{code}
22/05/25 10:53:53 DEBUG conn.ClientConnectionManagerFactory:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
at com.amazonaws.http.conn.$Proxy9.connect(Unknown Source)
at 
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
at 
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at 
com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at 
com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at 
com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1343)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5453)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5400)
at 
com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1372)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getObjectMetadata$6(S3AFileSystem.java:2053)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:412)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:375)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:2043)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getObjectMetadata(S3AFileSystem.java:2019)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3260)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3172)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3040)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4297)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4277)
at org.apache.hadoop.fs.shell.PathData.expandAsGlob(PathData.java:353)
at org.apache.hadoop.fs.shell.Command.expandArgument(Command.java:250)
at org.apache.hadoop.fs.shell.Command.expandArguments(Command.java:233)
at 
org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:104)
at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
Caused by: javax.net.ssl.SSLHandshakeException: 
sun.security.validator.ValidatorException: PKIX path building failed: 
