[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-05-15 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846806#comment-17846806
 ] 

Viraj Jasani commented on HADOOP-19148:
---

The build is fine and the dependency tree looks good (except that the transitive 
zookeeper-jute version comes in as 3.6.2 instead of 3.8.4). Let me create a PR to 
run the whole build with tests.
{code:java}
[INFO] +- org.apache.solr:solr-solrj:jar:8.11.3:compile
[INFO] |  +- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:compile
[INFO] |  +- commons-io:commons-io:jar:2.14.0:compile
[INFO] |  +- commons-lang:commons-lang:jar:2.6:compile
[INFO] |  +- io.netty:netty-buffer:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-codec:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-common:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-handler:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-resolver:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-transport:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-transport-native-epoll:jar:4.1.100.Final:compile
[INFO] |  +- io.netty:netty-transport-native-unix-common:jar:4.1.100.Final:compile
[INFO] |  +- org.apache.commons:commons-math3:jar:3.6.1:compile
[INFO] |  +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile
[INFO] |  +- org.apache.httpcomponents:httpcore:jar:4.4.13:compile
[INFO] |  +- org.apache.httpcomponents:httpmime:jar:4.5.13:compile
[INFO] |  +- org.apache.zookeeper:zookeeper:jar:3.8.4:compile
[INFO] |  +- org.apache.zookeeper:zookeeper-jute:jar:3.6.2:compile
[INFO] |  +- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile
...
... {code}

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ---
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Brahma Reddy Battula
>Assignee: Viraj Jasani
>Priority: Major
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298






[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298

2024-05-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846458#comment-17846458
 ] 

Viraj Jasani commented on HADOOP-19148:
---

[~brahmareddy] Is anyone picking this up? If not, shall I create the PR?

> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
> ---
>
> Key: HADOOP-19148
> URL: https://issues.apache.org/jira/browse/HADOOP-19148
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Brahma Reddy Battula
>Priority: Major
>
> Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298






[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19146:
--
Component/s: test

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
>    

[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19146:
-

 Summary: noaa-cors-pds bucket access with global endpoint fails
 Key: HADOOP-19146
 URL: https://issues.apache.org/jira/browse/HADOOP-19146
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Affects Versions: 3.4.0
Reporter: Viraj Jasani


All tests accessing noaa-cors-pds use the us-east-1 region, as configured at the 
bucket level. If a global endpoint is configured (e.g. us-west-2), they fail to 
access the bucket.
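
One way to keep these tests working when a different global region is configured 
would be a per-bucket override in the test configuration. A minimal sketch only, 
assuming S3A's documented fs.s3a.bucket.BUCKET.* per-bucket override mechanism; 
this is not the actual fix for this issue:
{code:java}
// Sketch: pin the public noaa-cors-pds bucket to us-east-1 via an S3A per-bucket
// override, so a global fs.s3a.endpoint.region (e.g. us-west-2) does not trigger
// the 301 redirect for this bucket.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NoaaRegionOverride {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint.region", "us-west-2");                      // global setting
    conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1"); // bucket-level override
    Path root = new Path("s3a://noaa-cors-pds/");
    try (FileSystem fs = root.getFileSystem(conf)) {
      System.out.println(fs.getFileStatus(root));
    }
  }
}
{code}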

 

Sample error:
{code:java}
org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
response to region [us-east-1].  This likely indicates that the S3 region 
configured in fs.s3a.endpoint.region does not match the AWS region containing 
the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, 
Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
    at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
    at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
    at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
    at 
org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
    at 
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
    at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
 {code}
{code:java}
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: 
S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 
6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
    at 
software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43)
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93)
    at 
software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279)
    ...
    ...
    ...
    at 
software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)

[jira] [Assigned] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails

2024-04-11 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19146:
-

Assignee: Viraj Jasani

> noaa-cors-pds bucket access with global endpoint fails
> --
>
> Key: HADOOP-19146
> URL: https://issues.apache.org/jira/browse/HADOOP-19146
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> All tests accessing noaa-cors-pds use the us-east-1 region, as configured at 
> the bucket level. If a global endpoint is configured (e.g. us-west-2), they 
> fail to access the bucket.
>  
> Sample error:
> {code:java}
> org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect 
> response to region [us-east-1].  This likely indicates that the S3 region 
> configured in fs.s3a.endpoint.region does not match the AWS region containing 
> the bucket.: null (Service: S3, Status Code: 301, Request ID: 
> PMRWMQC9S91CNEJR, Extended Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253)
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115)
>     at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at 
> org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735)
>     at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281)
>     at 
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328)
>     at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
>     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
>  {code}
> {code:java}
> Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null 
> (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended 
> Request ID: 
> 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108)
>     at 
> software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85)
>     at 
> 

[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-03-13 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825912#comment-17825912
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Addendum PR: [https://github.com/apache/hadoop/pull/6624]

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.






[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2024-02-11 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816513#comment-17816513
 ] 

Viraj Jasani commented on HADOOP-18980:
---

Addressed edge cases with addendum PR: 
[https://github.com/apache/hadoop/pull/6546]

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.5.0, 3.4.1
>
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 
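
As a rough illustration of that proposal: the property name and provider classes 
below are hypothetical, just to show what a key=value remapping list could look 
like and how it might be parsed into the existing remap table:
{code:java}
// Hypothetical sketch of an extensible provider remapping: a key=value list in a
// configuration property is parsed and merged into a map of
// "old provider class -> replacement provider class".
import org.apache.hadoop.conf.Configuration;
import java.util.HashMap;
import java.util.Map;

public class ProviderRemapSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // illustrative property name and class names only
    conf.set("fs.s3a.aws.credentials.provider.mapping",
        "com.example.OldProvider=com.example.v2.NewProvider,"
        + "com.example.ThirdPartyProvider=com.example.v2.ThirdPartyProvider");

    Map<String, String> remap = new HashMap<>();
    for (String pair : conf.getTrimmedStrings("fs.s3a.aws.credentials.provider.mapping")) {
      String[] kv = pair.split("=", 2);
      if (kv.length == 2) {
        remap.put(kv[0].trim(), kv[1].trim());  // old class -> replacement class
      }
    }
    System.out.println(remap);
  }
}
{code}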






[jira] [Updated] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-08 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19066:
--
Status: Patch Available  (was: In Progress)

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.






[jira] [Assigned] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"

2024-02-08 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19072:
-

Assignee: Viraj Jasani

> S3A: expand optimisations on stores with "fs.s3a.create.performance"
> 
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location






[jira] [Commented] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"

2024-02-08 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815822#comment-17815822
 ] 

Viraj Jasani commented on HADOOP-19072:
---

The improvement makes sense, as long as the downstream user knows where they are 
creating the dir.
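
For context, a minimal sketch of what using the option looks like from the client 
side. fs.s3a.create.performance is the existing option; the mkdirs behaviour 
described in the issue is the proposal, not current behaviour, and the bucket/path 
below are hypothetical:
{code:java}
// Sketch: enable the "create performance" optimisations and create a directory.
// With the proposed change, mkdirs would skip the parent-directory checks and
// only issue a HEAD for a file at the target path.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreatePerformanceMkdir {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("fs.s3a.create.performance", true);    // caller accepts the weaker checks
    Path dir = new Path("s3a://my-bucket/output/run-001");  // hypothetical location
    try (FileSystem fs = dir.getFileSystem(conf)) {
      fs.mkdirs(dir);
    }
  }
}
{code}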

> S3A: expand optimisations on stores with "fs.s3a.create.performance"
> 
>
> Key: HADOOP-19072
> URL: https://issues.apache.org/jira/browse/HADOOP-19072
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Major
>
> on an s3a store with fs.s3a.create.performance set, speed up other operations
> *  mkdir to skip parent directory check: just do a HEAD to see if there's a 
> file at the target location






[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-05 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814576#comment-17814576
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Indeed! Hopefully some final stabilization work.

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.






[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814171#comment-17814171
 ] 

Viraj Jasani commented on HADOOP-19066:
---

Will run the whole suite with FIPS support + central endpoint.

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.






[jira] [Assigned] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19066:
-

Assignee: Viraj Jasani

> AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
> --
>
> Key: HADOOP-19066
> URL: https://issues.apache.org/jira/browse/HADOOP-19066
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.5.0, 3.4.1
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
> considers overriding endpoint and enabling fips as mutually exclusive, we 
> fail fast if fs.s3a.endpoint is set with fips support (details on 
> HADOOP-18975).
> Now, we no longer override SDK endpoint for central endpoint since we enable 
> cross region access (details on HADOOP-19044) but we would still fail fast if 
> endpoint is central and fips is enabled.
> Changes proposed:
>  * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
> configured.
>  * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
> cross region access (expected with central endpoint).
>  * Document FIPS support with central endpoint on connecting.html.






[jira] [Created] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint

2024-02-04 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19066:
-

 Summary: AWS SDK V2 - Enabling FIPS should be allowed with central 
endpoint
 Key: HADOOP-19066
 URL: https://issues.apache.org/jira/browse/HADOOP-19066
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.5.0, 3.4.1
Reporter: Viraj Jasani


FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK 
considers overriding endpoint and enabling fips as mutually exclusive, we fail 
fast if fs.s3a.endpoint is set with fips support (details on HADOOP-18975).

Now, we no longer override SDK endpoint for central endpoint since we enable 
cross region access (details on HADOOP-19044) but we would still fail fast if 
endpoint is central and fips is enabled.

Changes proposed:
 * S3A to fail fast only if FIPS is enabled and non-central endpoint is 
configured.
 * Tests to ensure S3 bucket is accessible with default region us-east-2 with 
cross region access (expected with central endpoint).
 * Document FIPS support with central endpoint on connecting.html.
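
A minimal sketch of the configuration this change is meant to allow, using the 
option names above and a hypothetical bucket; with a non-central fs.s3a.endpoint 
the client creation would still be expected to fail fast:
{code:java}
// Sketch: FIPS together with the central endpoint / cross-region access.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FipsCentralEndpoint {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("fs.s3a.endpoint.fips", true);      // enable FIPS endpoints
    conf.set("fs.s3a.endpoint", "s3.amazonaws.com");     // central endpoint: should be allowed
    // conf.set("fs.s3a.endpoint", "s3.eu-west-1.amazonaws.com"); // non-central: expected to fail fast
    Path root = new Path("s3a://my-bucket/");             // hypothetical bucket
    try (FileSystem fs = root.getFileSystem(conf)) {
      System.out.println(fs.getFileStatus(root));
    }
  }
}
{code}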






[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-29 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812142#comment-17812142
 ] 

Viraj Jasani commented on HADOOP-19022:
---

It's fine [~ste...@apache.org], I anyway need to make some changes to update the 
cross-region logic, so I can take care of that, fix the timeout value for the 
current test (only if still required after your PR 
[https://github.com/apache/hadoop/pull/6470]), and then add some more coverage.

Once your PR gets merged and the cross-region logic part is also done, I will 
re-run this with different endpoint/region settings; if needed, I will take care 
of the ITestS3AConfiguration issues as part of this Jira, otherwise I will close 
the Jira.

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}






[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809636#comment-17809636
 ] 

Viraj Jasani commented on HADOOP-18975:
---

{quote}you must have set a global endpoint, rather than one for your test 
bucket -correct?
{quote}
Exactly.

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0, 3.4.1
>
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test






[jira] [Comment Edited] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809271#comment-17809271
 ] 

Viraj Jasani edited comment on HADOOP-18975 at 1/22/24 7:33 AM:


{code:java}
<property>
  <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name>
  <value>true</value>
  <description>Use the fips endpoint</description>
</property>
{code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml?

This is breaking several tests from the full suite that I am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}
 

[ERROR] Tests run: 1264, Failures: 4, Errors: 87, Skipped: 164
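
For reference, a minimal reproduction of the conflict, using the option names 
above; the fail-fast itself happens in DefaultS3ClientFactory, per the stack trace:
{code:java}
// Sketch: a global test endpoint plus the per-bucket landsat-pds FIPS flag trips
// the "An endpoint cannot set when fs.s3a.endpoint.fips is true" precondition
// when the landsat-pds filesystem is initialised.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FipsEndpointConflict {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.endpoint", "https://s3-us-west-2.amazonaws.com");  // global test endpoint
    conf.setBoolean("fs.s3a.bucket.landsat-pds.endpoint.fips", true);   // from test core-site.xml
    Path root = new Path("s3a://landsat-pds/");
    // Expected to throw IllegalArgumentException during client creation:
    try (FileSystem fs = root.getFileSystem(conf)) {
      fs.getFileStatus(root);
    }
  }
}
{code}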


was (Author: vjasani):
 
{code:java}
<property>
  <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name>
  <value>true</value>
  <description>Use the fips endpoint</description>
</property>
{code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml?

This is breaking several tests from the full suite that I am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test





[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints

2024-01-21 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809271#comment-17809271
 ] 

Viraj Jasani commented on HADOOP-18975:
---

 
{code:java}
<property>
  <name>fs.s3a.bucket.landsat-pds.endpoint.fips</name>
  <value>true</value>
  <description>Use the fips endpoint</description>
</property>
{code}
[~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in 
hadoop-tools/hadoop-aws/src/test/resources/core-site.xml?

This is breaking several tests from the full suite that I am running against 
us-west-2 for PR [https://github.com/apache/hadoop/pull/6479]

e.g.
{code:java}
[ERROR] 
testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select)
  Time elapsed: 2.917 s  <<< ERROR!
java.lang.IllegalArgumentException: An endpoint cannot set when 
fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com
at 
org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179)
at 
org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171)
at 
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366)
at 
org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304)
at 
org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) 
{code}

> AWS SDK v2:  extend support for FIPS endpoints
> --
>
> Key: HADOOP-18975
> URL: https://issues.apache.org/jira/browse/HADOOP-18975
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> v1 SDK supported FIPS just by changing the endpoint.
> Now we have a new builder setting to use.
> * add new  fs.s3a.endpoint.fips option
> * pass it down
> * test






[jira] [Assigned] (HADOOP-19044) AWS SDK V2 - Update S3A region logic

2024-01-20 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-19044:
-

Assignee: Viraj Jasani

> AWS SDK V2 - Update S3A region logic 
> -
>
> Key: HADOOP-19044
> URL: https://issues.apache.org/jira/browse/HADOOP-19044
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>
> If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set 
> fs.s3a.endpoint to 
> s3.amazonaws.com here:
> [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
>  
>  
> HADOOP-18908 updated the region logic such that if fs.s3a.endpoint.region is 
> set, or if a region can be parsed from fs.s3a.endpoint (which will happen in 
> this case, where the region will be US_EAST_1), cross-region access is not 
> enabled. This will cause 400 errors if the bucket is not in US_EAST_1.
>  
> Proposed: update the logic so that if the endpoint is the global 
> s3.amazonaws.com, cross-region access is enabled.
>  
>  
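
Not the actual S3A code, just a sketch of the proposed decision; the method and 
variable names here are illustrative:
{code:java}
// Illustrative sketch of the proposed region logic: treat the global
// s3.amazonaws.com endpoint like "no endpoint" and enable cross-region access,
// instead of parsing US_EAST_1 out of it and pinning the client to that region.
public final class RegionLogicSketch {
  private static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  static boolean enableCrossRegionAccess(String endpoint, String region) {
    if (region != null && !region.isEmpty()) {
      return false;                 // explicit fs.s3a.endpoint.region wins
    }
    if (endpoint == null || endpoint.isEmpty()) {
      return true;                  // nothing configured: let the SDK resolve the region
    }
    // proposed change: the central endpoint no longer disables cross-region access
    return endpoint.endsWith(CENTRAL_ENDPOINT);
  }

  public static void main(String[] args) {
    System.out.println(enableCrossRegionAccess("s3.amazonaws.com", ""));          // true after the change
    System.out.println(enableCrossRegionAccess("s3.eu-west-1.amazonaws.com", "")); // false
  }
}
{code}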






[jira] [Commented] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804132#comment-17804132
 ] 

Viraj Jasani commented on HADOOP-19023:
---

{quote} * make sure you've not got a site config with an aggressive 
timeout{quote}
Can confirm that this is not the case.
{quote} * do set version/component in the issue fields...it's not picked up 
from the parent{quote}
Sure, will keep this in mind.

 

While HADOOP-19022 has a test failure that is consistent, this one, 
testParallelRename, is an intermittent failure. It happened only when I ran the 
whole suite (-Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch), while 
the setup was connected to a VPN.

Running the test individually does not fail. Since testParallelRename is 
already aggressive, I think we might want to set a higher connection timeout for 
the test.
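
Something along these lines in the test setup, for example; the property name is 
the one discussed in HADOOP-19022, and the value is just a guess, not a tuned 
number:
{code:java}
// Sketch: give the scale test a more generous request timeout than the 15s floor.
// Typically this would go into the test's createConfiguration() override.
import org.apache.hadoop.conf.Configuration;

public class ParallelRenameTimeoutSketch {
  static Configuration createConfiguration(Configuration base) {
    Configuration conf = new Configuration(base);
    conf.set("fs.s3a.connection.request.timeout", "5m");  // illustrative value
    return conf;
  }
}
{code}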

> S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> ---
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> 

[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-07 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804129#comment-17804129
 ] 

Viraj Jasani commented on HADOOP-19022:
---

 
{quote}have you explicitly set it in your site config?
{quote}
I can confirm that it is not set explicitly. The test fails consistently because 
it takes 120 as 120 ms by default, and since that is less than 15 s, 15 s is 
selected:

 
{code:java}
apiCallTimeout = enforceMinimumDuration(REQUEST_TIMEOUT,
apiCallTimeout, minimumOperationDuration); {code}
Here, minimumOperationDuration is 15s.

 

 

For this Jira, we can
 # Make the test use "120s" instead of "120" so that the value does not get reset 
to 15s.
 # Add a test with a timeout value smaller than 15s and verify that the actual 
timeout in the S3A client config object is 15s.
 # Add a test that sets "0" as the timeout and verify that 
SdkClientOption.API_CALL_ATTEMPT_TIMEOUT does not even get set.
 # Document "fs.s3a.connection.request.timeout" as being raised to 15s if any 
client sets it to a value > 0 and < 15s.

WDYT?
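
A sketch of point 1, using Hadoop's Configuration duration parsing; the 15s bump 
itself happens later in the S3A client setup, per the enforceMinimumDuration code 
above:
{code:java}
// Sketch: "120" is parsed as 120 ms and later raised to the 15s minimum by S3A,
// while a suffixed "120s" stays above the floor and is honoured as 120000 ms.
import org.apache.hadoop.conf.Configuration;
import java.util.concurrent.TimeUnit;

public class RequestTimeoutSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    conf.set("fs.s3a.connection.request.timeout", "120");   // bare number: milliseconds
    System.out.println(conf.getTimeDuration(
        "fs.s3a.connection.request.timeout", 0, TimeUnit.MILLISECONDS));  // 120 (pre-enforcement)

    conf.set("fs.s3a.connection.request.timeout", "120s");  // 120 seconds, above the 15s floor
    System.out.println(conf.getTimeDuration(
        "fs.s3a.connection.request.timeout", 0, TimeUnit.MILLISECONDS));  // 120000
  }
}
{code}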

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}






[jira] [Updated] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19023:
--
Component/s: test

> ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> -
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> 

[jira] [Updated] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19022:
--
Summary: S3A : ITestS3AConfiguration#testRequestTimeout failure  (was: 
ITestS3AConfiguration#testRequestTimeout failure)

> S3A : ITestS3AConfiguration#testRequestTimeout failure
> --
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-07 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-19023:
--
Summary: S3A : ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure  (was: ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure)

> S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
> ---
>
> Key: HADOOP-19023
> URL: https://issues.apache.org/jira/browse/HADOOP-19023
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Viraj Jasani
>Priority: Major
>
> Need to configure higher timeout for the test.
>  
> {code:java}
> [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 256.281 s <<< FAILURE! - in 
> org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
> [ERROR] 
> testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
> Time elapsed: 72.565 s  <<< ERROR!
> org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
> fork-0005/test/testParallelRename-source0: 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client 
> execution did not complete before the specified timeout configuration: 15000 
> millis
>   at 
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>   at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
>   at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
>   at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
>   at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
>   at 
> org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
>   at 
> org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
>   at 
> org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
> Client execution did not complete before the specified timeout configuration: 
> 15000 millis
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
>   at 
> software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
>   at 
> software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
>   at 
> 

[jira] [Updated] (HADOOP-18980) S3A credential provider remapping: make extensible

2024-01-04 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18980:
--
Status: Patch Available  (was: In Progress)

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2024-01-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803401#comment-17803401
 ] 

Viraj Jasani commented on HADOOP-18959:
---

[~slfan1989] this is already committed to trunk; only the backport PR is pending 
merge.

> Use builder for prefetch CachingBlockManager
> 
>
> Key: HADOOP-18959
> URL: https://issues.apache.org/jira/browse/HADOOP-18959
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) 
> have added more params for prefetch CachingBlockManager c'tor to process 
> read/write block requests. They have added too many params and more are 
> likely to be introduced later. We should use builder pattern to pass params.
> This would also help consolidating required prefetch params into one single 
> place within S3ACachingInputStream, from scattered locations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19023:
-

 Summary: ITestS3AConcurrentOps#testParallelRename intermittent 
timeout failure
 Key: HADOOP-19023
 URL: https://issues.apache.org/jira/browse/HADOOP-19023
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


We need to configure a higher timeout for the test.
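
One hedged way to raise it in the test configuration (the property is in 
milliseconds per HADOOP-19022; the value chosen here is an assumption, not the 
final fix):
{code:java}
// Hedged sketch: raising the S3A request timeout for this scale test.
import org.apache.hadoop.conf.Configuration;

public class HigherRequestTimeoutExample {
  static Configuration withHigherRequestTimeout(Configuration conf) {
    // fs.s3a.connection.request.timeout is interpreted in milliseconds
    conf.setLong("fs.s3a.connection.request.timeout", 120_000L);
    return conf;
  }
}
{code}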

 
{code:java}
[ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 256.281 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
[ERROR] 
testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  Time 
elapsed: 72.565 s  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on 
fork-0005/test/testParallelRename-source0: 
software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution 
did not complete before the specified timeout configuration: 15000 millis
at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376)
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372)
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214)
at 
org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: 
Client execution did not complete before the specified timeout configuration: 
15000 millis
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97)
at 
software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at 
software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224)
at 

[jira] [Commented] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure

2024-01-03 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802395#comment-17802395
 ] 

Viraj Jasani commented on HADOOP-19022:
---

It's a small test, but perhaps good to cover both cases: timeouts of more than 
15s and less than 15s.
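
A minimal, self-contained sketch of the enforcement both cases would exercise 
(the constant and method names below are illustrative assumptions, not the S3A 
internals):
{code:java}
import java.time.Duration;

public class RequestTimeoutExample {
  // assumed stand-in for the 15s minimum network operation duration
  static final Duration MINIMUM_NETWORK_OPERATION_DURATION = Duration.ofSeconds(15);

  // values below the minimum are raised, which is why a configured 120 ms
  // shows up as 15000 ms in the failing assertion
  static Duration enforceMinimum(Duration configured) {
    return configured.compareTo(MINIMUM_NETWORK_OPERATION_DURATION) < 0
        ? MINIMUM_NETWORK_OPERATION_DURATION
        : configured;
  }

  public static void main(String[] args) {
    System.out.println(enforceMinimum(Duration.ofMillis(120)));  // PT15S
    System.out.println(enforceMinimum(Duration.ofSeconds(20)));  // PT20S
  }
}
{code}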

> ITestS3AConfiguration#testRequestTimeout failure
> 
>
> Key: HADOOP-19022
> URL: https://issues.apache.org/jira/browse/HADOOP-19022
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Priority: Minor
>
> "fs.s3a.connection.request.timeout" should be specified in milliseconds as per
> {code:java}
> Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
> DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
> {code}
> The test fails consistently because it sets 120 ms timeout which is less than 
> 15s (min network operation duration), and hence gets reset to 15000 ms based 
> on the enforcement.
>  
> {code:java}
> [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
> Time elapsed: 0.016 s  <<< FAILURE!
> java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
> different than what AWS sdk configuration uses internally expected:<12> 
> but was:<15000>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at 
> org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
>  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure

2024-01-03 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-19022:
-

 Summary: ITestS3AConfiguration#testRequestTimeout failure
 Key: HADOOP-19022
 URL: https://issues.apache.org/jira/browse/HADOOP-19022
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


"fs.s3a.connection.request.timeout" should be specified in milliseconds as per
{code:java}
Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT,
DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); 
{code}
The test fails consistently because it sets a 120 ms timeout, which is less than 
15s (the minimum network operation duration) and hence gets reset to 15000 ms by 
the enforcement.

 
{code:java}
[ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration)  
Time elapsed: 0.016 s  <<< FAILURE!
java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is 
different than what AWS sdk configuration uses internally expected:<12> but 
was:<15000>
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failNotEquals(Assert.java:835)
at org.junit.Assert.assertEquals(Assert.java:647)
at 
org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444)
 {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3

2023-11-28 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790816#comment-17790816
 ] 

Viraj Jasani commented on HADOOP-18991:
---

As per HADOOP-16542, if we remove this, the Hive build fails. Could Hive 
explicitly use commons-beanutils directly?

FYI [~weichiu] 

> Remove commons-beanutils dependency from Hadoop 3
> -
>
> Key: HADOOP-18991
> URL: https://issues.apache.org/jira/browse/HADOOP-18991
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Istvan Toth
>Priority: Major
>
> Hadoop doesn't actually use it, and it pollutes the classpath of dependent 
> projects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3

2023-11-28 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790788#comment-17790788
 ] 

Viraj Jasani commented on HADOOP-18991:
---

[~stoty] is this the reason for managing it in Phoenix even after excluding it 
from Omid?

> Remove commons-beanutils dependency from Hadoop 3
> -
>
> Key: HADOOP-18991
> URL: https://issues.apache.org/jira/browse/HADOOP-18991
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common
>Reporter: Istvan Toth
>Priority: Major
>
> Hadoop doesn't actually use it, and it pollutes the classpath of dependent 
> projects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788128#comment-17788128
 ] 

Viraj Jasani commented on HADOOP-18980:
---

{quote}exactly; though i'd expect the remapping to be from com.amazonaws to 
software.amazonaws or private implementations

key goal: you can use the same credentials.provider list for v1 and v2 sdk 
clients.
{quote}
In addition to having the same credentials.provider list for the v1 and v2 SDKs, 
maybe we can also remove the static mapping for v1 to v2 credential providers and 
let the new config have default key-value pairs:

 
{code:java}

<property>
  <name>fs.s3a.aws.credentials.provider.mapping</name>
  <value>
    com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
    com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
    com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  </value>
</property>
 {code}
 

With this as the default value, any new third-party credential provider can be 
added to this list by users. Does that sound good?
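
For illustration, a hedged sketch of how such a key=value list could be parsed 
into the provider map (the parsing helper below is an assumption for 
illustration, not the final API):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class ProviderMappingExample {
  // parse "v1Class=replacementClass,..." into a map
  static Map<String, String> parseMapping(String raw) {
    Map<String, String> mapping = new HashMap<>();
    if (raw == null || raw.isEmpty()) {
      return mapping;
    }
    for (String entry : raw.split(",")) {
      String[] kv = entry.trim().split("=", 2);
      if (kv.length == 2) {
        mapping.put(kv[0].trim(), kv[1].trim());
      }
    }
    return mapping;
  }

  public static void main(String[] args) {
    String raw = "com.amazonaws.auth.AnonymousAWSCredentials="
        + "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider";
    System.out.println(parseMapping(raw));
  }
}
{code}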

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788128#comment-17788128
 ] 

Viraj Jasani edited comment on HADOOP-18980 at 11/20/23 6:44 PM:
-

In addition to having the same credentials.provider list for the v1 and v2 SDKs, 
maybe we can also remove the static mapping for v1 to v2 credential providers and 
let the new config have default key-value pairs:
{code:java}

<property>
  <name>fs.s3a.aws.credentials.provider.mapping</name>
  <value>
    com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
    com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
    com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  </value>
</property>
 {code}
With this as the default value, any new third-party credential provider can be 
added to this list by users. Does that sound good?


was (Author: vjasani):
{quote}exactly; though i'd expect the remapping to be from com.amazonaws to 
software.amazonaws or private implementations

key goal: you can use the same credentials.provider list for v1 and v2 sdk 
clients.
{quote}
In addition to having same credentials.provider list for v1 and v2 sdk, maybe 
we can also remove static mapping for v1 to v2 credential providers and let new 
config have default key value pairs:

 
{code:java}

<property>
  <name>fs.s3a.aws.credentials.provider.mapping</name>
  <value>
    com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider,
    com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
    com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider
  </value>
</property>
 {code}
 

With this being default value, any new third-party credential provider can be 
added to this list by users. Does that sound good?

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible

2023-11-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787854#comment-17787854
 ] 

Viraj Jasani commented on HADOOP-18980:
---

Something like this maybe?

 
{code:java}

<property>
  <name>fs.s3a.aws.credentials.provider.mapping</name>
  <value>
    com.amazon.xyz.auth.provider.key1=org.apache.hadoop.fs.s3a.CustomCredsProvider1,
    com.amazon.xyz.auth.provider.key2=org.apache.hadoop.fs.s3a.CustomCredsProvider2,
    com.amazon.xyz.auth.provider.key3=org.apache.hadoop.fs.s3a.CustomCredsProvider3
  </value>
</property>

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
    com.amazon.xyz.auth.provider.key1,
    com.amazon.xyz.auth.provider.key2
  </value>
</property>
 {code}
 

 

> S3A credential provider remapping: make extensible
> --
>
> Key: HADOOP-18980
> URL: https://issues.apache.org/jira/browse/HADOOP-18980
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Priority: Minor
>
> s3afs will now remap the common com.amazonaws credential providers to 
> equivalents in the v2 sdk or in hadoop-aws
> We could do the same for third party credential providers by taking a 
> key=value list in a configuration property and adding to the map. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2023-10-29 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18959:
-

Assignee: Viraj Jasani

> Use builder for prefetch CachingBlockManager
> 
>
> Key: HADOOP-18959
> URL: https://issues.apache.org/jira/browse/HADOOP-18959
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) 
> have added more params for prefetch CachingBlockManager c'tor to process 
> read/write block requests. They have added too many params and more are 
> likely to be introduced later. We should use builder pattern to pass params.
> This would also help consolidating required prefetch params into one single 
> place within S3ACachingInputStream, from scattered locations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18959) Use builder for prefetch CachingBlockManager

2023-10-29 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18959:
-

 Summary: Use builder for prefetch CachingBlockManager
 Key: HADOOP-18959
 URL: https://issues.apache.org/jira/browse/HADOOP-18959
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani


Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829, etc.) have 
added more params to the prefetch CachingBlockManager c'tor to process read/write 
block requests. They have added too many params, and more are likely to be 
introduced later. We should use the builder pattern to pass params.

This would also help consolidate the required prefetch params into one single 
place within S3ACachingInputStream, rather than scattered locations.
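
For illustration, a hedged sketch of the builder shape being proposed (the 
parameter names below are placeholders, not the actual CachingBlockManager 
arguments):
{code:java}
// Hedged sketch: a fluent parameters/builder object replacing a long c'tor list.
public class BlockManagerParameters {
  private int bufferSize;
  private int prefetchBlockCount;
  private String cacheDir;

  public BlockManagerParameters withBufferSize(int bufferSize) {
    this.bufferSize = bufferSize;
    return this;
  }

  public BlockManagerParameters withPrefetchBlockCount(int prefetchBlockCount) {
    this.prefetchBlockCount = prefetchBlockCount;
    return this;
  }

  public BlockManagerParameters withCacheDir(String cacheDir) {
    this.cacheDir = cacheDir;
    return this;
  }

  public int getBufferSize() { return bufferSize; }
  public int getPrefetchBlockCount() { return prefetchBlockCount; }
  public String getCacheDir() { return cacheDir; }
}
{code}
The consumer (here, CachingBlockManager) would then take a single parameters 
object instead of a growing list of positional arguments.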



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18952) FsCommand Stat class set the timeZone"UTC", which is different from the machine's timeZone

2023-10-26 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780014#comment-17780014
 ] 

Viraj Jasani commented on HADOOP-18952:
---

This has been the case since the beginning:

Stat:
{code:java}
protected final SimpleDateFormat timeFmt;
{
  timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
  timeFmt.setTimeZone(TimeZone.getTimeZone("UTC"));
}{code}
Ls:
{code:java}
protected final SimpleDateFormat dateFormat =
  new SimpleDateFormat("-MM-dd HH:mm"); {code}

> FsCommand Stat class set the timeZone"UTC", which is different from the 
> machine's timeZone
> --
>
> Key: HADOOP-18952
> URL: https://issues.apache.org/jira/browse/HADOOP-18952
> Project: Hadoop Common
>  Issue Type: Bug
> Environment: Using Hadoop 3.3.4-release
>Reporter: liang yu
>Priority: Major
> Attachments: image-2023-10-26-10-07-11-637.png
>
>
> Using Hadoop version 3.3.4
>  
> When executing the Ls command and the Stat command on the same Hadoop file, I 
> get two different timestamps.
>  
> {code:java}
> hdfs dfs -stat "modify_time %y, access_time%x" /path/to/file{code}
>  returns:
> modify_time {_}*2023-10-17 01:43:05*{_}, access_time _*2023-10-17 01:41:00*_ 
>  
> {code:java}
> hdfs dfs -ls /path/to/file{code}
>   returns:
> -rw-rw-r--+     3    user_name     user_group     247400339     
> _*2023-10-17 09:43*_     /path/to/file
>  
> These two timestamps differ by 8 hours.
> I am in China, where the timezone is “UTC+8”, so the timestamp from the LS 
> command is correct and the timestamp from the STAT command is wrong.
>  
> !image-2023-10-26-10-07-11-637.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-10-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani resolved HADOOP-18829.
---
Fix Version/s: 3.4.0
   3.3.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

> s3a prefetch LRU cache eviction metric
> --
>
> Key: HADOOP-18829
> URL: https://issues.apache.org/jira/browse/HADOOP-18829
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> Follow-up from HADOOP-18291:
> Add new IO statistics metric to capture s3a prefetch LRU cache eviction.
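
For illustration, a hedged, generic sketch of the event such a metric would 
count (this is not the s3a implementation, just the LRU-eviction concept):
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class LruEvictionCounterExample {
  public static void main(String[] args) {
    final AtomicLong evictions = new AtomicLong();
    final int maxEntries = 2;

    Map<Integer, String> cache = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
        boolean evict = size() > maxEntries;
        if (evict) {
          evictions.incrementAndGet(); // the IO statistics metric would be bumped here
        }
        return evict;
      }
    };

    cache.put(1, "a");
    cache.put(2, "b");
    cache.put(3, "c"); // evicts the least recently used entry
    System.out.println("evictions = " + evictions.get()); // 1
  }
}
{code}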



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776356#comment-17776356
 ] 

Viraj Jasani edited comment on HADOOP-18931 at 10/17/23 7:16 PM:
-

Sounds good, it makes sense to log for all fs invocations while keeping that log 
separate from the heavy service-loading path.


was (Author: vjasani):
sounds good, it makes sense to log for all fs invocation

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776356#comment-17776356
 ] 

Viraj Jasani commented on HADOOP-18931:
---

Sounds good, it makes sense to log for all fs invocations.

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Status: Patch Available  (was: In Progress)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-10-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18850:
--
Status: Patch Available  (was: In Progress)

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html
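
For illustration, a hedged sketch of how a client could request DSSE-KMS via the 
S3A encryption settings once supported (the key ARN below is a placeholder):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DsseKmsConfigExample {
  static Configuration withDsseKms() {
    Configuration conf = new Configuration();
    conf.set("fs.s3a.encryption.algorithm", "DSSE-KMS");
    conf.set("fs.s3a.encryption.key",
        "arn:aws:kms:us-east-1:111122223333:key/example-key-id");
    return conf;
  }
}
{code}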



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from

2023-10-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17775281#comment-17775281
 ] 

Viraj Jasani commented on HADOOP-18931:
---

I thought we were already logging it during the first-time init of the fs for 
the given JVM:
{code:java}
try {
  SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass());
  if (LOGGER.isDebugEnabled()) {
LOGGER.debug("{}:// = {} from {}",
fs.getScheme(), fs.getClass(),
ClassUtil.findContainingJar(fs.getClass()));
  }
} catch (Exception e) {
  LOGGER.warn("Cannot load: {} from {}", fs,
  ClassUtil.findContainingJar(fs.getClass()));
  LOGGER.info("Full exception loading: {}", fs, e);
}
{code}
maybe you are suggesting that we should log it for every call to 
{_}getFileSystemClass(){_}, correct?
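
If so, a hedged sketch of a small helper that could be called on every lookup 
(the class and method names below are illustrative; the logger/ClassUtil usage 
mirrors the snippet above):
{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.util.ClassUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FsClassOriginLogger {
  private static final Logger LOG = LoggerFactory.getLogger(FsClassOriginLogger.class);

  // log, at debug, which jar the resolved FileSystem class came from
  static void logOrigin(String scheme, Class<? extends FileSystem> clazz) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("{}:// = {} from {}", scheme, clazz,
          ClassUtil.findContainingJar(clazz));
    }
  }
}
{code}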

> FileSystem.getFileSystemClass() to log at debug the jar the .class came from
> 
>
> Key: HADOOP-18931
> URL: https://issues.apache.org/jira/browse/HADOOP-18931
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Minor
>
> we want to be able to log the jar the filesystem implementation class, so 
> that we can identify which version of a module the class came from.
> this is to help track down problems where different machines in the cluster 
> or the .tar.gz bundle is out of date. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used

2023-10-09 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Summary: ITestS3GuardTool fails if SSE/DSSE encryption is used  (was: 
ITestS3GuardTool fails if SSE encryption is used)

> ITestS3GuardTool fails if SSE/DSSE encryption is used
> -
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18918:
--
Priority: Minor  (was: Major)

> ITestS3GuardTool fails if SSE encryption is used
> 
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-02 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18918:
-

Assignee: Viraj Jasani

> ITestS3GuardTool fails if SSE encryption is used
> 
>
> Key: HADOOP-18918
> URL: https://issues.apache.org/jira/browse/HADOOP-18918
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.6
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> {code:java}
> [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
> [ERROR] 
> testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
>   Time elapsed: 0.807 s  <<< ERROR!
> 46: Bucket s3a://landsat-pds: required encryption is none but actual 
> encryption is DSSE-KMS
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
>     at 
> org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>     at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>     at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>     at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.lang.Thread.run(Thread.java:750)
>  {code}
> Since landsat requires none encryption, the test should be skipped for any 
> encryption algorithm.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used

2023-10-02 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18918:
-

 Summary: ITestS3GuardTool fails if SSE encryption is used
 Key: HADOOP-18918
 URL: https://issues.apache.org/jira/browse/HADOOP-18918
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3, test
Affects Versions: 3.3.6
Reporter: Viraj Jasani


{code:java}
[ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 25.989 
s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool
[ERROR] 
testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool)
  Time elapsed: 0.807 s  <<< ERROR!
46: Bucket s3a://landsat-pds: required encryption is none but actual encryption 
is DSSE-KMS
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511)
    at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
    at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963)
    at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147)
    at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114)
    at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
    at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:750)
 {code}
Since the landsat bucket requires no encryption, the test should be skipped 
whenever any encryption algorithm is configured.
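
A hedged sketch of how that skip could look in the test (the property name and 
helper shape are assumptions, not the committed fix):
{code:java}
import static org.junit.Assume.assumeTrue;

import org.apache.hadoop.conf.Configuration;

public class SkipWhenEncryptedExample {
  // skip the "require unencrypted" assertion when encryption is configured
  static void skipIfEncrypted(Configuration conf) {
    String algorithm = conf.getTrimmed("fs.s3a.encryption.algorithm", "");
    assumeTrue("Skipping: bucket uses encryption " + algorithm, algorithm.isEmpty());
  }
}
{code}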



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-09-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18850:
-

Assignee: Viraj Jasani

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Assignee: Viraj Jasani
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18915) HTTP timeouts are not set correctly

2023-09-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770745#comment-17770745
 ] 

Viraj Jasani commented on HADOOP-18915:
---

Nice find!

> HTTP timeouts are not set correctly
> ---
>
> Key: HADOOP-18915
> URL: https://issues.apache.org/jira/browse/HADOOP-18915
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Priority: Major
>
> In the client config builders, when [setting 
> timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120],
>  it uses Duration.ofSeconds(), configs all use milliseconds so this needs to 
> be updated to Duration.ofMillis().
>  
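To illustrate the unit mix-up described above, a small self-contained sketch (not the
actual AWSClientConfig code; the variable and value are made up):
{code:java}
import java.time.Duration;

public class TimeoutUnitSketch {
  public static void main(String[] args) {
    long connectionTimeoutMillis = 5_000; // config values are in milliseconds

    // Buggy pattern: a millisecond value treated as seconds (5000 s ~ 83 minutes).
    Duration wrong = Duration.ofSeconds(connectionTimeoutMillis);

    // Fixed pattern: interpret the value in the unit the config actually uses.
    Duration right = Duration.ofMillis(connectionTimeoutMillis);

    System.out.println("wrong = " + wrong + ", right = " + right);
  }
}
{code}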



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18208) Remove all the log4j reference in modules other than hadoop-logging

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18208:
-

Assignee: (was: Viraj Jasani)

> Remove all the log4j reference in modules other than hadoop-logging
> ---
>
> Key: HADOOP-18208
> URL: https://issues.apache.org/jira/browse/HADOOP-18208
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-16206:
-

Assignee: (was: Viraj Jasani)

> Migrate from Log4j1 to Log4j2
> -
>
> Key: HADOOP-16206
> URL: https://issues.apache.org/jira/browse/HADOOP-16206
> Project: Hadoop Common
>  Issue Type: Task
>Affects Versions: 3.3.0
>Reporter: Akira Ajisaka
>Priority: Major
> Attachments: HADOOP-16206-wip.001.patch
>
>
> This sub-task is to remove log4j1 dependency and add log4j2 dependency.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18207) Introduce hadoop-logging module

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18207:
-

Assignee: (was: Viraj Jasani)

> Introduce hadoop-logging module
> ---
>
> Key: HADOOP-18207
> URL: https://issues.apache.org/jira/browse/HADOOP-18207
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There are several goals here:
>  # Provide the ability to change log level, get log level, etc.
>  # Place all the appender implementation(?)
>  # Hide the real logging implementation.
>  # Later we could remove all the log4j references in other hadoop module.
>  # Move as much log4j usage to the module as possible.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-15984) Update jersey from 1.19 to 2.x

2023-09-19 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-15984:
-

Assignee: (was: Viraj Jasani)

> Update jersey from 1.19 to 2.x
> --
>
> Key: HADOOP-15984
> URL: https://issues.apache.org/jira/browse/HADOOP-15984
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Akira Ajisaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755406#comment-17755406
 ] 

Viraj Jasani commented on HADOOP-18850:
---

[~ste...@apache.org] are you in favor of this before v2 sdk upgrade?

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755394#comment-17755394
 ] 

Viraj Jasani commented on HADOOP-18850:
---

only recently HADOOP-18832 bumped the sdk bundle to 1.12.499, so it looks like we 
can support this

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755392#comment-17755392
 ] 

Viraj Jasani edited comment on HADOOP-18850 at 8/17/23 7:13 AM:


it seems SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}


was (Author: vjasani):
SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755392#comment-17755392
 ] 

Viraj Jasani commented on HADOOP-18850:
---

SSEAlgorithm added DSSE as part of 1.12.488 release: 
[https://github.com/aws/aws-sdk-java/releases/tag/1.12.488]
{code:java}
public enum SSEAlgorithm {
AES256("AES256"),
KMS("aws:kms"),
DSSE("aws:kms:dsse"),
;{code}
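For reference, a hedged sketch of how a client might eventually request DSSE-KMS via
configuration, assuming the wiring follows the existing fs.s3a.encryption.* options;
the value "DSSE-KMS" and the key ARN are illustrative assumptions, only the SDK enum
value aws:kms:dsse above is confirmed:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DsseKmsConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed wiring: algorithm and key follow the existing SSE-KMS pattern.
    conf.set("fs.s3a.encryption.algorithm", "DSSE-KMS");
    conf.set("fs.s3a.encryption.key",
        "arn:aws:kms:us-west-2:123456789012:key/EXAMPLE-KEY-ID");
    System.out.println(conf.get("fs.s3a.encryption.algorithm"));
  }
}
{code}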

> Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
> -
>
> Key: HADOOP-18850
> URL: https://issues.apache.org/jira/browse/HADOOP-18850
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, security
>Reporter: Akira Ajisaka
>Priority: Major
>
> Add support for DSSE-KMS
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755385#comment-17755385
 ] 

Viraj Jasani commented on HADOOP-18852:
---

{quote}for other reads, we may want a bigger prefech count than 1, depending 
on: split start/end, file read policy (random, sequential, whole-file)
{quote}
this means we first need the prefetch read policy option (HADOOP-18791), correct?

> S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look 
> like random IO
> --
>
> Key: HADOOP-18852
> URL: https://issues.apache.org/jira/browse/HADOOP-18852
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with 
> separately.
> # all seeks are lazy; no fetching is kicked off after an open
> # the first read is treated as an out of order read, so cancels any active 
> reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested
> * for other reads, we may want a bigger prefech count than 1, depending on: 
> split start/end, file read policy (random, sequential, whole-file)
> * also, if a read is in a block other than the current one, but which is 
> already being fetched or cached, is this really an OOO read to the extent 
> that outstanding fetches should be cancelled?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO

2023-08-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755384#comment-17755384
 ] 

Viraj Jasani commented on HADOOP-18852:
---

{quote}also, if a read is in a block other than the current one, but which is 
already being fetched or cached, is this really an OOO read to the extent that 
outstanding fetches should be cancelled?
{quote}
+1 to this. Now that i checked some logs, i can see a lazy-seek for every first 
seek + read on the given block:
{code:java}
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(0:0)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(4:40960)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(3:30720)
DEBUG prefetch.S3ACachingInputStream 
(S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(2:20480){code}
but it's also a valid point: if the block was already being fetched or cached, why 
cancel the outstanding fetches?
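A rough, purely illustrative sketch of the guard being discussed; BlockManagerView and
its methods are hypothetical names, not the real prefetching classes:
{code:java}
/**
 * Illustrative only: treat a seek as out-of-order (and cancel prefetches)
 * only when the target block is neither cached nor already being fetched.
 */
public class OutOfOrderReadSketch {

  interface BlockManagerView {
    boolean isCached(int blockNumber);
    boolean isPrefetching(int blockNumber);
    void cancelPrefetches();
  }

  static int planPrefetchCount(BlockManagerView blocks, int targetBlock,
      int defaultPrefetchCount) {
    boolean alreadyAvailable =
        blocks.isCached(targetBlock) || blocks.isPrefetching(targetBlock);
    if (!alreadyAvailable) {
      // Genuine out-of-order read: drop outstanding work, fetch one block.
      blocks.cancelPrefetches();
      return 1;
    }
    // The block is already on its way; keep outstanding prefetches alive.
    return defaultPrefetchCount;
  }
}
{code}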

> S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look 
> like random IO
> --
>
> Key: HADOOP-18852
> URL: https://issues.apache.org/jira/browse/HADOOP-18852
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with 
> separately.
> # all seeks are lazy; no fetching is kicked off after an open
> # the first read is treated as an out of order read, so cancels any active 
> reads (don't think there are any) and then only asks for 1 block
> {code}
> if (outOfOrderRead) {
>   LOG.debug("lazy-seek({})", getOffsetStr(readPos));
>   blockManager.cancelPrefetches();
>   // We prefetch only 1 block immediately after a seek operation.
>   prefetchCount = 1;
> }
> {code}
> * for any read fully we should prefetch all blocks in the range requested
> * for other reads, we may want a bigger prefech count than 1, depending on: 
> split start/end, file read policy (random, sequential, whole-file)
> * also, if a read is in a block other than the current one, but which is 
> already being fetched or cached, is this really an OOO read to the extent 
> that outstanding fetches should be cancelled?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-08-01 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750035#comment-17750035
 ] 

Viraj Jasani commented on HADOOP-18829:
---

sure thing, i think this can wait. thanks

> s3a prefetch LRU cache eviction metric
> --
>
> Key: HADOOP-18829
> URL: https://issues.apache.org/jira/browse/HADOOP-18829
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Follow-up from HADOOP-18291:
> Add new IO statistics metric to capture s3a prefetch LRU cache eviction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748981#comment-17748981
 ] 

Viraj Jasani commented on HADOOP-18832:
---

ITestS3AFileContextStatistics#testStatistics is flaky:
{code:java}
[ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.983 s 
<<< FAILURE! - in 
org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics
[ERROR] 
testStatistics(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics)
  Time elapsed: 1.776 s  <<< FAILURE!
java.lang.AssertionError: expected:<512> but was:<448>
    at org.junit.Assert.fail(Assert.java:89)
    at org.junit.Assert.failNotEquals(Assert.java:835)
    at org.junit.Assert.assertEquals(Assert.java:647)
    at org.junit.Assert.assertEquals(Assert.java:633)
    at 
org.apache.hadoop.fs.FCStatisticsBaseTest.testStatistics(FCStatisticsBaseTest.java:108)
 {code}
This only happened once; I am now unable to reproduce it locally.

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748980#comment-17748980
 ] 

Viraj Jasani commented on HADOOP-18832:
---

Testing in progress: Test results look good with -scale and -prefetch so far.

Now running some encryption tests (bucket with algo: SSE-KMS).

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18832:
--
Description: aws sdk versions < 1.12.499 uses a vulnerable version of netty 
and hence showing up in security CVE scans (CVE-2023-34462). The safe version 
for netty is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+  (was: aws 
sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up 
in security CVE scans (CVE-2023-34462). The safe version for netty is 
4.1.94.Final and this is used by aws-java-adk:1.12.499+)

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18832:
-

Assignee: Viraj Jasani

> Upgrade aws-java-sdk to 1.12.499+
> -
>
> Key: HADOOP-18832
> URL: https://issues.apache.org/jira/browse/HADOOP-18832
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
> showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
> is 4.1.94.Final and this is used by aws-java-adk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+

2023-07-30 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18832:
-

 Summary: Upgrade aws-java-sdk to 1.12.499+
 Key: HADOOP-18832
 URL: https://issues.apache.org/jira/browse/HADOOP-18832
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Viraj Jasani


aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence 
showing up in security CVE scans (CVE-2023-34462). The safe version for netty 
is 4.1.94.Final and this is used by aws-java-adk:1.12.499+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18829) s3a prefetch LRU cache eviction metric

2023-07-26 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18829:
-

 Summary: s3a prefetch LRU cache eviction metric
 Key: HADOOP-18829
 URL: https://issues.apache.org/jira/browse/HADOOP-18829
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


Follow-up from HADOOP-18291:

Add new IO statistics metric to capture s3a prefetch LRU cache eviction.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-18809) s3a prefetch read/write file operations should guard channel close

2023-07-17 Thread Viraj Jasani (Jira)
Viraj Jasani created HADOOP-18809:
-

 Summary: s3a prefetch read/write file operations should guard 
channel close
 Key: HADOOP-18809
 URL: https://issues.apache.org/jira/browse/HADOOP-18809
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Viraj Jasani
Assignee: Viraj Jasani


As per Steve's suggestion from s3a prefetch LRU cache,

s3a prefetch disk based cache file read and write operations should guard 
against close of FileChannel and WritableByteChannel, close them even if 
read/write operations throw IOException.
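A minimal sketch of the guarded-close pattern described above, using try-with-resources
so the channel is closed even when the write throws; the file path and contents are
illustrative:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class GuardedChannelCloseSketch {
  public static void main(String[] args) throws IOException {
    Path blockFile = Paths.get("/tmp/example-block.bin"); // illustrative path
    ByteBuffer buffer =
        ByteBuffer.wrap("block data".getBytes(StandardCharsets.UTF_8));

    // try-with-resources: the channel is closed even if write() throws.
    try (FileChannel channel = FileChannel.open(blockFile,
        StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
      while (buffer.hasRemaining()) {
        channel.write(buffer);
      }
    }
  }
}
{code}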



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344
 ] 

Viraj Jasani edited comment on HADOOP-18805 at 7/17/23 8:15 PM:


sorry Steve, i was not aware you already created this Jira, i created PR for 
letting LRU tests use small files rather than landsat: 
[https://github.com/apache/hadoop/pull/5851]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
its own class and have the contents of the linked list data tested in a UT 
(discussed with Mehakmeet in the earlier part of the review). i can take this 
up as a new sub-task, and for the current Jira we can focus on tests using small 
files for a better break-down?

 

PR review discussion: 
[https://github.com/apache/hadoop/pull/5754#discussion_r1247476231]


was (Author: vjasani):
sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with?
 * the above addendum PR for using small file in the test (so that we don't 
need to put the test under -scale)
 * this Jira to refactor Entry and allowing a UT to test the contents of the 
linked list

 

if you think above PR is not good for an addendum and should rather be linked 
to this Jira, i can change PR title to reflect this Jira number and i can 
create another sub-task to write simple UT that can test contents of the linked 
list from head to tail.

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>  Labels: pull-request-available
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents to the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.
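A hedged sketch of the small-block-size idea from the list above; the option names are
the S3A prefetch keys as understood here, and the values are illustrative:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class SmallBlockPrefetchTestSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Shrink the prefetch block size so a small test file spans many blocks,
    // forcing paged reads without needing a multi-GB object.
    conf.setBoolean("fs.s3a.prefetch.enabled", true);
    conf.setInt("fs.s3a.prefetch.block.size", 1024); // 1 KB blocks
    // A 16 KB file of known contents then covers 16 blocks, so sequential,
    // random and readFully paths can all validate the bytes they return.
    System.out.println(conf.get("fs.s3a.prefetch.block.size"));
  }
}
{code}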



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-15 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344
 ] 

Viraj Jasani edited comment on HADOOP-18805 at 7/15/23 6:48 AM:


sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
its own class and have the contents of the linked list data tested in a UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with?
 * the above addendum PR for using small file in the test (so that we don't 
need to put the test under -scale)
 * this Jira to refactor Entry and allowing a UT to test the contents of the 
linked list

 

if you think above PR is not good for an addendum and should rather be linked 
to this Jira, i can change PR title to reflect this Jira number and i can 
create another sub-task to write simple UT that can test contents of the linked 
list from head to tail.


was (Author: vjasani):
sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
it's own class and have the contents of the linked list data tested in UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with the above addendum PR taking care of using small file in the 
test (so that we don't need to put the test under -scale) and this Jira being 
used for refactoring Entry and allowing a UT to test the contents of the linked 
list?

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents to the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-14 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344
 ] 

Viraj Jasani commented on HADOOP-18805:
---

sorry Steve, i was not aware you already created this Jira, i created addendum 
for letting LRU test depend on small file rather than large one: 
[https://github.com/apache/hadoop/pull/5843]
{quote}also, and this is very, very important, they can't validate the data
{quote}
i was about to create a sub-task for this as i am planning to refactor Entry to 
its own class and have the contents of the linked list data tested in a UT 
(discussed with Mehakmeet in the earlier part of the review). maybe i can do 
the work as part of this Jira.

 

are you fine with the above addendum PR taking care of using small file in the 
test (so that we don't need to put the test under -scale) and this Jira being 
used for refactoring Entry and allowing a UT to test the contents of the linked 
list?

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents to the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-14 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18805:
-

Assignee: (was: Viraj Jasani)

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Priority: Major
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents to the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data

2023-07-14 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18805:
-

Assignee: Viraj Jasani

> s3a large file prefetch tests are too slow, don't validate data
> ---
>
> Key: HADOOP-18805
> URL: https://issues.apache.org/jira/browse/HADOOP-18805
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>
> the large file prefetch tests (including LRU cache eviction) are really slow.
> moving under -scale may hide the problem for most runs, but they are still 
> too slow, can time out, etc etc.
> also, and this is very, very important, they can't validate the data.
> Better: 
> * test on smaller files by setting a very small block size (1k bytes or less) 
> just to force paged reads of a small 16k file.
> * with known contents to the values of all forms of read can be validated
> * maybe the LRU tests can work with a fake remote object which can then be 
> used in a unit test
> * extend one of the huge file tests to read from there -including s3-CSE 
> encryption coverage.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies

2023-07-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742667#comment-17742667
 ] 

Viraj Jasani commented on HADOOP-18791:
---

sounds good, i just realized unbuffer support is already in progress: 
[https://github.com/apache/hadoop/pull/5832]

> S3A prefetching: switch to prefetching for chosen read policies
> ---
>
> Key: HADOOP-18791
> URL: https://issues.apache.org/jira/browse/HADOOP-18791
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> before switching to prefetching input stream everywhere, add an option to 
> list which of the fs.option.openfile.read.policy policies to switch too, e.g
>  
> fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive
> this would leave random and vectored on s3a input stream.
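To illustrate how such an option might be consumed, a sketch of parsing the proposed
policy list; the option name is taken from the issue text and the helper itself is
hypothetical:
{code:java}
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

import org.apache.hadoop.conf.Configuration;

public class PrefetchPolicySelectionSketch {

  /** Returns true if the requested read policy should use the prefetching stream. */
  static boolean usePrefetching(Configuration conf, String readPolicy) {
    String value = conf.get("fs.s3a.inputstream.prefetch.policies",
        "whole-file,sequential,adaptive");
    Set<String> policies = Arrays.stream(value.split(","))
        .map(s -> s.trim().toLowerCase(Locale.ROOT))
        .collect(Collectors.toSet());
    return policies.contains(readPolicy.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    System.out.println(usePrefetching(conf, "random"));     // false
    System.out.println(usePrefetching(conf, "sequential")); // true
  }
}
{code}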



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies

2023-07-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18791:
-

Assignee: (was: Viraj Jasani)

> S3A prefetching: switch to prefetching for chosen read policies
> ---
>
> Key: HADOOP-18791
> URL: https://issues.apache.org/jira/browse/HADOOP-18791
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Priority: Major
>
> before switching to prefetching input stream everywhere, add an option to 
> list which of the fs.option.openfile.read.policy policies to switch too, e.g
>  
> fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive
> this would leave random and vectored on s3a input stream.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies

2023-07-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18791:
-

Assignee: Viraj Jasani

> S3A prefetching: switch to prefetching for chosen read policies
> ---
>
> Key: HADOOP-18791
> URL: https://issues.apache.org/jira/browse/HADOOP-18791
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.6
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>
> before switching to prefetching input stream everywhere, add an option to 
> list which of the fs.option.openfile.read.policy policies to switch too, e.g
>  
> fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive
> this would leave random and vectored on s3a input stream.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache

2023-06-29 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738379#comment-17738379
 ] 

Viraj Jasani commented on HADOOP-18291:
---

[~ste...@apache.org] if you have bandwidth to review: 
[https://github.com/apache/hadoop/pull/5754]

Thank you!

> S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
> --
>
> Key: HADOOP-18291
> URL: https://issues.apache.org/jira/browse/HADOOP-18291
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no limit on the size of disk cache. This means we could 
> have a large number of files on files, especially for access patterns that 
> are very random and do not always read the block fully. 
>  
> eg:
> in.seek(5);
> in.read(); 
> in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read
> in.read();
> in.seek(2 * blockSize + 10) // block 1 gets saved to disk
> .. and so on
>  
> The in memory cache is bounded, and by default has a limit of 72MB (9 
> blocks). When a block is fully read, and a seek is issued it's released 
> [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109].
>  We can also delete the on disk file for the block here if it exists. 
>  
> Also maybe add an upper limit on disk space, and delete the file which stores 
> data of the block furthest from the current block (similar to the in memory 
> cache) when this limit is reached. 
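To make the LRU idea concrete, a tiny self-contained sketch of an access-ordered index
that evicts the least recently used block once a bound is exceeded; the names are
illustrative and this is not the SingleFilePerBlockCache API:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Illustrative only: an access-ordered map that drops the least recently
 * used block entry once the cache holds more than maxBlocks entries.
 */
public class LruBlockIndexSketch<V> extends LinkedHashMap<Integer, V> {

  private final int maxBlocks;

  public LruBlockIndexSketch(int maxBlocks) {
    super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
    this.maxBlocks = maxBlocks;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
    // In a real disk cache this is where the eldest block's file would be
    // deleted before the entry is dropped from the index.
    return size() > maxBlocks;
  }
}
{code}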



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache

2023-06-27 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18291:
--
Status: Patch Available  (was: In Progress)

> S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
> --
>
> Key: HADOOP-18291
> URL: https://issues.apache.org/jira/browse/HADOOP-18291
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> Currently there is no limit on the size of disk cache. This means we could 
> have a large number of files on files, especially for access patterns that 
> are very random and do not always read the block fully. 
>  
> eg:
> in.seek(5);
> in.read(); 
> in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read
> in.read();
> in.seek(2 * blockSize + 10) // block 1 gets saved to disk
> .. and so on
>  
> The in memory cache is bounded, and by default has a limit of 72MB (9 
> blocks). When a block is fully read, and a seek is issued it's released 
> [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109].
>  We can also delete the on disk file for the block here if it exists. 
>  
> Also maybe add an upper limit on disk space, and delete the file which stores 
> data of the block furthest from the current block (similar to the in memory 
> cache) when this limit is reached. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18756) CachingBlockManager to use AtomicBoolean for closed flag

2023-06-26 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737387#comment-17737387
 ] 

Viraj Jasani commented on HADOOP-18756:
---

Steve, could you please help close this Jira? Am I allowed to do it?

> CachingBlockManager to use AtomicBoolean for closed flag
> 
>
> Key: HADOOP-18756
> URL: https://issues.apache.org/jira/browse/HADOOP-18756
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> the {{CachingBlockManager}} uses the boolean field {{closed}} in various 
> operations, including a do/while loop. to ensure the flag is correctly 
> updated across threads, it needs to move to an atomic boolean.
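A minimal sketch of the change described, assuming nothing about the real class beyond
the closed flag itself:
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

/**
 * Illustrative only: a closed flag that is safely visible across threads,
 * unlike a plain non-volatile boolean field.
 */
public class ClosableWorkerSketch {

  private final AtomicBoolean closed = new AtomicBoolean(false);

  /** Idempotent close: only the first caller runs the shutdown work. */
  public void close() {
    if (closed.compareAndSet(false, true)) {
      // release resources here
    }
  }

  /** Work loops check this flag and observe close() promptly from any thread. */
  public boolean isClosed() {
    return closed.get();
  }
}
{code}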



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18777) Update jackson2 version from 2.12.7.1 to 2.15.0

2023-06-20 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735343#comment-17735343
 ] 

Viraj Jasani commented on HADOOP-18777:
---

please take a look at the discussion on HADOOP-18033

> Update jackson2 version from 2.12.7.1 to 2.15.0
> ---
>
> Key: HADOOP-18777
> URL: https://issues.apache.org/jira/browse/HADOOP-18777
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: ronan doolan
>Priority: Major
>
> can the jackson2 version in hadoop-project be updated from 2.12.7.1 to 2.15.*
> This is to rectify the following vulnerability
> [https://github.com/FasterXML/jackson-core/pull/827]
> https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache

2023-06-17 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721569#comment-17721569
 ] 

Viraj Jasani edited comment on HADOOP-18291 at 6/17/23 6:48 AM:


{quote}you'd maybe want a block cache - readers would lock their block before a 
read; unlock after. Use an LRU policy for recycling blocks, with unbuffer/close 
releasing all blocks of a caller.
{quote}
-if jobs using s3a prefetching get aborted without calling s3afs#close, and 
prefetched block files are kept on EBS volumes that could be accessed again by 
new vm instance or container that resume the jobs, we might also want to 
consider deleting all old local block files as part of s3afs#initialize-


was (Author: vjasani):
{quote}you'd maybe want a block cache - readers would lock their block before a 
read; unlock after. Use an LRU policy for recycling blocks, with unbuffer/close 
releasing all blocks of a caller.
{quote}
if jobs using s3a prefetching get aborted without calling s3afs#close, and 
prefetched block files are kept on EBS volumes that could be accessed again by 
new vm instance or container that resume the jobs, we might also want to 
consider deleting all old local block files as part of s3afs#initialize

> S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
> --
>
> Key: HADOOP-18291
> URL: https://issues.apache.org/jira/browse/HADOOP-18291
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>
> Currently there is no limit on the size of disk cache. This means we could 
> have a large number of files on files, especially for access patterns that 
> are very random and do not always read the block fully. 
>  
> eg:
> in.seek(5);
> in.read(); 
> in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read
> in.read();
> in.seek(2 * blockSize + 10) // block 1 gets saved to disk
> .. and so on
>  
> The in memory cache is bounded, and by default has a limit of 72MB (9 
> blocks). When a block is fully read, and a seek is issued it's released 
> [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109].
>  We can also delete the on disk file for the block here if it exists. 
>  
> Also maybe add an upper limit on disk space, and delete the file which stores 
> data of the block furthest from the current block (similar to the in memory 
> cache) when this limit is reached. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache

2023-06-15 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18291:
--
Summary: S3A prefetch - Implement LRU cache for SingleFilePerBlockCache  
(was: SingleFilePerBlockCache does not have a limit)

> S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
> --
>
> Key: HADOOP-18291
> URL: https://issues.apache.org/jira/browse/HADOOP-18291
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>
> Currently there is no limit on the size of disk cache. This means we could 
> have a large number of files on files, especially for access patterns that 
> are very random and do not always read the block fully. 
>  
> eg:
> in.seek(5);
> in.read(); 
> in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read
> in.read();
> in.seek(2 * blockSize + 10) // block 1 gets saved to disk
> .. and so on
>  
> The in memory cache is bounded, and by default has a limit of 72MB (9 
> blocks). When a block is fully read, and a seek is issued it's released 
> [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109].
>  We can also delete the on disk file for the block here if it exists. 
>  
> Also maybe add an upper limit on disk space, and delete the file which stores 
> data of the block furthest from the current block (similar to the in memory 
> cache) when this limit is reached. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731791#comment-17731791
 ] 

Viraj Jasani commented on HADOOP-18763:
---

this time, without vpn, all tests passed for the prefetch profile as well (the 
previous failures, testParallelRename and testThreadPoolCoolDown, no longer show 
up with a full test run)

 
{code:java}
mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch {code}
 

 

> Upgrade aws-java-sdk to 1.12.367+
> -
>
> Key: HADOOP-18763
> URL: https://issues.apache.org/jira/browse/HADOOP-18763
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>
> aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is 
> pulling in high severity CVE and creating unhappiness in security scans, even 
> if s3a doesn't use that lib. 
> The safe version for netty is netty:4.1.86.Final and this is used by 
> aws-java-adk:1.12.367+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731782#comment-17731782
 ] 

Viraj Jasani commented on HADOOP-18763:
---

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale

 

results are quite good: no test failures (except for the known failure of 
testRecursiveRootListing, which passes when run individually)

> Upgrade aws-java-sdk to 1.12.367+
> -
>
> Key: HADOOP-18763
> URL: https://issues.apache.org/jira/browse/HADOOP-18763
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>
> aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is 
> pulling in high severity CVE and creating unhappiness in security scans, even 
> if s3a doesn't use that lib. 
> The safe version for netty is netty:4.1.86.Final and this is used by 
> aws-java-adk:1.12.367+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731766#comment-17731766
 ] 

Viraj Jasani commented on HADOOP-18763:
---

us-west-2:

 

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

 

errors so far:
{code:java}
[ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 
1,920.089 s <<< FAILURE! - in 
org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps
[ERROR] 
testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  Time 
elapsed: 960.003 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 960000 
milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:537)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:88)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.putObject(S3ABlockOutputStream.java:628)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:428)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.parallelRenames(ITestS3AConcurrentOps.java:112)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testParallelRename(ITestS3AConcurrentOps.java:177)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)


[ERROR] 
testThreadPoolCoolDown(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps)  
Time elapsed: 960.005 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 960000 
milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:537)
at 
org.apache.hadoop.thirdparty.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:88)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.putObject(S3ABlockOutputStream.java:628)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:428)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77)
at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.parallelRenames(ITestS3AConcurrentOps.java:112)
at 
org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testThreadPoolCoolDown(ITestS3AConcurrentOps.java:189)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)

[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731672#comment-17731672
 ] 

Viraj Jasani commented on HADOOP-18763:
---

Sure thing, let me test version 1.12.367 today. I can perform some manual testing 
and then do a full test run with a combination of the scale and prefetch profiles.

First, I can run it against trunk and, once the results are good, repeat the same 
tests for 3.3.

> Upgrade aws-java-sdk to 1.12.367+
> -
>
> Key: HADOOP-18763
> URL: https://issues.apache.org/jira/browse/HADOOP-18763
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>
> aws sdk bundle < 1.12.367 uses a vulnerable version of netty which is 
> pulling in a high severity CVE and creating unhappiness in security scans, even 
> if s3a doesn't use that lib. 
> The safe version for netty is netty:4.1.86.Final and this is used by 
> aws-java-sdk:1.12.367+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-12 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731662#comment-17731662
 ] 

Viraj Jasani commented on HADOOP-18763:
---

[~weichiu] I can help run the full test suite with various options if you would 
like; I run the tests on a regular basis anyway.

> Upgrade aws-java-sdk to 1.12.367+
> -
>
> Key: HADOOP-18763
> URL: https://issues.apache.org/jira/browse/HADOOP-18763
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>
> aws sdk bundle < 1.12.367 uses a vulnerable version of netty which is 
> pulling in a high severity CVE and creating unhappiness in security scans, even 
> if s3a doesn't use that lib. 
> The safe version for netty is netty:4.1.86.Final and this is used by 
> aws-java-sdk:1.12.367+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+

2023-06-09 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731132#comment-17731132
 ] 

Viraj Jasani commented on HADOOP-18763:
---

We were excluding netty from aws-sdk?
{code:xml}
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-bundle</artifactId>
  <version>${aws-java-sdk.version}</version>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}
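
If it helps to confirm what actually lands on the classpath at runtime, a quick check along these lines can be run from any test or main method. This is only an illustrative snippet (it assumes netty-common is on the classpath so that {{io.netty.util.Version}} resolves), not part of the Hadoop build:
{code:java}
import io.netty.util.Version;
import java.util.Map;

public class NettyVersionCheck {
  public static void main(String[] args) {
    // Version.identify() scans the classpath for netty artifacts and reports
    // the version recorded in each jar's embedded version properties.
    Map<String, Version> versions = Version.identify();
    if (versions.isEmpty()) {
      System.out.println("No netty artifacts found on the classpath");
    } else {
      versions.forEach((artifact, version) ->
          System.out.println(artifact + " -> " + version.artifactVersion()));
    }
  }
}
{code}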

> Upgrade aws-java-sdk to 1.12.367+
> -
>
> Key: HADOOP-18763
> URL: https://issues.apache.org/jira/browse/HADOOP-18763
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.5
>Reporter: Steve Loughran
>Priority: Major
>
> aws sdk bundle < 1.12.367 uses a vulnerable version of netty which is 
> pulling in a high severity CVE and creating unhappiness in security scans, even 
> if s3a doesn't use that lib. 
> The safe version for netty is netty:4.1.86.Final and this is used by 
> aws-java-sdk:1.12.367+



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module

2023-06-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729168#comment-17729168
 ] 

Viraj Jasani commented on HADOOP-18207:
---

The PR also has a higher than usual chance of getting merge conflicts due to the 
nature of the change. Hence, the longer it stays open, the more merge conflict 
resolutions are required; that's what happened with the previous open PR (5503) over 
the past 2+ months. Just stating this as a fact (JFYI), it doesn't mean I hate 
resolving conflicts :)

> Introduce hadoop-logging module
> ---
>
> Key: HADOOP-18207
> URL: https://issues.apache.org/jira/browse/HADOOP-18207
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There are several goals here:
>  # Provide the ability to change log level, get log level, etc.
>  # Place all the appender implementation(?)
>  # Hide the real logging implementation.
>  # Later we could remove all the log4j references in other hadoop module.
>  # Move as much log4j usage to the module as possible.
>  
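
As a rough illustration of goal 1 (change/get the log level without callers touching log4j directly), a facade in the new module could look something like the sketch below. The class and method names here are hypothetical stand-ins, not the actual hadoop-logging API:
{code:java}
package org.apache.hadoop.logging.example;

import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

/**
 * Hypothetical facade: callers in other modules would use this instead of
 * referencing log4j classes directly, so the backing implementation can be
 * swapped later without touching them.
 */
public final class LogLevelFacade {

  private LogLevelFacade() {
  }

  /** Set the level of the named logger, e.g. "org.apache.hadoop.hdfs". */
  public static void setLogLevel(String loggerName, String level) {
    LogManager.getLogger(loggerName).setLevel(Level.toLevel(level));
  }

  /** Return the effective level of the named logger as a string. */
  public static String getEffectiveLevel(String loggerName) {
    return LogManager.getLogger(loggerName).getEffectiveLevel().toString();
  }
}
{code}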



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module

2023-06-04 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729167#comment-17729167
 ] 

Viraj Jasani commented on HADOOP-18207:
---

Created a PR with the addendum included: [https://github.com/apache/hadoop/pull/5717]

Ayush sir, sorry you had to do this, but you could have given it a little more time 
considering it is still the weekend in Wei-Chiu's timezone (ever since the PR was merged)? :)

Anyway, no worries; now that we have a new PR, we should get Jenkins results within 24 
hours (as the changes are across almost all modules).

 
{quote}Side Note: Good to check the Jenkins results usually before merging 
despite any external/trust factors
{quote}
This was an oversight on my side, not from any reviewers. When we have full QA 
results, we usually see _*3 mapreduce test failures and 1 hdfs 
(TestDirectoryScanner) failure*_, hence 4 failing test classes would usually be 
present; but sometimes TestDirectoryScanner is not among them, so we might see 
only 3 test classes in the failures.

When the last QA results came in (before the PR merge, with the latest merge conflict 
resolution), I saw 3 mapreduce failures and 1 hdfs failure but didn't realize that 
the recent hdfs failure was not TestDirectoryScanner but rather an rbf test failure. 
The next QA results were posted on the PR after it was merged, and that is when I 
immediately realized that we have a new failing test class which is not 
TestDirectoryScanner, and hence I created the addendum PR. I hope this 
explanation helps.

> Introduce hadoop-logging module
> ---
>
> Key: HADOOP-18207
> URL: https://issues.apache.org/jira/browse/HADOOP-18207
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There are several goals here:
>  # Provide the ability to change log level, get log level, etc.
>  # Place all the appender implementation(?)
>  # Hide the real logging implementation.
>  # Later we could remove all the log4j references in other hadoop module.
>  # Move as much log4j usage to the module as possible.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module

2023-06-02 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728919#comment-17728919
 ] 

Viraj Jasani commented on HADOOP-18207:
---

PR to fix the broken test: [https://github.com/apache/hadoop/pull/5713]

Commented on the original PR as well to link the issue and fix 
[https://github.com/apache/hadoop/pull/5503#issuecomment-1574535578]

Thanks

> Introduce hadoop-logging module
> ---
>
> Key: HADOOP-18207
> URL: https://issues.apache.org/jira/browse/HADOOP-18207
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>
> There are several goals here:
>  # Provide the ability to change log level, get log level, etc.
>  # Place all the appender implementation(?)
>  # Hide the real logging implementation.
>  # Later we could remove all the log4j references in other hadoop module.
>  # Move as much log4j usage to the module as possible.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18756) CachingBlockManager to use AtomicBoolean for closed flag

2023-05-31 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18756:
-

Assignee: Viraj Jasani

> CachingBlockManager to use AtomicBoolean for closed flag
> 
>
> Key: HADOOP-18756
> URL: https://issues.apache.org/jira/browse/HADOOP-18756
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.9
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Major
>
> the {{CachingBlockManager}} uses the boolean field {{closed}} in various 
> operations, including a do/while loop. To ensure the flag is correctly 
> updated across threads, it needs to move to an atomic boolean.
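
A minimal sketch of the intended change, using a simplified stand-in rather than the real {{CachingBlockManager}} layout (names and methods here are illustrative):
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

/** Simplified stand-in showing the closed flag as an AtomicBoolean. */
class CachingBlockManagerSketch {
  // Visible and atomic across the prefetch/caching threads,
  // unlike a plain (non-volatile) boolean field.
  private final AtomicBoolean closed = new AtomicBoolean(false);

  void close() {
    // Only the first caller performs the actual cleanup.
    if (closed.compareAndSet(false, true)) {
      // release buffers, cancel pending prefetches, etc.
    }
  }

  void cachePut(int blockNumber) {
    do {
      if (closed.get()) {
        return;               // stop work as soon as the manager is closed
      }
      // ... attempt to cache the block ...
    } while (shouldRetry());
  }

  private boolean shouldRetry() {
    return false;             // placeholder for the real retry condition
  }
}
{code}
With compareAndSet, only the first close() performs cleanup, and the do/while loop observes the flag consistently across threads.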



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-18740) s3a prefetch cache blocks should be accessed by RW locks

2023-05-22 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HADOOP-18740:
--
Status: Patch Available  (was: In Progress)

> s3a prefetch cache blocks should be accessed by RW locks
> 
>
> Key: HADOOP-18740
> URL: https://issues.apache.org/jira/browse/HADOOP-18740
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> In order to implement LRU or LFU based cache removal policies for s3a 
> prefetched cache blocks, it is important for all cache reader threads to 
> acquire the read lock and, similarly, for the cache file removal mechanism (fs 
> close or cache eviction) to acquire the write lock before accessing the files.
> As we maintain the block entries in an in-memory map, we should be able to 
> introduce a read-write lock per cache file entry; we don't need a 
> coarse-grained lock shared by all entries.
>  
> This is a prerequisite to HADOOP-18291.
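
A rough sketch of the per-entry locking idea described above, with illustrative names (the real {{SingleFilePerBlockCache}} internals differ):
{code:java}
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Illustrative per-block cache entry guarded by its own read-write lock. */
class BlockCacheSketch {

  static final class Entry {
    final Path file;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    Entry(Path file) {
      this.file = file;
    }
  }

  private final Map<Integer, Entry> blocks = new ConcurrentHashMap<>();

  /** Readers take the entry's read lock, so many readers can proceed in parallel. */
  void readBlock(int blockNumber) {
    Entry entry = blocks.get(blockNumber);
    if (entry == null) {
      return;
    }
    entry.lock.readLock().lock();
    try {
      // read bytes from entry.file
    } finally {
      entry.lock.readLock().unlock();
    }
  }

  /** Eviction/close takes the write lock, so it never deletes a file mid-read. */
  void evictBlock(int blockNumber) {
    Entry entry = blocks.remove(blockNumber);
    if (entry == null) {
      return;
    }
    entry.lock.writeLock().lock();
    try {
      // delete entry.file
    } finally {
      entry.lock.writeLock().unlock();
    }
  }
}
{code}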



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long

2023-05-19 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060
 ] 

Viraj Jasani edited comment on HADOOP-18744 at 5/19/23 6:01 AM:


Came across a few more test failures while testing HADOOP-18740:
{code:java}
[ERROR] 
testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)
  Time elapsed: 2.763 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
 {code}
{code:java}
[ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk)  
Time elapsed: 2.329 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
{code:java}
[ERROR] 
testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer)  
Time elapsed: 1.937 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
 {code}
{code:java}
[ERROR] 
testDeleteNonExistingFileInDir(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI)
  Time elapsed: 1.809 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerCreateFile(S3AFileSystem.java:1891) 
{code}


was (Author: vjasani):
A couple more relevant failures:
{code:java}
[ERROR] 
testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)
  Time elapsed: 2.763 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
 {code}
{code:java}
[ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk)  
Time elapsed: 2.329 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
{code:java}
[ERROR] 
testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer)  
Time elapsed: 1.937 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
 

[jira] [Comment Edited] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long

2023-05-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060
 ] 

Viraj Jasani edited comment on HADOOP-18744 at 5/19/23 12:07 AM:
-

A couple more relevant failures:
{code:java}
[ERROR] 
testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)
  Time elapsed: 2.763 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
 {code}
{code:java}
[ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk)  
Time elapsed: 2.329 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
{code:java}
[ERROR] 
testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer)  
Time elapsed: 1.937 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
 {code}
{code:java}
[ERROR] 
testDeleteNonExistingFileInDir(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI)
  Time elapsed: 1.809 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.innerCreateFile(S3AFileSystem.java:1891) 
{code}


was (Author: vjasani):
A couple more relevant failures:
{code:java}
[ERROR] 
testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)
  Time elapsed: 2.763 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
 {code}
{code:java}
[ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk)  
Time elapsed: 2.329 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
{code:java}
[ERROR] 
testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer)  
Time elapsed: 1.937 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 

[jira] [Commented] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long

2023-05-18 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060
 ] 

Viraj Jasani commented on HADOOP-18744:
---

A couple more relevant failures:
{code:java}
[ERROR] 
testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)
  Time elapsed: 2.763 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235)
at 
org.apache.hadoop.fs.s3a.S3ABlockOutputStream.<init>(S3ABlockOutputStream.java:217)
 {code}
{code:java}
[ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk)  
Time elapsed: 2.329 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code}
{code:java}
[ERROR] 
testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer)  
Time elapsed: 1.937 s  <<< ERROR!
java.io.IOException: File name too long
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:2063)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)
at 
org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)
at 
org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)
 {code}

> ITestS3ABlockOutputArray failure with IO File name too long
> ---
>
> Key: HADOOP-18744
> URL: https://issues.apache.org/jira/browse/HADOOP-18744
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Ahmar Suhail
>Priority: Major
>
> On an EC2 instance, the following tests are failing:
>  
> {{ITestS3ABlockOutputArray.testDiskBlockCreate}}
> {{ITestS3ABlockOutputByteBuffer>ITestS3ABlockOutputArray.testDiskBlockCreate}}
> {{ITestS3ABlockOutputDisk>ITestS3ABlockOutputArray.testDiskBlockCreate}}
>  
> with the error IO File name too long. 
>  
> The tests create a file with a 1024 char file name and rely on 
> File.createTempFile() to truncate the file name to < OS limit. 
>  
> Stack trace:
> {{java.io.IOException: File name too long}}
> {{    at java.io.UnixFileSystem.createFileExclusively(Native Method)}}
> {{    at java.io.File.createTempFile(File.java:2063)}}
> {{    at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)}}
> {{    at 
> org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)}}
> {{    at 
> org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)}}
> {{    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)}}
> {{    at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
> {{    at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
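
One illustrative way such tests can stay under the OS limit is to clamp the prefix before calling File.createTempFile(). This is only a sketch of the idea, not the fix adopted in Hadoop, and the 255-character limit below is a common default rather than a universal one:
{code:java}
import java.io.File;
import java.io.IOException;

public class SafeTempFile {
  // Typical per-file-name limit on ext4/xfs; actual limits are filesystem dependent.
  private static final int MAX_NAME_LEN = 255;

  static File createTempFile(String prefix, String suffix, File dir) throws IOException {
    int suffixLen = suffix == null ? ".tmp".length() : suffix.length();
    // createTempFile appends random digits, so leave some headroom as well.
    int budget = MAX_NAME_LEN - suffixLen - 20;
    if (prefix.length() > budget) {
      prefix = prefix.substring(0, budget);
    }
    return File.createTempFile(prefix, suffix, dir);
  }

  public static void main(String[] args) throws IOException {
    StringBuilder longName = new StringBuilder();
    for (int i = 0; i < 1024; i++) {
      longName.append('a');
    }
    File f = createTempFile(longName.toString(), ".tmp",
        new File(System.getProperty("java.io.tmpdir")));
    System.out.println("created " + f);
    f.deleteOnExit();
  }
}
{code}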



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18652) Path.suffix raises NullPointerException

2023-05-13 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722430#comment-17722430
 ] 

Viraj Jasani commented on HADOOP-18652:
---

No worries at all, feel free to create a GitHub pull request at your 
convenience!

> Path.suffix raises NullPointerException
> ---
>
> Key: HADOOP-18652
> URL: https://issues.apache.org/jira/browse/HADOOP-18652
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Patrick Grandjean
>Assignee: Viraj Jasani
>Priority: Minor
>
> Calling the Path.suffix method on root raises a NullPointerException. Tested 
> with hadoop-client-api 3.3.2
> Scenario:
> {code:java}
> import org.apache.hadoop.fs.*
> Path root = new Path("/")
> root.getParent == null  // true
> root.suffix("bar")  // NPE is raised
> {code}
> Stack:
> {code:none}
> 23/03/03 15:13:18 ERROR Uncaught throwable from user code: 
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.Path.<init>(Path.java:104)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:93)
>     at org.apache.hadoop.fs.Path.suffix(Path.java:361)
> {code}
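
Until the NPE is fixed, a null-safe wrapper on the caller side can avoid it. The sketch below uses only the public Path API; treating root plus suffix as a direct child of root is just one possible interpretation of that corner case:
{code:java}
import org.apache.hadoop.fs.Path;

public class PathSuffixWorkaround {

  /** Append a suffix to the final path component, handling the root path. */
  static Path suffixSafely(Path path, String suffix) {
    Path parent = path.getParent();
    if (parent == null) {
      // path is the root ("/"); Path.suffix() would NPE here,
      // so build the child directly instead.
      return new Path(path, suffix);
    }
    return path.suffix(suffix);
  }

  public static void main(String[] args) {
    System.out.println(suffixSafely(new Path("/"), "bar"));         // /bar
    System.out.println(suffixSafely(new Path("/tmp/foo"), ".bak")); // /tmp/foo.bak
  }
}
{code}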



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18652) Path.suffix raises NullPointerException

2023-05-13 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18652:
-

Assignee: (was: Viraj Jasani)

> Path.suffix raises NullPointerException
> ---
>
> Key: HADOOP-18652
> URL: https://issues.apache.org/jira/browse/HADOOP-18652
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Patrick Grandjean
>Priority: Minor
>
> Calling the Path.suffix method on root raises a NullPointerException. Tested 
> with hadoop-client-api 3.3.2
> Scenario:
> {code:java}
> import org.apache.hadoop.fs.*
> Path root = new Path("/")
> root.getParent == null  // true
> root.suffix("bar")  // NPE is raised
> {code}
> Stack:
> {code:none}
> 23/03/03 15:13:18 ERROR Uncaught throwable from user code: 
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.Path.<init>(Path.java:104)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:93)
>     at org.apache.hadoop.fs.Path.suffix(Path.java:361)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-18652) Path.suffix raises NullPointerException

2023-05-12 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned HADOOP-18652:
-

Assignee: Viraj Jasani

> Path.suffix raises NullPointerException
> ---
>
> Key: HADOOP-18652
> URL: https://issues.apache.org/jira/browse/HADOOP-18652
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Patrick Grandjean
>Assignee: Viraj Jasani
>Priority: Minor
>
> Calling the Path.suffix method on root raises a NullPointerException. Tested 
> with hadoop-client-api 3.3.2
> Scenario:
> {code:java}
> import org.apache.hadoop.fs.*
> Path root = new Path("/")
> root.getParent == null  // true
> root.suffix("bar")  // NPE is raised
> {code}
> Stack:
> {code:none}
> 23/03/03 15:13:18 ERROR Uncaught throwable from user code: 
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.Path.<init>(Path.java:104)
>     at org.apache.hadoop.fs.Path.<init>(Path.java:93)
>     at org.apache.hadoop.fs.Path.suffix(Path.java:361)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-18291) SingleFilePerBlockCache does not have a limit

2023-05-11 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721927#comment-17721927
 ] 

Viraj Jasani commented on HADOOP-18291:
---

Created HADOOP-18740 for cache file access to go through read-write locks.

> SingleFilePerBlockCache does not have a limit
> -
>
> Key: HADOOP-18291
> URL: https://issues.apache.org/jira/browse/HADOOP-18291
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0
>Reporter: Ahmar Suhail
>Assignee: Viraj Jasani
>Priority: Major
>
> Currently there is no limit on the size of the disk cache. This means we could 
> end up with a large number of files on disk, especially for access patterns 
> that are very random and do not always read the block fully. 
>  
> e.g.:
> in.seek(5);
> in.read(); 
> in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read
> in.read();
> in.seek(2 * blockSize + 10) // block 1 gets saved to disk
> .. and so on
>  
> The in memory cache is bounded, and by default has a limit of 72MB (9 
> blocks). When a block is fully read and a seek is issued, it's released 
> [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109].
>  We can also delete the on disk file for the block here if it exists. 
>  
> Also maybe add an upper limit on disk space, and delete the file which stores 
> data of the block furthest from the current block (similar to the in memory 
> cache) when this limit is reached. 
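
A very rough sketch of the kind of bounded eviction described above, keeping at most N block files on disk and evicting the entry whose block number is furthest from the block currently being read (names and policy details are illustrative, not the actual implementation):
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative bounded disk cache: evicts the block furthest from the current one. */
class BoundedDiskCacheSketch {
  private final int maxBlocksOnDisk;
  private final Map<Integer, Path> blockFiles = new ConcurrentHashMap<>();

  BoundedDiskCacheSketch(int maxBlocksOnDisk) {
    this.maxBlocksOnDisk = maxBlocksOnDisk;
  }

  /** Register a newly cached block file, evicting one first if the cap is reached. */
  synchronized void put(int blockNumber, Path file, int currentBlock) throws IOException {
    if (blockFiles.size() >= maxBlocksOnDisk) {
      evictFurthestFrom(currentBlock);
    }
    blockFiles.put(blockNumber, file);
  }

  private void evictFurthestFrom(int currentBlock) throws IOException {
    Integer victim = blockFiles.keySet().stream()
        .max(Comparator.comparingInt((Integer b) -> Math.abs(b - currentBlock)))
        .orElse(null);
    if (victim != null) {
      // Remove the map entry and delete the backing file on disk.
      Files.deleteIfExists(blockFiles.remove(victim));
    }
  }
}
{code}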



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


