[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[ https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846806#comment-17846806 ] Viraj Jasani commented on HADOOP-19148: --- Build is fine, dependency tree looks good (except it has zookeeper-jute transitive version coming as 3.6.2 instead of 3.8.4), let me create PR to run the whole build with tests. {code:java} [INFO] +- org.apache.solr:solr-solrj:jar:8.11.3:compile [INFO] | +- com.fasterxml.woodstox:woodstox-core:jar:5.4.0:compile [INFO] | +- commons-io:commons-io:jar:2.14.0:compile [INFO] | +- commons-lang:commons-lang:jar:2.6:compile [INFO] | +- io.netty:netty-buffer:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-codec:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-common:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-handler:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-resolver:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport-native-epoll:jar:4.1.100.Final:compile [INFO] | +- io.netty:netty-transport-native-unix-common:jar:4.1.100.Final:compile [INFO] | +- org.apache.commons:commons-math3:jar:3.6.1:compile [INFO] | +- org.apache.httpcomponents:httpclient:jar:4.5.13:compile [INFO] | +- org.apache.httpcomponents:httpcore:jar:4.4.13:compile [INFO] | +- org.apache.httpcomponents:httpmime:jar:4.5.13:compile [INFO] | +- org.apache.zookeeper:zookeeper:jar:3.8.4:compile [INFO] | +- org.apache.zookeeper:zookeeper-jute:jar:3.6.2:compile [INFO] | +- org.codehaus.woodstox:stax2-api:jar:4.2.1:compile ... ... {code} > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 > --- > > Key: HADOOP-19148 > URL: https://issues.apache.org/jira/browse/HADOOP-19148 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Brahma Reddy Battula >Assignee: Viraj Jasani >Priority: Major > > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
[ https://issues.apache.org/jira/browse/HADOOP-19148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846458#comment-17846458 ] Viraj Jasani commented on HADOOP-19148: --- [~brahmareddy] Is anyone picking this up? If not, shall I create the PR? > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 > --- > > Key: HADOOP-19148 > URL: https://issues.apache.org/jira/browse/HADOOP-19148 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Brahma Reddy Battula >Priority: Major > > Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
[ https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19146: -- Component/s: test > noaa-cors-pds bucket access with global endpoint fails > -- > > Key: HADOOP-19146 > URL: https://issues.apache.org/jira/browse/HADOOP-19146 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > All tests accessing noaa-cors-pds use us-east-1 region, as configured at > bucket level. If global endpoint is configured (e.g. us-west-2), they fail to > access to bucket. > > Sample error: > {code:java} > org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect > response to region [us-east-1]. This likely indicates that the S3 region > configured in fs.s3a.endpoint.region does not match the AWS region containing > the bucket.: null (Service: S3, Status Code: 301, Request ID: > PMRWMQC9S91CNEJR, Extended Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) > at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) > at org.apache.hadoop.fs.Globber.glob(Globber.java:202) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) > {code} > {code:java} > Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null > (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended > Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) >
[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
Viraj Jasani created HADOOP-19146: - Summary: noaa-cors-pds bucket access with global endpoint fails Key: HADOOP-19146 URL: https://issues.apache.org/jira/browse/HADOOP-19146 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 3.4.0 Reporter: Viraj Jasani All tests accessing noaa-cors-pds use us-east-1 region, as configured at bucket level. If global endpoint is configured (e.g. us-west-2), they fail to access to bucket. Sample error: {code:java} org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect response to region [us-east-1]. This likely indicates that the S3 region configured in fs.s3a.endpoint.region does not match the AWS region containing the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) at org.apache.hadoop.fs.Globber.glob(Globber.java:202) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) {code} {code:java} Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 301, Request ID: 
PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93) at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279) ... ... ... at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
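A minimal sketch of one way to keep such tests working when the global region differs from the bucket's: S3A supports per-bucket overrides of the form fs.s3a.bucket.<bucket>.<option>, so the public noaa-cors-pds bucket can be pinned to us-east-1 while everything else stays on the globally configured region. This only illustrates the configuration mechanism, not the actual change in the fix; credential setup for the public bucket is elided.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NoaaCorsPdsRegionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Global region used by the rest of the test run.
    conf.set("fs.s3a.endpoint.region", "us-west-2");
    // Per-bucket override: S3A resolves fs.s3a.bucket.<bucket>.<option> ahead
    // of the global value, so requests to noaa-cors-pds go to us-east-1 and
    // avoid the 301 redirect shown above.
    conf.set("fs.s3a.bucket.noaa-cors-pds.endpoint.region", "us-east-1");

    try (FileSystem fs = FileSystem.get(new URI("s3a://noaa-cors-pds/"), conf)) {
      System.out.println(fs.getFileStatus(new Path("/")));
    }
  }
}
{code}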
[jira] [Assigned] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
[ https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19146: - Assignee: Viraj Jasani > noaa-cors-pds bucket access with global endpoint fails > -- > > Key: HADOOP-19146 > URL: https://issues.apache.org/jira/browse/HADOOP-19146 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > All tests accessing noaa-cors-pds use us-east-1 region, as configured at > bucket level. If global endpoint is configured (e.g. us-west-2), they fail to > access to bucket. > > Sample error: > {code:java} > org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect > response to region [us-east-1]. This likely indicates that the S3 region > configured in fs.s3a.endpoint.region does not match the AWS region containing > the bucket.: null (Service: S3, Status Code: 301, Request ID: > PMRWMQC9S91CNEJR, Extended Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) > at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) > at org.apache.hadoop.fs.Globber.glob(Globber.java:202) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) > {code} > {code:java} > Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null > (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended > Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) > at >
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17825912#comment-17825912 ] Viraj Jasani commented on HADOOP-19066: --- Addendum PR: [https://github.com/apache/hadoop/pull/6624] > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
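A hedged sketch of the client configuration this change is meant to permit: FIPS turned on while fs.s3a.endpoint is left unset (the central-endpoint case), letting cross-region access resolve the bucket's region. The bucket name is illustrative; only the two fs.s3a.endpoint* settings matter here.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FipsWithCentralEndpointSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Request FIPS endpoints from the SDK.
    conf.setBoolean("fs.s3a.endpoint.fips", true);
    // Leave the endpoint unset (central endpoint case); after this change S3A
    // should only fail fast when FIPS is combined with a non-central override.
    conf.unset("fs.s3a.endpoint");
    conf.unset("fs.s3a.endpoint.region");

    // Hypothetical bucket; cross-region access resolves its region.
    try (FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf)) {
      System.out.println(fs.getUri());
    }
  }
}
{code}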
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816513#comment-17816513 ] Viraj Jasani commented on HADOOP-18980: --- Addressed edge cases with addendum PR: [https://github.com/apache/hadoop/pull/6546] > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0, 3.5.0, 3.4.1 > > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
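A sketch of what the extensible remapping could look like from the user side, assuming the mapping property takes a comma-separated list of key=value pairs mapping old provider class names to their replacements. The property name below follows the linked PR but should be treated as an assumption, and the com.example classes are placeholders.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class CredentialProviderRemapSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Assumed property name; value maps source provider class -> replacement.
    conf.set("fs.s3a.aws.credentials.provider.mapping",
        "com.example.auth.OldProvider=com.example.auth.NewV2Provider,"
        + "com.amazonaws.auth.EnvironmentVariableCredentialsProvider="
        + "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider");
    // Existing provider lists naming the old classes would then be remapped
    // before the credential provider chain is instantiated.
    conf.set("fs.s3a.aws.credentials.provider", "com.example.auth.OldProvider");
  }
}
{code}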
[jira] [Updated] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19066: -- Status: Patch Available (was: In Progress) > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19072: - Assignee: Viraj Jasani > S3A: expand optimisations on stores with "fs.s3a.create.performance" > > > Key: HADOOP-19072 > URL: https://issues.apache.org/jira/browse/HADOOP-19072 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > > on an s3a store with fs.s3a.create.performance set, speed up other operations > * mkdir to skip parent directory check: just do a HEAD to see if there's a > file at the target location -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19072) S3A: expand optimisations on stores with "fs.s3a.create.performance"
[ https://issues.apache.org/jira/browse/HADOOP-19072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815822#comment-17815822 ] Viraj Jasani commented on HADOOP-19072: --- The improvement makes sense, as long as the downstream user knows where they are creating the dir. > S3A: expand optimisations on stores with "fs.s3a.create.performance" > > > Key: HADOOP-19072 > URL: https://issues.apache.org/jira/browse/HADOOP-19072 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Major > > on an s3a store with fs.s3a.create.performance set, speed up other operations > * mkdir to skip parent directory check: just do a HEAD to see if there's a > file at the target location -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
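For reference, a minimal sketch of how a caller would opt in to the optimised path; the bucket and paths are illustrative, and the mkdir behaviour noted in the comment reflects the proposal above rather than current semantics.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CreatePerformanceSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Opt in to the "performance" create path: existing overwrite/safety
    // checks are skipped, and under this proposal mkdir would also replace the
    // full parent directory scan with a single HEAD probe of the target.
    conf.setBoolean("fs.s3a.create.performance", true);

    try (FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf)) {
      // The caller is expected to know the target is not an existing file.
      fs.mkdirs(new Path("s3a://example-bucket/tables/t1/partition=2024-01-01"));
    }
  }
}
{code}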
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814576#comment-17814576 ] Viraj Jasani commented on HADOOP-19066: --- Indeed! hopefully some final stabilization work. > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814171#comment-17814171 ] Viraj Jasani commented on HADOOP-19066: --- Will run the whole suite with FIPS support + central endpoint. > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
[ https://issues.apache.org/jira/browse/HADOOP-19066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19066: - Assignee: Viraj Jasani > AWS SDK V2 - Enabling FIPS should be allowed with central endpoint > -- > > Key: HADOOP-19066 > URL: https://issues.apache.org/jira/browse/HADOOP-19066 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.5.0, 3.4.1 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK > considers overriding endpoint and enabling fips as mutually exclusive, we > fail fast if fs.s3a.endpoint is set with fips support (details on > HADOOP-18975). > Now, we no longer override SDK endpoint for central endpoint since we enable > cross region access (details on HADOOP-19044) but we would still fail fast if > endpoint is central and fips is enabled. > Changes proposed: > * S3A to fail fast only if FIPS is enabled and non-central endpoint is > configured. > * Tests to ensure S3 bucket is accessible with default region us-east-2 with > cross region access (expected with central endpoint). > * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19066) AWS SDK V2 - Enabling FIPS should be allowed with central endpoint
Viraj Jasani created HADOOP-19066: - Summary: AWS SDK V2 - Enabling FIPS should be allowed with central endpoint Key: HADOOP-19066 URL: https://issues.apache.org/jira/browse/HADOOP-19066 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.5.0, 3.4.1 Reporter: Viraj Jasani FIPS support can be enabled by setting "fs.s3a.endpoint.fips". Since the SDK considers overriding endpoint and enabling fips as mutually exclusive, we fail fast if fs.s3a.endpoint is set with fips support (details on HADOOP-18975). Now, we no longer override SDK endpoint for central endpoint since we enable cross region access (details on HADOOP-19044) but we would still fail fast if endpoint is central and fips is enabled. Changes proposed: * S3A to fail fast only if FIPS is enabled and non-central endpoint is configured. * Tests to ensure S3 bucket is accessible with default region us-east-2 with cross region access (expected with central endpoint). * Document FIPS support with central endpoint on connecting.html. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812142#comment-17812142 ] Viraj Jasani commented on HADOOP-19022: --- It's fine [~ste...@apache.org], i anyways need to make some changes for updating cross region logic, so i can take care of that and also fixing timeout value for the current test (only if required after your PR [https://github.com/apache/hadoop/pull/6470)] and then add some more coverage. Once your PR gets merged and cross region logic part is also done, i will re-run this with different endpoint/region settings and if needed, i will take care of ITestS3AConfiguration issues as part of this Jira, otherwise will close the Jira. > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809636#comment-17809636 ] Viraj Jasani commented on HADOOP-18975: --- {quote}you must have set a global endpoint, rather than one for your test bucket -correct? {quote} Exactly. > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809271#comment-17809271 ] Viraj Jasani edited comment on HADOOP-18975 at 1/22/24 7:33 AM: {code:java} fs.s3a.bucket.landsat-pds.endpoint.fips true Use the fips endpoint {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ? This is breaking several tests from full suite that i am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} [ERROR] Tests run: 1264, Failures: 4, Errors: 87, Skipped: 164 was (Author: vjasani): {code:java} fs.s3a.bucket.landsat-pds.endpoint.fips true Use the fips endpoint {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ? This is breaking several tests from full suite that i am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! 
java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (HADOOP-18975) AWS SDK v2: extend support for FIPS endpoints
[ https://issues.apache.org/jira/browse/HADOOP-18975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17809271#comment-17809271 ] Viraj Jasani commented on HADOOP-18975: --- {code:java} fs.s3a.bucket.landsat-pds.endpoint.fips true Use the fips endpoint {code} [~ste...@apache.org] [~ahmar] do we really need fips enabled for landsat in hadoop-tools/hadoop-aws/src/test/resources/core-site.xml ? This is breaking several tests from full suite that i am running against us-west-2 for PR [https://github.com/apache/hadoop/pull/6479] e.g. {code:java} [ERROR] testSelectOddRecordsIgnoreHeaderV1(org.apache.hadoop.fs.s3a.select.ITestS3Select) Time elapsed: 2.917 s <<< ERROR! java.lang.IllegalArgumentException: An endpoint cannot set when fs.s3a.endpoint.fips is true : https://s3-us-west-2.amazonaws.com at org.apache.hadoop.util.Preconditions.checkArgument(Preconditions.java:213) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureEndpointAndRegion(DefaultS3ClientFactory.java:292) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.configureClientBuilder(DefaultS3ClientFactory.java:179) at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:126) at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:1063) at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:677) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3601) at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:171) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3702) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3653) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:555) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:366) at org.apache.hadoop.fs.s3a.select.AbstractS3SelectTest.setup(AbstractS3SelectTest.java:304) at org.apache.hadoop.fs.s3a.select.ITestS3Select.setup(ITestS3Select.java:112) {code} > AWS SDK v2: extend support for FIPS endpoints > -- > > Key: HADOOP-18975 > URL: https://issues.apache.org/jira/browse/HADOOP-18975 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > > v1 SDK supported FIPS just by changing the endpoint. > Now we have a new builder setting to use. > * add new fs.s3a.endpoint.fips option > * pass it down > * test -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
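For context, a minimal reproduction of the conflict quoted above, assuming nothing beyond the two documented options; the bucket name is illustrative. With fs.s3a.endpoint.fips set, any explicit fs.s3a.endpoint override trips the precondition check during filesystem initialization.
{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FipsEndpointConflictSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setBoolean("fs.s3a.endpoint.fips", true);
    // Explicit regional endpoint, as picked up from core-site.xml in the
    // failing run; combined with FIPS this fails fast in the client factory.
    conf.set("fs.s3a.endpoint", "https://s3-us-west-2.amazonaws.com");

    try (FileSystem fs = FileSystem.get(new URI("s3a://example-bucket/"), conf)) {
      // Not reached: initialization throws IllegalArgumentException
      // "An endpoint cannot set when fs.s3a.endpoint.fips is true".
      System.out.println(fs.getUri());
    }
  }
}
{code}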
[jira] [Assigned] (HADOOP-19044) AWS SDK V2 - Update S3A region logic
[ https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-19044: - Assignee: Viraj Jasani > AWS SDK V2 - Update S3A region logic > - > > Key: HADOOP-19044 > URL: https://issues.apache.org/jira/browse/HADOOP-19044 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > > If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set > fs.s3a.endpoint to > s3.amazonaws.com here: > [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540] > > > HADOOP-18908, updated the region logic such that if fs.s3a.endpoint.region is > set, or if a region can be parsed from fs.s3a.endpoint (which will happen in > this case, region will be US_EAST_1), cross region access is not enabled. > This will cause 400 errors if the bucket is not in US_EAST_1. > > Proposed: Updated the logic so that if the endpoint is the global > s3.amazonaws.com , cross region access is enabled. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
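A sketch of the proposed decision only; the method and constant names below are illustrative and do not correspond to the actual DefaultS3ClientFactory code. The point is that the global s3.amazonaws.com endpoint (which Spark sets by default) should behave like "no endpoint configured" and fall back to cross-region access, rather than being parsed as US_EAST_1.
{code:java}
public class CentralEndpointRegionSketch {

  private static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  static boolean enableCrossRegionAccess(String endpoint, String region) {
    if (region != null && !region.isEmpty()) {
      // Explicit fs.s3a.endpoint.region wins; no cross-region access needed.
      return false;
    }
    if (endpoint == null || endpoint.isEmpty()) {
      return true;
    }
    // Proposed change: the global endpoint no longer pins the client to
    // US_EAST_1; let the SDK resolve the bucket region instead.
    return endpoint.endsWith(CENTRAL_ENDPOINT);
  }

  public static void main(String[] args) {
    System.out.println(enableCrossRegionAccess("s3.amazonaws.com", ""));          // true
    System.out.println(enableCrossRegionAccess("s3.eu-west-1.amazonaws.com", "")); // false
  }
}
{code}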
[jira] [Commented] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804132#comment-17804132 ] Viraj Jasani commented on HADOOP-19023: --- {quote} * make sure you've not got a site config with an aggressive timeout{quote} Can confirm that this is not the case. {quote} * do set version/component in the issue fields...it's not picked up from the parent{quote} Sure, will keep this in mind. While HADOOP-19022 has test failure that is consistent, this one testParallelRename is intermediate failure. It happened only when I ran the whole suite (-Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch), when the setup was connected to VPN. Running the test individually is not failing. Since testParallelRename is already aggressive, I think we might want to set higher connection timeout for the test. > S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > --- > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified timeout configuration: 
> 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at >
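A hedged sketch of the kind of test-scoped override suggested in the comment: widen fs.s3a.connection.request.timeout on the configuration used by the scale test so that parallel uploads over a slow link are not cut off at the 15 s API call timeout. The helper name and the 60s value are purely illustrative.
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ParallelRenameTimeoutSketch {
  // Illustrative helper: how the scale test could widen the request timeout
  // before creating its FileSystem; the exact value would need tuning.
  static Configuration withRelaxedTimeouts(Configuration base) {
    Configuration conf = new Configuration(base);
    // Accepts a duration with units; 15s is the enforced minimum, so anything
    // larger simply raises the SDK api-call timeout for this test only.
    conf.set("fs.s3a.connection.request.timeout", "60s");
    return conf;
  }
}
{code}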
[jira] [Commented] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17804129#comment-17804129 ] Viraj Jasani commented on HADOOP-19022: --- {quote}have you explicitly set it in your site config? {quote} Can confirm that it is not set explicitly, this test fails consistently because it takes 120 as 120 ms by default, and since it is less than 15 s, so 15s is selected: {code:java} apiCallTimeout = enforceMinimumDuration(REQUEST_TIMEOUT, apiCallTimeout, minimumOperationDuration); {code} Here, minimumOperationDuration is 15s. For this Jira, we can # Make the test use "120s" instead of "120" so that it will not set 15s by default. # Add a test with timeout value smaller than 15s and verify that actual timeout in S3A client config object is 15s. # Add a test by setting "0" as timeout and verify that SdkClientOption.API_CALL_ATTEMPT_TIMEOUT does not even get set. # Document "fs.s3a.connection.request.timeout" as having 15s default behavior if any client sets it with value > 0 and < 15s. WDYT? > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
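To illustrate point 1 above: with Configuration.getTimeDuration, a bare "120" is read in the caller's default unit (milliseconds here), while "120s" carries its own unit, which is why the current test value collapses to the enforced 15 s minimum. A small sketch using only the public Configuration API; the printed values are the expected ones.
{code:java}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.conf.Configuration;

public class RequestTimeoutUnitsSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration(false);

    conf.set("fs.s3a.connection.request.timeout", "120");
    // Bare number -> interpreted in the default unit (milliseconds), i.e. 120 ms,
    // which S3A then rounds up to the 15000 ms minimum network duration.
    System.out.println(conf.getTimeDuration(
        "fs.s3a.connection.request.timeout", 0, TimeUnit.MILLISECONDS)); // 120

    conf.set("fs.s3a.connection.request.timeout", "120s");
    // Explicit unit suffix -> 120 seconds = 120000 ms, comfortably above the
    // minimum, so the value the test asserts on survives unchanged.
    System.out.println(conf.getTimeDuration(
        "fs.s3a.connection.request.timeout", 0, TimeUnit.MILLISECONDS)); // 120000
  }
}
{code}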
[jira] [Updated] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19023: -- Component/s: test > ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > - > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified timeout configuration: > 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at >
[jira] [Updated] (HADOOP-19022) S3A : ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19022: -- Summary: S3A : ITestS3AConfiguration#testRequestTimeout failure (was: ITestS3AConfiguration#testRequestTimeout failure) > S3A : ITestS3AConfiguration#testRequestTimeout failure > -- > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-19023) S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-19023: -- Summary: S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure (was: ITestS3AConcurrentOps#testParallelRename intermittent timeout failure) > S3A : ITestS3AConcurrentOps#testParallelRename intermittent timeout failure > --- > > Key: HADOOP-19023 > URL: https://issues.apache.org/jira/browse/HADOOP-19023 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Priority: Major > > Need to configure higher timeout for the test. > > {code:java} > [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 256.281 s <<< FAILURE! - in > org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps > [ERROR] > testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) > Time elapsed: 72.565 s <<< ERROR! > org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on > fork-0005/test/testParallelRename-source0: > software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client > execution did not complete before the specified timeout configuration: 15000 > millis > at > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) > at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) > at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) > at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) > at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) > at > org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) > at > org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:750) > Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: > Client execution did not complete before the specified timeout configuration: > 15000 millis > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) > at > software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) > at > 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) > at > software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) > at > software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) > at >
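As a rough sketch of what the issue asks for (a higher timeout for this scale test), assuming the scale test base class exposes a createScaleConfiguration() hook and reusing the millisecond-valued property discussed in HADOOP-19022; the value below is illustrative:

{code:java}
// Hedged sketch only: give the parallel-rename scale test a larger
// per-request timeout so uploads do not trip the 15000 ms API call timeout.
@Override
protected Configuration createScaleConfiguration() {
  Configuration conf = super.createScaleConfiguration();
  conf.setLong("fs.s3a.connection.request.timeout", 60_000L); // 60s, in millis
  return conf;
}
{code}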
[jira] [Updated] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18980: -- Status: Patch Available (was: In Progress) > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18959) Use builder for prefetch CachingBlockManager
[ https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17803401#comment-17803401 ] Viraj Jasani commented on HADOOP-18959: --- [~slfan1989] this is already committed to trunk, only backport PR is pending for merge. > Use builder for prefetch CachingBlockManager > > > Key: HADOOP-18959 > URL: https://issues.apache.org/jira/browse/HADOOP-18959 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) > have added more params for prefetch CachingBlockManager c'tor to process > read/write block requests. They have added too many params and more are > likely to be introduced later. We should use builder pattern to pass params. > This would also help consolidating required prefetch params into one single > place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
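A hedged sketch of the builder-style parameter object the issue proposes; the class and method names below are illustrative and not necessarily the API that was merged:

{code:java}
// Illustrative only: collect the prefetch/caching parameters in one builder
// instead of a long constructor argument list.
BlockManagerParameters params = new BlockManagerParameters()
    .withFuturePool(futurePool)
    .withBlockData(blockData)
    .withBufferPoolSize(bufferPoolSize)
    .withConf(conf)
    .withPrefetchingStatistics(streamStatistics);
blockManager = createBlockManager(params, reader); // assumed factory hook
{code}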
[jira] [Created] (HADOOP-19023) ITestS3AConcurrentOps#testParallelRename intermittent timeout failure
Viraj Jasani created HADOOP-19023: - Summary: ITestS3AConcurrentOps#testParallelRename intermittent timeout failure Key: HADOOP-19023 URL: https://issues.apache.org/jira/browse/HADOOP-19023 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Need to configure higher timeout for the test. {code:java} [ERROR] Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 256.281 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps [ERROR] testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) Time elapsed: 72.565 s <<< ERROR! org.apache.hadoop.fs.s3a.AWSApiCallTimeoutException: Writing Object on fork-0005/test/testParallelRename-source0: software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution did not complete before the specified timeout configuration: 15000 millis at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215) at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) at org.apache.hadoop.fs.s3a.WriteOperationHelper.retry(WriteOperationHelper.java:214) at org.apache.hadoop.fs.s3a.WriteOperationHelper.putObject(WriteOperationHelper.java:532) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.lambda$putObject$0(S3ABlockOutputStream.java:620) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:225) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) Caused by: software.amazon.awssdk.core.exception.ApiCallTimeoutException: Client execution did not complete before the specified timeout configuration: 15000 millis at software.amazon.awssdk.core.exception.ApiCallTimeoutException$BuilderImpl.build(ApiCallTimeoutException.java:97) at software.amazon.awssdk.core.exception.ApiCallTimeoutException.create(ApiCallTimeoutException.java:38) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.generateApiCallTimeoutException(ApiCallTimeoutTrackingStage.java:151) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.handleInterruptedException(ApiCallTimeoutTrackingStage.java:139) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.translatePipelineException(ApiCallTimeoutTrackingStage.java:107) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) at 
software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:224) at
[jira] [Commented] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure
[ https://issues.apache.org/jira/browse/HADOOP-19022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17802395#comment-17802395 ] Viraj Jasani commented on HADOOP-19022: --- It's small test, but perhaps good to cover both cases: more than 15s and less than 15s timeouts. > ITestS3AConfiguration#testRequestTimeout failure > > > Key: HADOOP-19022 > URL: https://issues.apache.org/jira/browse/HADOOP-19022 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Priority: Minor > > "fs.s3a.connection.request.timeout" should be specified in milliseconds as per > {code:java} > Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, > DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); > {code} > The test fails consistently because it sets 120 ms timeout which is less than > 15s (min network operation duration), and hence gets reset to 15000 ms based > on the enforcement. > > {code:java} > [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) > Time elapsed: 0.016 s <<< FAILURE! > java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is > different than what AWS sdk configuration uses internally expected:<12> > but was:<15000> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at > org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19022) ITestS3AConfiguration#testRequestTimeout failure
Viraj Jasani created HADOOP-19022: - Summary: ITestS3AConfiguration#testRequestTimeout failure Key: HADOOP-19022 URL: https://issues.apache.org/jira/browse/HADOOP-19022 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani "fs.s3a.connection.request.timeout" should be specified in milliseconds as per {code:java} Duration apiCallTimeout = getDuration(conf, REQUEST_TIMEOUT, DEFAULT_REQUEST_TIMEOUT_DURATION, TimeUnit.MILLISECONDS, Duration.ZERO); {code} The test fails consistently because it sets 120 ms timeout which is less than 15s (min network operation duration), and hence gets reset to 15000 ms based on the enforcement. {code:java} [ERROR] testRequestTimeout(org.apache.hadoop.fs.s3a.ITestS3AConfiguration) Time elapsed: 0.016 s <<< FAILURE! java.lang.AssertionError: Configured fs.s3a.connection.request.timeout is different than what AWS sdk configuration uses internally expected:<12> but was:<15000> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.apache.hadoop.fs.s3a.ITestS3AConfiguration.testRequestTimeout(ITestS3AConfiguration.java:444) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790816#comment-17790816 ] Viraj Jasani commented on HADOOP-18991: --- As per HADOOP-16542, if we remove this, the Hive build fails. Hive could explicitly use commons-beanutils directly? FYI [~weichiu] > Remove commons-beanutils dependency from Hadoop 3 > - > > Key: HADOOP-18991 > URL: https://issues.apache.org/jira/browse/HADOOP-18991 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Istvan Toth >Priority: Major > > Hadoop doesn't actually use it, and it pollutes the classpath of dependent > projects. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18991) Remove commons-beanutils dependency from Hadoop 3
[ https://issues.apache.org/jira/browse/HADOOP-18991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790788#comment-17790788 ] Viraj Jasani commented on HADOOP-18991: --- [~stoty] is this why we still need to manage it in Phoenix even after excluding it from Omid? > Remove commons-beanutils dependency from Hadoop 3 > - > > Key: HADOOP-18991 > URL: https://issues.apache.org/jira/browse/HADOOP-18991 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Reporter: Istvan Toth >Priority: Major > > Hadoop doesn't actually use it, and it pollutes the classpath of dependent > projects. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788128#comment-17788128 ] Viraj Jasani commented on HADOOP-18980: --- {quote}exactly; though i'd expect the remapping to be from com.amazonaws to software.amazonaws or private implementations key goal: you can use the same credentials.provider list for v1 and v2 sdk clients. {quote} In addition to having same credentials.provider list for v1 and v2 sdk, maybe we can also remove static mapping for v1 to v2 credential providers and let new config have default key value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this being default value, any new third-party credential provider can be added to this list by users. Does that sound good? > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788128#comment-17788128 ] Viraj Jasani edited comment on HADOOP-18980 at 11/20/23 6:44 PM: - In addition to having same credentials.provider list for v1 and v2 sdk, maybe we can also remove static mapping for v1 to v2 credential providers and let new config have default key value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this being default value, any new third-party credential provider can be added to this list by users. Does that sound good? was (Author: vjasani): {quote}exactly; though i'd expect the remapping to be from com.amazonaws to software.amazonaws or private implementations key goal: you can use the same credentials.provider list for v1 and v2 sdk clients. {quote} In addition to having same credentials.provider list for v1 and v2 sdk, maybe we can also remove static mapping for v1 to v2 credential providers and let new config have default key value pairs: {code:java} fs.s3a.aws.credentials.provider.mapping com.amazonaws.auth.AnonymousAWSCredentials=org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider, com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.InstanceProfileCredentialsProvider=org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider, com.amazonaws.auth.EnvironmentVariableCredentialsProvider=software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider, com.amazonaws.auth.profile.ProfileCredentialsProvider=software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider {code} With this being default value, any new third-party credential provider can be added to this list by users. Does that sound good? > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18980) S3A credential provider remapping: make extensible
[ https://issues.apache.org/jira/browse/HADOOP-18980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787854#comment-17787854 ] Viraj Jasani commented on HADOOP-18980: --- Something like this maybe? {code:java} fs.s3a.aws.credentials.provider.mapping com.amazon.xyz.auth.provider.key1=org.apache.hadoop.fs.s3a.CustomCredsProvider1, com.amazon.xyz.auth.provider.key2=org.apache.hadoop.fs.s3a.CustomCredsProvider2, com.amazon.xyz.auth.provider.key3=org.apache.hadoop.fs.s3a.CustomCredsProvider3 fs.s3a.aws.credentials.provider com.amazon.xyz.auth.provider.key1, com.amazon.xyz.auth.provider.key2 {code} > S3A credential provider remapping: make extensible > -- > > Key: HADOOP-18980 > URL: https://issues.apache.org/jira/browse/HADOOP-18980 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Priority: Minor > > s3afs will now remap the common com.amazonaws credential providers to > equivalents in the v2 sdk or in hadoop-aws > We could do the same for third party credential providers by taking a > key=value list in a configuration property and adding to the map. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
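A minimal sketch of how such a key=value list could be folded into the remapping table; the property name matches the proposal above, while the parsing code itself is illustrative:

{code:java}
// Illustrative parsing of the proposed mapping property into a lookup table.
Map<String, String> remap = new HashMap<>();
String mapping = conf.getTrimmed("fs.s3a.aws.credentials.provider.mapping", "");
for (String pair : mapping.split(",")) {
  String[] kv = pair.trim().split("=", 2);
  if (kv.length == 2 && !kv[0].trim().isEmpty()) {
    remap.put(kv[0].trim(), kv[1].trim());
  }
}
{code}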
[jira] [Assigned] (HADOOP-18959) Use builder for prefetch CachingBlockManager
[ https://issues.apache.org/jira/browse/HADOOP-18959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18959: - Assignee: Viraj Jasani > Use builder for prefetch CachingBlockManager > > > Key: HADOOP-18959 > URL: https://issues.apache.org/jira/browse/HADOOP-18959 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) > have added more params for prefetch CachingBlockManager c'tor to process > read/write block requests. They have added too many params and more are > likely to be introduced later. We should use builder pattern to pass params. > This would also help consolidating required prefetch params into one single > place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18959) Use builder for prefetch CachingBlockManager
Viraj Jasani created HADOOP-18959: - Summary: Use builder for prefetch CachingBlockManager Key: HADOOP-18959 URL: https://issues.apache.org/jira/browse/HADOOP-18959 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Some of the recent changes (HADOOP-18399, HADOOP-18291, HADOOP-18829 etc) have added more params for prefetch CachingBlockManager c'tor to process read/write block requests. They have added too many params and more are likely to be introduced later. We should use builder pattern to pass params. This would also help consolidating required prefetch params into one single place within S3ACachingInputStream, from scattered locations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed Status: Resolved (was: Patch Available) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18952) FsCommand Stat class sets the timeZone "UTC", which is different from the machine's timeZone
[ https://issues.apache.org/jira/browse/HADOOP-18952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17780014#comment-17780014 ] Viraj Jasani commented on HADOOP-18952: --- This has been the case since the beginning: Stat: {code:java} protected final SimpleDateFormat timeFmt; { timeFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); timeFmt.setTimeZone(TimeZone.getTimeZone("UTC")); }{code} Ls: {code:java} protected final SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd HH:mm"); {code} > FsCommand Stat class sets the timeZone "UTC", which is different from the > machine's timeZone > -- > > Key: HADOOP-18952 > URL: https://issues.apache.org/jira/browse/HADOOP-18952 > Project: Hadoop Common > Issue Type: Bug > Environment: Using Hadoop 3.3.4-release >Reporter: liang yu >Priority: Major > Attachments: image-2023-10-26-10-07-11-637.png > > > Using Hadoop version 3.3.4 > > When executing the Ls command and the Stat command on the same Hadoop file, I get two > timestamps. > > {code:java} > hdfs dfs -stat "modify_time %y, access_time %x" /path/to/file{code} > returns: > modify_time 2023-10-17 01:43:05, access_time 2023-10-17 01:41:00 > > {code:java} > hdfs dfs -ls /path/to/file{code} > returns: > -rw-rw-r--+ 3 user_name user_group 247400339 > 2023-10-17 09:43 /path/to/file > > These two timestamps differ by 8 hours. > I am in China, the timezone is “UTC+8”, so the timestamp from the LS command is > correct and the timestamp from the STAT command is wrong. > > !image-2023-10-26-10-07-11-637.png! > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
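A self-contained JDK-only demonstration of the mismatch described above (not taken from the Hadoop code): the same modification time renders 8 hours apart when formatted in UTC, as stat does, versus UTC+8, as ls effectively does on that machine:

{code:java}
// Demonstration only: one instant, two renderings.
SimpleDateFormat utcFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
utcFmt.setTimeZone(TimeZone.getTimeZone("UTC"));
SimpleDateFormat localFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
localFmt.setTimeZone(TimeZone.getTimeZone("Asia/Shanghai"));
Date mtime = new Date();
System.out.println("stat-style (UTC):   " + utcFmt.format(mtime));
System.out.println("ls-style (machine): " + localFmt.format(mtime));
{code}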
[jira] [Resolved] (HADOOP-18829) s3a prefetch LRU cache eviction metric
[ https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani resolved HADOOP-18829. --- Fix Version/s: 3.4.0 3.3.9 Hadoop Flags: Reviewed Resolution: Fixed > s3a prefetch LRU cache eviction metric > -- > > Key: HADOOP-18829 > URL: https://issues.apache.org/jira/browse/HADOOP-18829 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > Follow-up from HADOOP-18291: > Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
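A hedged sketch of how the new eviction counter could be read back through the stream's IOStatistics; the counter key below is an assumption made for illustration, not confirmed by this thread:

{code:java}
// Hedged sketch: query the prefetch eviction counter from a stream that
// implements IOStatisticsSource. The statistic name here is assumed.
IOStatistics stats = ((IOStatisticsSource) in).getIOStatistics();
Long evicted = stats.counters().get("stream_evict_blocks_from_cache");
LOG.info("prefetch blocks evicted from LRU cache: {}", evicted);
{code}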
[jira] [Comment Edited] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776356#comment-17776356 ] Viraj Jasani edited comment on HADOOP-18931 at 10/17/23 7:16 PM: - sounds good, it makes sense to log for all fs invocation by keeping the log separate from the heavy service load. was (Author: vjasani): sounds good, it makes sense to log for all fs invocation > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776356#comment-17776356 ] Viraj Jasani commented on HADOOP-18931: --- sounds good, it makes sense to log for all fs invocation > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Status: Patch Available (was: In Progress) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18850: -- Status: Patch Available (was: In Progress) > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18931) FileSystem.getFileSystemClass() to log at debug the jar the .class came from
[ https://issues.apache.org/jira/browse/HADOOP-18931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17775281#comment-17775281 ] Viraj Jasani commented on HADOOP-18931: --- i thought we were already logging it during the first time init of fs for the given JVM {code:java} try { SERVICE_FILE_SYSTEMS.put(fs.getScheme(), fs.getClass()); if (LOGGER.isDebugEnabled()) { LOGGER.debug("{}:// = {} from {}", fs.getScheme(), fs.getClass(), ClassUtil.findContainingJar(fs.getClass())); } } catch (Exception e) { LOGGER.warn("Cannot load: {} from {}", fs, ClassUtil.findContainingJar(fs.getClass())); LOGGER.info("Full exception loading: {}", fs, e); } {code} maybe you are suggesting that we should log it for every call to {_}getFileSystemClass(){_}, correct? > FileSystem.getFileSystemClass() to log at debug the jar the .class came from > > > Key: HADOOP-18931 > URL: https://issues.apache.org/jira/browse/HADOOP-18931 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Minor > > we want to be able to log the jar the filesystem implementation class, so > that we can identify which version of a module the class came from. > this is to help track down problems where different machines in the cluster > or the .tar.gz bundle is out of date. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
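For comparison, a sketch of the per-call variant being discussed, built on the existing ClassUtil.findContainingJar() helper; this is illustrative, not the committed change:

{code:java}
// Hedged sketch: log the source jar on every lookup, not just when the
// scheme is first loaded from the service files.
Class<? extends FileSystem> clazz = FileSystem.getFileSystemClass(scheme, conf);
LOGGER.debug("Filesystem class for {}:// is {} from {}",
    scheme, clazz, ClassUtil.findContainingJar(clazz));
{code}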
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE/DSSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Summary: ITestS3GuardTool fails if SSE/DSSE encryption is used (was: ITestS3GuardTool fails if SSE encryption is used) > ITestS3GuardTool fails if SSE/DSSE encryption is used > - > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18918: -- Priority: Minor (was: Major) > ITestS3GuardTool fails if SSE encryption is used > > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
[ https://issues.apache.org/jira/browse/HADOOP-18918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18918: - Assignee: Viraj Jasani > ITestS3GuardTool fails if SSE encryption is used > > > Key: HADOOP-18918 > URL: https://issues.apache.org/jira/browse/HADOOP-18918 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.6 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > {code:java} > [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool > [ERROR] > testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) > Time elapsed: 0.807 s <<< ERROR! > 46: Bucket s3a://landsat-pds: required encryption is none but actual > encryption is DSSE-KMS > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) > at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) > at > org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) > at > org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) > at > org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:750) > {code} > Since landsat requires none encryption, the test should be skipped for any > encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18918) ITestS3GuardTool fails if SSE encryption is used
Viraj Jasani created HADOOP-18918: - Summary: ITestS3GuardTool fails if SSE encryption is used Key: HADOOP-18918 URL: https://issues.apache.org/jira/browse/HADOOP-18918 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3, test Affects Versions: 3.3.6 Reporter: Viraj Jasani {code:java} [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 25.989 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool [ERROR] testLandsatBucketRequireUnencrypted(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool) Time elapsed: 0.807 s <<< ERROR! 46: Bucket s3a://landsat-pds: required encryption is none but actual encryption is DSSE-KMS at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.exitException(S3GuardTool.java:915) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.badState(S3GuardTool.java:881) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo.run(S3GuardTool.java:511) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:283) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82) at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool.run(S3GuardTool.java:963) at org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand(S3GuardToolTestHelper.java:147) at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.run(AbstractS3GuardToolTestBase.java:114) at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardTool.testLandsatBucketRequireUnencrypted(ITestS3GuardTool.java:74) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750) {code} Since landsat requires none encryption, the test should be skipped for any encryption algorithm. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
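A minimal sketch of the proposed skip, assuming the JUnit 4 Assume API and an assumed accessor for the effective test configuration; the committed fix may differ:

{code:java}
// Hedged sketch: bypass the landsat assertions when the test bucket is
// configured with any SSE/DSSE algorithm.
String algorithm = getConfiguration()
    .getTrimmed("fs.s3a.encryption.algorithm", ""); // accessor is assumed
Assume.assumeTrue("Test requires an unencrypted bucket", algorithm.isEmpty());
{code}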
[jira] [Assigned] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18850: - Assignee: Viraj Jasani > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Assignee: Viraj Jasani >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18915) HTTP timeouts are not set correctly
[ https://issues.apache.org/jira/browse/HADOOP-18915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770745#comment-17770745 ] Viraj Jasani commented on HADOOP-18915: --- Nice find! > HTTP timeouts are not set correctly > --- > > Key: HADOOP-18915 > URL: https://issues.apache.org/jira/browse/HADOOP-18915 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Priority: Major > > In the client config builders, when [setting > timeouts|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/AWSClientConfig.java#L120], > it uses Duration.ofSeconds(), configs all use milliseconds so this needs to > be updated to Duration.ofMillis(). > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
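To make the unit mix-up concrete, an illustrative JDK-only snippet (not the patch itself):

{code:java}
// Feeding a millisecond-valued config into Duration.ofSeconds() inflates
// the timeout by 1000x; Duration.ofMillis() preserves the intent.
long configuredTimeout = 15_000L;                        // millis from the config
Duration wrong = Duration.ofSeconds(configuredTimeout);  // 15000 seconds
Duration right = Duration.ofMillis(configuredTimeout);   // 15 seconds
{code}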
[jira] [Assigned] (HADOOP-18208) Remove all the log4j reference in modules other than hadoop-logging
[ https://issues.apache.org/jira/browse/HADOOP-18208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18208: - Assignee: (was: Viraj Jasani) > Remove all the log4j reference in modules other than hadoop-logging > --- > > Key: HADOOP-18208 > URL: https://issues.apache.org/jira/browse/HADOOP-18208 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-16206) Migrate from Log4j1 to Log4j2
[ https://issues.apache.org/jira/browse/HADOOP-16206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-16206: - Assignee: (was: Viraj Jasani) > Migrate from Log4j1 to Log4j2 > - > > Key: HADOOP-16206 > URL: https://issues.apache.org/jira/browse/HADOOP-16206 > Project: Hadoop Common > Issue Type: Task >Affects Versions: 3.3.0 >Reporter: Akira Ajisaka >Priority: Major > Attachments: HADOOP-16206-wip.001.patch > > > This sub-task is to remove log4j1 dependency and add log4j2 dependency. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18207) Introduce hadoop-logging module
[ https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18207: - Assignee: (was: Viraj Jasani) > Introduce hadoop-logging module > --- > > Key: HADOOP-18207 > URL: https://issues.apache.org/jira/browse/HADOOP-18207 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are several goals here: > # Provide the ability to change log level, get log level, etc. > # Place all the appender implementation(?) > # Hide the real logging implementation. > # Later we could remove all the log4j references in other hadoop module. > # Move as much log4j usage to the module as possible. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-15984) Update jersey from 1.19 to 2.x
[ https://issues.apache.org/jira/browse/HADOOP-15984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-15984: - Assignee: (was: Viraj Jasani) > Update jersey from 1.19 to 2.x > -- > > Key: HADOOP-15984 > URL: https://issues.apache.org/jira/browse/HADOOP-15984 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > jersey-json 1.19 depends on Jackson 1.9.2. Let's upgrade. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755406#comment-17755406 ] Viraj Jasani commented on HADOOP-18850: --- [~ste...@apache.org] are you in favor of this before v2 sdk upgrade? > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755394#comment-17755394 ] Viraj Jasani commented on HADOOP-18850: --- only recently HADOOP-18832 bumped sdk bundle to 1.12.499, so looks like we can support this > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755392#comment-17755392 ] Viraj Jasani edited comment on HADOOP-18850 at 8/17/23 7:13 AM: it seems SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488] {code:java} public enum SSEAlgorithm { AES256("AES256"), KMS("aws:kms"), DSSE("aws:kms:dsse"), ;{code} was (Author: vjasani): SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488] {code:java} public enum SSEAlgorithm { AES256("AES256"), KMS("aws:kms"), DSSE("aws:kms:dsse"), ;{code} > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18850) Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS)
[ https://issues.apache.org/jira/browse/HADOOP-18850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755392#comment-17755392 ] Viraj Jasani commented on HADOOP-18850: --- SSEAlgorithm added DSSE as part of 1.12.488 release: [https://github.com/aws/aws-sdk-java/releases/tag/1.12.488] {code:java} public enum SSEAlgorithm { AES256("AES256"), KMS("aws:kms"), DSSE("aws:kms:dsse"), ;{code} > Enable dual-layer server-side encryption with AWS KMS keys (DSSE-KMS) > - > > Key: HADOOP-18850 > URL: https://issues.apache.org/jira/browse/HADOOP-18850 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, security >Reporter: Akira Ajisaka >Priority: Major > > Add support for DSSE-KMS > https://docs.aws.amazon.com/AmazonS3/latest/userguide/specifying-dsse-encryption.html -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
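For illustration, a small sketch of how that enum value could be used to request DSSE-KMS on an upload with the v1 SDK; this is not S3A code, the bucket and key names are made up, and aws-java-sdk 1.12.488+ is assumed on the classpath:
{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.SSEAlgorithm;

import java.io.ByteArrayInputStream;

public class DsseKmsPutExample {
  public static void main(String[] args) {
    // Hypothetical bucket/object names, purely for illustration.
    String bucket = "example-bucket";
    String key = "example-object";

    byte[] data = "hello".getBytes();
    ObjectMetadata meta = new ObjectMetadata();
    meta.setContentLength(data.length);
    // Request dual-layer server-side encryption: "aws:kms:dsse"
    meta.setSSEAlgorithm(SSEAlgorithm.DSSE.getAlgorithm());

    AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
    s3.putObject(new PutObjectRequest(bucket, key, new ByteArrayInputStream(data), meta));
  }
}
{code}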
[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO
[ https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755385#comment-17755385 ] Viraj Jasani commented on HADOOP-18852: --- {quote}for other reads, we may want a bigger prefech count than 1, depending on: split start/end, file read policy (random, sequential, whole-file) {quote} this means we first need prefetch read policy (HADOOP-18791), correct? > S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look > like random IO > -- > > Key: HADOOP-18852 > URL: https://issues.apache.org/jira/browse/HADOOP-18852 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with > separately. > # all seeks are lazy; no fetching is kicked off after an open > # the first read is treated as an out of order read, so cancels any active > reads (don't think there are any) and then only asks for 1 block > {code} > if (outOfOrderRead) { > LOG.debug("lazy-seek({})", getOffsetStr(readPos)); > blockManager.cancelPrefetches(); > // We prefetch only 1 block immediately after a seek operation. > prefetchCount = 1; > } > {code} > * for any read fully we should prefetch all blocks in the range requested > * for other reads, we may want a bigger prefech count than 1, depending on: > split start/end, file read policy (random, sequential, whole-file) > * also, if a read is in a block other than the current one, but which is > already being fetched or cached, is this really an OOO read to the extent > that outstanding fetches should be cancelled? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
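A purely illustrative sketch of the dependency being discussed, i.e. choosing the post-seek prefetch count from a read policy rather than hard-coding 1; the enum and helper below are hypothetical, not the S3A implementation:
{code:java}
enum ReadPolicy { RANDOM, SEQUENTIAL, WHOLE_FILE, ADAPTIVE }

final class PrefetchPlanner {
  /**
   * @param policy       read policy the stream was opened with
   * @param defaultCount configured prefetch block count
   * @return how many blocks to ask the block manager for after a seek
   */
  static int prefetchCountAfterSeek(ReadPolicy policy, int defaultCount) {
    switch (policy) {
      case RANDOM:
        return 1;                 // keep today's behaviour for random IO
      case SEQUENTIAL:
      case WHOLE_FILE:
        return defaultCount;      // keep the pipeline full for streaming reads
      case ADAPTIVE:
      default:
        return Math.max(1, defaultCount / 2);  // hedge between the two
    }
  }
}
{code}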
[jira] [Commented] (HADOOP-18852) S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look like random IO
[ https://issues.apache.org/jira/browse/HADOOP-18852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17755384#comment-17755384 ] Viraj Jasani commented on HADOOP-18852: --- {quote}also, if a read is in a block other than the current one, but which is already being fetched or cached, is this really an OOO read to the extent that outstanding fetches should be cancelled? {quote} +1 to this, now that i checked some logs, can see lazy-seek for every first seek + read on the given block: {code:java} DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(0:0) DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(4:40960) DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(3:30720) DEBUG prefetch.S3ACachingInputStream (S3ACachingInputStream.java:ensureCurrentBuffer(141)) - lazy-seek(2:20480){code} but it's also valid that if the block was being cached, why cancel the outstanding fetches. > S3ACachingInputStream.ensureCurrentBuffer(): lazy seek means all reads look > like random IO > -- > > Key: HADOOP-18852 > URL: https://issues.apache.org/jira/browse/HADOOP-18852 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > noticed in HADOOP-18184, but I think it's a big enough issue to be dealt with > separately. > # all seeks are lazy; no fetching is kicked off after an open > # the first read is treated as an out of order read, so cancels any active > reads (don't think there are any) and then only asks for 1 block > {code} > if (outOfOrderRead) { > LOG.debug("lazy-seek({})", getOffsetStr(readPos)); > blockManager.cancelPrefetches(); > // We prefetch only 1 block immediately after a seek operation. > prefetchCount = 1; > } > {code} > * for any read fully we should prefetch all blocks in the range requested > * for other reads, we may want a bigger prefech count than 1, depending on: > split start/end, file read policy (random, sequential, whole-file) > * also, if a read is in a block other than the current one, but which is > already being fetched or cached, is this really an OOO read to the extent > that outstanding fetches should be cancelled? -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
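A self-contained sketch of the idea above, not the actual CachingBlockManager logic: only cancel in-flight prefetches when the target block is genuinely cold. All types below are hypothetical stand-ins:
{code:java}
enum BlockState { COLD, PREFETCHING, CACHED }

interface BlockStateLookup {
  BlockState stateOf(int blockNumber);
}

final class SeekHandler {
  static boolean shouldCancelPrefetches(BlockStateLookup blocks, int targetBlock) {
    BlockState state = blocks.stateOf(targetBlock);
    // A seek into a block that is already cached or already being fetched is
    // not really "out of order"; keep the outstanding fetches running.
    return state == BlockState.COLD;
  }
}
{code}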
[jira] [Commented] (HADOOP-18829) s3a prefetch LRU cache eviction metric
[ https://issues.apache.org/jira/browse/HADOOP-18829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17750035#comment-17750035 ] Viraj Jasani commented on HADOOP-18829: --- sure thing, i think this can wait for sure. thanks > s3a prefetch LRU cache eviction metric > -- > > Key: HADOOP-18829 > URL: https://issues.apache.org/jira/browse/HADOOP-18829 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Follow-up from HADOOP-18291: > Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748981#comment-17748981 ] Viraj Jasani commented on HADOOP-18832: --- ITestS3AFileContextStatistics#testStatistics is flaky: {code:java} [ERROR] Tests run: 3, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 3.983 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics [ERROR] testStatistics(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextStatistics) Time elapsed: 1.776 s <<< FAILURE! java.lang.AssertionError: expected:<512> but was:<448> at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.failNotEquals(Assert.java:835) at org.junit.Assert.assertEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:633) at org.apache.hadoop.fs.FCStatisticsBaseTest.testStatistics(FCStatisticsBaseTest.java:108) {code} This only happened once, now unable to reproduce it locally. > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence > showing up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748980#comment-17748980 ] Viraj Jasani commented on HADOOP-18832: --- Testing in progress: Test results look good with -scale and -prefetch so far. Now running some encryption tests (bucket with algo: SSE-KMS). > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence > showing up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18832: -- Description: aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ (was: aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-adk:1.12.499+) > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence > showing up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-sdk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
[ https://issues.apache.org/jira/browse/HADOOP-18832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18832: - Assignee: Viraj Jasani > Upgrade aws-java-sdk to 1.12.499+ > - > > Key: HADOOP-18832 > URL: https://issues.apache.org/jira/browse/HADOOP-18832 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence > showing up in security CVE scans (CVE-2023-34462). The safe version for netty > is 4.1.94.Final and this is used by aws-java-adk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18832) Upgrade aws-java-sdk to 1.12.499+
Viraj Jasani created HADOOP-18832: - Summary: Upgrade aws-java-sdk to 1.12.499+ Key: HADOOP-18832 URL: https://issues.apache.org/jira/browse/HADOOP-18832 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Reporter: Viraj Jasani aws sdk versions < 1.12.499 uses a vulnerable version of netty and hence showing up in security CVE scans (CVE-2023-34462). The safe version for netty is 4.1.94.Final and this is used by aws-java-adk:1.12.499+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18829) s3a prefetch LRU cache eviction metric
Viraj Jasani created HADOOP-18829: - Summary: s3a prefetch LRU cache eviction metric Key: HADOOP-18829 URL: https://issues.apache.org/jira/browse/HADOOP-18829 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Assignee: Viraj Jasani Follow-up from HADOOP-18291: Add new IO statistics metric to capture s3a prefetch LRU cache eviction. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-18809) s3a prefetch read/write file operations should guard channel close
Viraj Jasani created HADOOP-18809: - Summary: s3a prefetch read/write file operations should guard channel close Key: HADOOP-18809 URL: https://issues.apache.org/jira/browse/HADOOP-18809 Project: Hadoop Common Issue Type: Sub-task Reporter: Viraj Jasani Assignee: Viraj Jasani As per Steve's suggestion from the s3a prefetch LRU cache review, the s3a prefetch disk-based cache file read and write operations should guard the close of the FileChannel and WritableByteChannel, closing them even if the read/write operations throw an IOException. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
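A minimal sketch of the guarded-close pattern being proposed, assuming the cache writes a block through a WritableByteChannel; the names are illustrative, not the actual SingleFilePerBlockCache code:
{code:java}
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

final class GuardedBlockWriter {
  static void writeBlock(OutputStream out, ByteBuffer buffer) throws IOException {
    WritableByteChannel channel = Channels.newChannel(out);
    try {
      while (buffer.hasRemaining()) {
        channel.write(buffer);
      }
    } finally {
      // Close even if write() threw, so the descriptor is never leaked.
      channel.close();
    }
  }
}
{code}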
[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344 ] Viraj Jasani edited comment on HADOOP-18805 at 7/17/23 8:15 PM: sorry Steve, i was not aware you already created this Jira, i created PR for letting LRU tests use small files rather than landsat: [https://github.com/apache/hadoop/pull/5851] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). i can take this up as new sub-task and for the current Jira, we can focus on tests using small files for the better break-down? PR review discussion: [https://github.com/apache/hadoop/pull/5754#discussion_r1247476231] was (Author: vjasani): sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with? * the above addendum PR for using small file in the test (so that we don't need to put the test under -scale) * this Jira to refactor Entry and allowing a UT to test the contents of the linked list if you think above PR is not good for an addendum and should rather be linked to this Jira, i can change PR title to reflect this Jira number and i can create another sub-task to write simple UT that can test contents of the linked list from head to tail. > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > Labels: pull-request-available > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents to the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
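A sketch of the small-file/tiny-block-size test setup discussed above, illustrative rather than the actual ITest code; it assumes the S3A prefetch keys fs.s3a.prefetch.enabled and fs.s3a.prefetch.block.size, which should be verified against the release under test:
{code:java}
import org.apache.hadoop.conf.Configuration;

final class TinyBlockPrefetchTestSetup {
  static Configuration createConfiguration() {
    Configuration conf = new Configuration();
    conf.setBoolean("fs.s3a.prefetch.enabled", true);
    // 1 KB blocks make a small 16 KB test file span 16 blocks, so paged reads,
    // caching and eviction are exercised without any multi-GB dataset.
    conf.setInt("fs.s3a.prefetch.block.size", 1024);
    return conf;
  }
}
{code}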
[jira] [Comment Edited] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344 ] Viraj Jasani edited comment on HADOOP-18805 at 7/15/23 6:48 AM: sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with? * the above addendum PR for using small file in the test (so that we don't need to put the test under -scale) * this Jira to refactor Entry and allowing a UT to test the contents of the linked list if you think above PR is not good for an addendum and should rather be linked to this Jira, i can change PR title to reflect this Jira number and i can create another sub-task to write simple UT that can test contents of the linked list from head to tail. was (Author: vjasani): sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with the above addendum PR taking care of using small file in the test (so that we don't need to put the test under -scale) and this Jira being used for refactoring Entry and allowing a UT to test the contents of the linked list? > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents to the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17743344#comment-17743344 ] Viraj Jasani commented on HADOOP-18805: --- sorry Steve, i was not aware you already created this Jira, i created addendum for letting LRU test depend on small file rather than large one: [https://github.com/apache/hadoop/pull/5843] {quote}also, and this is very, very important, they can't validate the data {quote} i was about to create a sub-task for this as i am planning to refactor Entry to it's own class and have the contents of the linked list data tested in UT (discussed with Mehakmeet in the earlier part of the review). maybe i can do the work as part of this Jira. are you fine with the above addendum PR taking care of using small file in the test (so that we don't need to put the test under -scale) and this Jira being used for refactoring Entry and allowing a UT to test the contents of the linked list? > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents to the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18805: - Assignee: (was: Viraj Jasani) > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Priority: Major > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents to the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18805) s3a large file prefetch tests are too slow, don't validate data
[ https://issues.apache.org/jira/browse/HADOOP-18805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18805: - Assignee: Viraj Jasani > s3a large file prefetch tests are too slow, don't validate data > --- > > Key: HADOOP-18805 > URL: https://issues.apache.org/jira/browse/HADOOP-18805 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3, test >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > > the large file prefetch tests (including LRU cache eviction) are really slow. > moving under -scale may hide the problem for most runs, but they are still > too slow, can time out, etc etc. > also, and this is very, very important, they can't validate the data. > Better: > * test on smaller files by setting a very small block size (1k bytes or less) > just to force paged reads of a small 16k file. > * with known contents to the values of all forms of read can be validated > * maybe the LRU tests can work with a fake remote object which can then be > used in a unit test > * extend one of the huge file tests to read from there -including s3-CSE > encryption coverage. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies
[ https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17742667#comment-17742667 ] Viraj Jasani commented on HADOOP-18791: --- sounds good, i just realized unbuffer is already in-progress [https://github.com/apache/hadoop/pull/5832] > S3A prefetching: switch to prefetching for chosen read policies > --- > > Key: HADOOP-18791 > URL: https://issues.apache.org/jira/browse/HADOOP-18791 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > before switching to prefetching input stream everywhere, add an option to > list which of the fs.option.openfile.read.policy policies to switch too, e.g > > fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive > this would leave random and vectored on s3a input stream. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies
[ https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18791: - Assignee: (was: Viraj Jasani) > S3A prefetching: switch to prefetching for chosen read policies > --- > > Key: HADOOP-18791 > URL: https://issues.apache.org/jira/browse/HADOOP-18791 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Priority: Major > > before switching to prefetching input stream everywhere, add an option to > list which of the fs.option.openfile.read.policy policies to switch too, e.g > > fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive > this would leave random and vectored on s3a input stream. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18791) S3A prefetching: switch to prefetching for chosen read policies
[ https://issues.apache.org/jira/browse/HADOOP-18791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18791: - Assignee: Viraj Jasani > S3A prefetching: switch to prefetching for chosen read policies > --- > > Key: HADOOP-18791 > URL: https://issues.apache.org/jira/browse/HADOOP-18791 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > > before switching to prefetching input stream everywhere, add an option to > list which of the fs.option.openfile.read.policy policies to switch too, e.g > > fs.s3a.inputstream.prefetch.policies=whole-file, sequential, adaptive > this would leave random and vectored on s3a input stream. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
[ https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17738379#comment-17738379 ] Viraj Jasani commented on HADOOP-18291: --- [~ste...@apache.org] if you have bandwidth to review: [https://github.com/apache/hadoop/pull/5754] Thank you! > S3A prefetch - Implement LRU cache for SingleFilePerBlockCache > -- > > Key: HADOOP-18291 > URL: https://issues.apache.org/jira/browse/HADOOP-18291 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Currently there is no limit on the size of disk cache. This means we could > have a large number of files on files, especially for access patterns that > are very random and do not always read the block fully. > > eg: > in.seek(5); > in.read(); > in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read > in.read(); > in.seek(2 * blockSize + 10) // block 1 gets saved to disk > .. and so on > > The in memory cache is bounded, and by default has a limit of 72MB (9 > blocks). When a block is fully read, and a seek is issued it's released > [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109]. > We can also delete the on disk file for the block here if it exists. > > Also maybe add an upper limit on disk space, and delete the file which stores > data of the block furthest from the current block (similar to the in memory > cache) when this limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
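A generic sketch of the LRU idea in this issue, not the SingleFilePerBlockCache implementation: cap the number of on-disk block files and delete the least-recently-used file when the cap is exceeded:
{code:java}
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

final class LruBlockFileCache extends LinkedHashMap<Integer, Path> {
  private final int maxBlocks;

  LruBlockFileCache(int maxBlocks) {
    super(16, 0.75f, true);   // access-order = true gives LRU iteration order
    this.maxBlocks = maxBlocks;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<Integer, Path> eldest) {
    if (size() <= maxBlocks) {
      return false;
    }
    try {
      Files.deleteIfExists(eldest.getValue());   // evict the backing file too
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return true;
  }
}
{code}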
[jira] [Updated] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
[ https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18291: -- Status: Patch Available (was: In Progress) > S3A prefetch - Implement LRU cache for SingleFilePerBlockCache > -- > > Key: HADOOP-18291 > URL: https://issues.apache.org/jira/browse/HADOOP-18291 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > Currently there is no limit on the size of disk cache. This means we could > have a large number of files on files, especially for access patterns that > are very random and do not always read the block fully. > > eg: > in.seek(5); > in.read(); > in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read > in.read(); > in.seek(2 * blockSize + 10) // block 1 gets saved to disk > .. and so on > > The in memory cache is bounded, and by default has a limit of 72MB (9 > blocks). When a block is fully read, and a seek is issued it's released > [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109]. > We can also delete the on disk file for the block here if it exists. > > Also maybe add an upper limit on disk space, and delete the file which stores > data of the block furthest from the current block (similar to the in memory > cache) when this limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18756) CachingBlockManager to use AtomicBoolean for closed flag
[ https://issues.apache.org/jira/browse/HADOOP-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737387#comment-17737387 ] Viraj Jasani commented on HADOOP-18756: --- Steve, could you please help close this Jira? am i allowed to do it? > CachingBlockManager to use AtomicBoolean for closed flag > > > Key: HADOOP-18756 > URL: https://issues.apache.org/jira/browse/HADOOP-18756 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > > the {{CachingBlockManager}} uses the boolean field {{closed)) in various > operations, including a do/while loop. to ensure the flag is correctly > updated across threads, it needs to move to an atomic boolean. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
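A minimal sketch of the change being asked for, not the actual CachingBlockManager code: replace the plain boolean field with an AtomicBoolean so the flag is updated and observed consistently across threads:
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

final class ClosableWorker {
  private final AtomicBoolean closed = new AtomicBoolean(false);

  void close() {
    // compareAndSet makes close() idempotent and race-free.
    if (!closed.compareAndSet(false, true)) {
      return;
    }
    // ... release resources exactly once ...
  }

  void workLoop() throws InterruptedException {
    do {
      // ... one unit of prefetch/caching work ...
      Thread.sleep(10);
    } while (!closed.get());   // always sees the latest value across threads
  }
}
{code}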
[jira] [Commented] (HADOOP-18777) Update jackson2 version from 2.12.7.1 to 2.15.0
[ https://issues.apache.org/jira/browse/HADOOP-18777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735343#comment-17735343 ] Viraj Jasani commented on HADOOP-18777: --- please take a look at the discussion on HADOOP-18033 > Update jackson2 version from 2.12.7.1 to 2.15.0 > --- > > Key: HADOOP-18777 > URL: https://issues.apache.org/jira/browse/HADOOP-18777 > Project: Hadoop Common > Issue Type: Bug >Reporter: ronan doolan >Priority: Major > > can the jackson2 version in hadoop-project be updated from 2.12.7.1 to 2.15.* > This is to rectify the following vulnerability > [https://github.com/FasterXML/jackson-core/pull/827] > https://github.com/apache/hadoop/blob/trunk/hadoop-project/pom.xml#L72 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
[ https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721569#comment-17721569 ] Viraj Jasani edited comment on HADOOP-18291 at 6/17/23 6:48 AM: {quote}you'd maybe want a block cache - readers would lock their block before a read; unlock after. Use an LRU policy for recycling blocks, with unbuffer/close releasing all blocks of a caller. {quote} -if jobs using s3a prefetching get aborted without calling s3afs#close, and prefetched block files are kept on EBS volumes that could be accessed again by new vm instance or container that resume the jobs, we might also want to consider deleting all old local block files as part of s3afs#initialize- was (Author: vjasani): {quote}you'd maybe want a block cache - readers would lock their block before a read; unlock after. Use an LRU policy for recycling blocks, with unbuffer/close releasing all blocks of a caller. {quote} if jobs using s3a prefetching get aborted without calling s3afs#close, and prefetched block files are kept on EBS volumes that could be accessed again by new vm instance or container that resume the jobs, we might also want to consider deleting all old local block files as part of s3afs#initialize > S3A prefetch - Implement LRU cache for SingleFilePerBlockCache > -- > > Key: HADOOP-18291 > URL: https://issues.apache.org/jira/browse/HADOOP-18291 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > > Currently there is no limit on the size of disk cache. This means we could > have a large number of files on files, especially for access patterns that > are very random and do not always read the block fully. > > eg: > in.seek(5); > in.read(); > in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read > in.read(); > in.seek(2 * blockSize + 10) // block 1 gets saved to disk > .. and so on > > The in memory cache is bounded, and by default has a limit of 72MB (9 > blocks). When a block is fully read, and a seek is issued it's released > [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109]. > We can also delete the on disk file for the block here if it exists. > > Also maybe add an upper limit on disk space, and delete the file which stores > data of the block furthest from the current block (similar to the in memory > cache) when this limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18291) S3A prefetch - Implement LRU cache for SingleFilePerBlockCache
[ https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18291: -- Summary: S3A prefetch - Implement LRU cache for SingleFilePerBlockCache (was: SingleFilePerBlockCache does not have a limit) > S3A prefetch - Implement LRU cache for SingleFilePerBlockCache > -- > > Key: HADOOP-18291 > URL: https://issues.apache.org/jira/browse/HADOOP-18291 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > > Currently there is no limit on the size of disk cache. This means we could > have a large number of files on files, especially for access patterns that > are very random and do not always read the block fully. > > eg: > in.seek(5); > in.read(); > in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read > in.read(); > in.seek(2 * blockSize + 10) // block 1 gets saved to disk > .. and so on > > The in memory cache is bounded, and by default has a limit of 72MB (9 > blocks). When a block is fully read, and a seek is issued it's released > [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109]. > We can also delete the on disk file for the block here if it exists. > > Also maybe add an upper limit on disk space, and delete the file which stores > data of the block furthest from the current block (similar to the in memory > cache) when this limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731791#comment-17731791 ] Viraj Jasani commented on HADOOP-18763: --- this time, without vpn, all tests passed for prefetch profile as well (previous failures testParallelRename and testThreadPoolCoolDown are no longer showing up with full test run) {code:java} mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch {code} > Upgrade aws-java-sdk to 1.12.367+ > - > > Key: HADOOP-18763 > URL: https://issues.apache.org/jira/browse/HADOOP-18763 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Priority: Major > > aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is > pulling in high severity CVE and creating unhappiness in security scans, even > if s3a doesn't use that lib. > The safe version for netty is netty:4.1.86.Final and this is used by > aws-java-adk:1.12.367+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731782#comment-17731782 ] Viraj Jasani commented on HADOOP-18763: --- mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale results are quite good, no test failures (except for known failure of testRecursiveRootListing, which passes when run individually) > Upgrade aws-java-sdk to 1.12.367+ > - > > Key: HADOOP-18763 > URL: https://issues.apache.org/jira/browse/HADOOP-18763 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Priority: Major > > aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is > pulling in high severity CVE and creating unhappiness in security scans, even > if s3a doesn't use that lib. > The safe version for netty is netty:4.1.86.Final and this is used by > aws-java-adk:1.12.367+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731766#comment-17731766 ] Viraj Jasani commented on HADOOP-18763: --- us-west-2: mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch errors so far: {code:java} [ERROR] Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1,920.089 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps [ERROR] testParallelRename(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) Time elapsed: 960.003 s <<< ERROR! org.junit.runners.model.TestTimedOutException: test timed out after 96 milliseconds at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:537) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:88) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.putObject(S3ABlockOutputStream.java:628) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:428) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) at org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.parallelRenames(ITestS3AConcurrentOps.java:112) at org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testParallelRename(ITestS3AConcurrentOps.java:177) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.lang.Thread.run(Thread.java:750) [ERROR] testThreadPoolCoolDown(org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps) Time elapsed: 960.005 s <<< ERROR! 
org.junit.runners.model.TestTimedOutException: test timed out after 96 milliseconds at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:537) at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:88) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.putObject(S3ABlockOutputStream.java:628) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.close(S3ABlockOutputStream.java:428) at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:77) at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106) at org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.parallelRenames(ITestS3AConcurrentOps.java:112) at org.apache.hadoop.fs.s3a.scale.ITestS3AConcurrentOps.testThreadPoolCoolDown(ITestS3AConcurrentOps.java:189) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731672#comment-17731672 ] Viraj Jasani commented on HADOOP-18763: --- sure thing, let me test 1.12.367 version today. can perform some manual testing and then do full test run with combination of scale and prefetch profiles. first, i can make it with trunk and once results are good, can repeat the same tests for 3.3. > Upgrade aws-java-sdk to 1.12.367+ > - > > Key: HADOOP-18763 > URL: https://issues.apache.org/jira/browse/HADOOP-18763 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Priority: Major > > aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is > pulling in high severity CVE and creating unhappiness in security scans, even > if s3a doesn't use that lib. > The safe version for netty is netty:4.1.86.Final and this is used by > aws-java-adk:1.12.367+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731662#comment-17731662 ] Viraj Jasani commented on HADOOP-18763: --- [~weichiu] i can help run full test suite with various options if you would like, i anyways run tests on a regular basis. > Upgrade aws-java-sdk to 1.12.367+ > - > > Key: HADOOP-18763 > URL: https://issues.apache.org/jira/browse/HADOOP-18763 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Priority: Major > > aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is > pulling in high severity CVE and creating unhappiness in security scans, even > if s3a doesn't use that lib. > The safe version for netty is netty:4.1.86.Final and this is used by > aws-java-adk:1.12.367+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18763) Upgrade aws-java-sdk to 1.12.367+
[ https://issues.apache.org/jira/browse/HADOOP-18763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731132#comment-17731132 ] Viraj Jasani commented on HADOOP-18763: --- we were excluding netty from aws-sdk?
{code:java}
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-bundle</artifactId>
  <version>${aws-java-sdk.version}</version>
  <exclusions>
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}
> Upgrade aws-java-sdk to 1.12.367+ > - > > Key: HADOOP-18763 > URL: https://issues.apache.org/jira/browse/HADOOP-18763 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.5 >Reporter: Steve Loughran >Priority: Major > > aws sdk bundle < 1.12.367 uses a vulnerable versions of netty which is > pulling in high severity CVE and creating unhappiness in security scans, even > if s3a doesn't use that lib. > The safe version for netty is netty:4.1.86.Final and this is used by > aws-java-adk:1.12.367+ -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module
[ https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729168#comment-17729168 ] Viraj Jasani commented on HADOOP-18207: --- The PR also has higher than usual chances of getting merge conflicts due to the nature of the change. Hence longer it stays open, more merge conflict resolutions are required, that's what happened with previous open PR (5503) for the past 2+ months. Just stating this as a fact (JFYI), doesn't mean i hate resolving conflicts :) > Introduce hadoop-logging module > --- > > Key: HADOOP-18207 > URL: https://issues.apache.org/jira/browse/HADOOP-18207 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are several goals here: > # Provide the ability to change log level, get log level, etc. > # Place all the appender implementation(?) > # Hide the real logging implementation. > # Later we could remove all the log4j references in other hadoop module. > # Move as much log4j usage to the module as possible. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module
[ https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17729167#comment-17729167 ] Viraj Jasani commented on HADOOP-18207: --- created PR with addendum included [https://github.com/apache/hadoop/pull/5717] Ayush sir, sorry you had to do this but you could have given a little more time considering Wei-Chiu's timezone still has weekend (ever since PR was merged)? :) anyways, no worries now we have new PR so we should get jenkins results in 24 hr (as the changes are across almost all modules) {quote}Side Note: Good to check the Jenkins results usually before merging despite any external/trust factors {quote} This was an oversight from my side, not from any reviewers. What happens when we have full QA results, we see _*3 mapreduce tests and 1 hdfs (TestDirectoryScanner) failures*_ most of the times, hence 4 test classes would usually be present but sometimes TestDirectoryScanner would not be present so we might see only 3 test classes in failures. When the last QA results came (before PR merge, with latest merge conflict resolution), somehow i looked at 3 mapreduce failures and 1 hdfs failure but didn't realize that the recent failure is not TestDirectoryScanner and rather it's rbf test failure. The next QA results were posted on PR after the PR was merged, and that is when i immediately realized that we have a new test class and that is not TestDirectoryScanner and hence created addendum PR. I hope this explanation helps. > Introduce hadoop-logging module > --- > > Key: HADOOP-18207 > URL: https://issues.apache.org/jira/browse/HADOOP-18207 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are several goals here: > # Provide the ability to change log level, get log level, etc. > # Place all the appender implementation(?) > # Hide the real logging implementation. > # Later we could remove all the log4j references in other hadoop module. > # Move as much log4j usage to the module as possible. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18207) Introduce hadoop-logging module
[ https://issues.apache.org/jira/browse/HADOOP-18207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17728919#comment-17728919 ] Viraj Jasani commented on HADOOP-18207: --- PR to fix the broken test: [https://github.com/apache/hadoop/pull/5713] Commented on the original PR as well to link the issue and fix [https://github.com/apache/hadoop/pull/5503#issuecomment-1574535578] Thanks > Introduce hadoop-logging module > --- > > Key: HADOOP-18207 > URL: https://issues.apache.org/jira/browse/HADOOP-18207 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are several goals here: > # Provide the ability to change log level, get log level, etc. > # Place all the appender implementation(?) > # Hide the real logging implementation. > # Later we could remove all the log4j references in other hadoop module. > # Move as much log4j usage to the module as possible. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18756) CachingBlockManager to use AtomicBoolean for closed flag
[ https://issues.apache.org/jira/browse/HADOOP-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18756: - Assignee: Viraj Jasani > CachingBlockManager to use AtomicBoolean for closed flag > > > Key: HADOOP-18756 > URL: https://issues.apache.org/jira/browse/HADOOP-18756 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Affects Versions: 3.3.9 >Reporter: Steve Loughran >Assignee: Viraj Jasani >Priority: Major > > the {{CachingBlockManager}} uses the boolean field {{closed)) in various > operations, including a do/while loop. to ensure the flag is correctly > updated across threads, it needs to move to an atomic boolean. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-18740) s3a prefetch cache blocks should be accessed by RW locks
[ https://issues.apache.org/jira/browse/HADOOP-18740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani updated HADOOP-18740: -- Status: Patch Available (was: In Progress) > s3a prefetch cache blocks should be accessed by RW locks > > > Key: HADOOP-18740 > URL: https://issues.apache.org/jira/browse/HADOOP-18740 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > > In order to implement LRU or LFU based cache removal policies for s3a > prefetched cache blocks, it is important for all cache reader threads to > acquire the read lock, and similarly for the cache file removal mechanism (fs close or > cache eviction) to acquire the write lock, before accessing the files. > As we maintain the block entries in an in-memory map, we should be able to > introduce a read-write lock per cache file entry; we don't need a coarse-grained > lock shared by all entries. > > This is a prerequisite to HADOOP-18291. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
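As an illustration of the per-entry locking described in the issue above, here is a minimal sketch under assumed names; it is not the actual SingleFilePerBlockCache code. Each cached block entry carries its own ReentrantReadWriteLock, so concurrent readers share the read lock while eviction or close takes the write lock for just that entry.
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative only: a disk block cache with one read-write lock per cached entry.
class BlockCacheSketch {
  private static final class Entry {
    final Path file;
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    Entry(Path file) {
      this.file = file;
    }
  }

  private final ConcurrentHashMap<Integer, Entry> blocks = new ConcurrentHashMap<>();

  void put(int blockNumber, Path cacheFile) {
    blocks.put(blockNumber, new Entry(cacheFile));
  }

  byte[] read(int blockNumber) throws IOException {
    Entry entry = blocks.get(blockNumber);
    if (entry == null) {
      throw new IOException("block " + blockNumber + " not cached");
    }
    entry.lock.readLock().lock();          // many readers may hold this at once
    try {
      return Files.readAllBytes(entry.file);
    } finally {
      entry.lock.readLock().unlock();
    }
  }

  void evict(int blockNumber) throws IOException {
    Entry entry = blocks.remove(blockNumber);
    if (entry == null) {
      return;
    }
    entry.lock.writeLock().lock();         // waits for in-flight readers to finish
    try {
      Files.deleteIfExists(entry.file);
    } finally {
      entry.lock.writeLock().unlock();
    }
  }
}
{code}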
[jira] [Comment Edited] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long
[ https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060 ] Viraj Jasani edited comment on HADOOP-18744 at 5/19/23 6:01 AM: Came across few more tests failures while testing HADOOP-18740: {code:java} [ERROR] testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations) Time elapsed: 2.763 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk) Time elapsed: 2.329 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer) Time elapsed: 1.937 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) {code} {code:java} [ERROR] testDeleteNonExistingFileInDir(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI) Time elapsed: 1.809 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerCreateFile(S3AFileSystem.java:1891) {code} was (Author: vjasani): A couple more relevant failures: {code:java} [ERROR] testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations) Time elapsed: 2.763 s <<< ERROR! 
java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk) Time elapsed: 2.329 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer) Time elapsed: 1.937 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063)
[jira] [Comment Edited] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long
[ https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060 ] Viraj Jasani edited comment on HADOOP-18744 at 5/19/23 12:07 AM: - A couple more relevant failures: {code:java} [ERROR] testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations) Time elapsed: 2.763 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk) Time elapsed: 2.329 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer) Time elapsed: 1.937 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) {code} {code:java} [ERROR] testDeleteNonExistingFileInDir(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI) Time elapsed: 1.809 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerCreateFile(S3AFileSystem.java:1891) {code} was (Author: vjasani): A couple more relevant failures: {code:java} [ERROR] testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations) Time elapsed: 2.763 s <<< ERROR! 
java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk) Time elapsed: 2.329 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer) Time elapsed: 1.937 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at
[jira] [Commented] (HADOOP-18744) ITestS3ABlockOutputArray failure with IO File name too long
[ https://issues.apache.org/jira/browse/HADOOP-18744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724060#comment-17724060 ] Viraj Jasani commented on HADOOP-18744: --- A couple more relevant failures: {code:java} [ERROR] testCreateFlagCreateAppendNonExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations) Time elapsed: 2.763 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.createBlockIfNeeded(S3ABlockOutputStream.java:235) at org.apache.hadoop.fs.s3a.S3ABlockOutputStream.(S3ABlockOutputStream.java:217) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputDisk) Time elapsed: 2.329 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) {code} {code:java} [ERROR] testDiskBlockCreate(org.apache.hadoop.fs.s3a.ITestS3ABlockOutputByteBuffer) Time elapsed: 1.937 s <<< ERROR! java.io.IOException: File name too long at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createTempFile(File.java:2063) at org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377) at org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829) at org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114) {code} > ITestS3ABlockOutputArray failure with IO File name too long > --- > > Key: HADOOP-18744 > URL: https://issues.apache.org/jira/browse/HADOOP-18744 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/s3 >Reporter: Ahmar Suhail >Priority: Major > > On an EC2 instance, the following tests are failing: > > {{{}ITestS3ABlockOutputArray.testDiskBlockCreate{}}}{{{}ITestS3ABlockOutputByteBuffer>ITestS3ABlockOutputArray.testDiskBlockCreate{}}}{{{}ITestS3ABlockOutputDisk>ITestS3ABlockOutputArray.testDiskBlockCreate{}}} > > with the error IO File name too long. > > The tests create a file with a 1024 char file name and rely on > File.createTempFile() to truncate the file name to < OS limit. 
> > Stack trace: > {{Java.io.IOException: File name too long}} > {{ at java.io.UnixFileSystem.createFileExclusively(Native Method)}} > {{ at java.io.File.createTempFile(File.java:2063)}} > {{ at > org.apache.hadoop.fs.s3a.S3AFileSystem.createTmpFileForWrite(S3AFileSystem.java:1377)}} > {{ at > org.apache.hadoop.fs.s3a.S3ADataBlocks$DiskBlockFactory.create(S3ADataBlocks.java:829)}} > {{ at > org.apache.hadoop.fs.s3a.ITestS3ABlockOutputArray.testDiskBlockCreate(ITestS3ABlockOutputArray.java:114)}} > {{ at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)}} > {{ at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}} > {{ at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
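A note on the failure mode described above: judging from the stack trace, {{java.io.File.createTempFile()}} on this JDK/filesystem combination does not shorten the over-long prefix, so a 1024-character file name exceeds the typical 255-byte name limit and the native create fails with "File name too long". The sketch below is a hypothetical caller-side mitigation with an invented helper name, not the project's actual fix: it caps the prefix length before delegating to createTempFile.
{code:java}
import java.io.File;
import java.io.IOException;

// Hypothetical mitigation sketch: cap the prefix before handing it to File.createTempFile(),
// leaving room for the random component and suffix that createTempFile appends, so the
// final name stays under a 255-byte file name limit.
class TmpFileNameSketch {
  private static final int MAX_PREFIX = 200;

  static File createTmpFile(String prefix, File dir) throws IOException {
    String capped = prefix.length() > MAX_PREFIX ? prefix.substring(0, MAX_PREFIX) : prefix;
    return File.createTempFile(capped, ".tmp", dir);
  }
}
{code}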
[jira] [Commented] (HADOOP-18652) Path.suffix raises NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17722430#comment-17722430 ] Viraj Jasani commented on HADOOP-18652: --- no worries at all, feel free to create github pull-request as per your convenience! > Path.suffix raises NullPointerException > --- > > Key: HADOOP-18652 > URL: https://issues.apache.org/jira/browse/HADOOP-18652 > Project: Hadoop Common > Issue Type: Bug > Components: hdfs-client >Reporter: Patrick Grandjean >Assignee: Viraj Jasani >Priority: Minor > > Calling the Path.suffix method on root raises a NullPointerException. Tested > with hadoop-client-api 3.3.2 > Scenario: > {code:java} > import org.apache.hadoop.fs.* > Path root = new Path("/") > root.getParent == null // true > root.suffix("bar") // NPE is raised > {code} > Stack: > {code:none} > 23/03/03 15:13:18 ERROR Uncaught throwable from user code: > java.lang.NullPointerException > at org.apache.hadoop.fs.Path.(Path.java:104) > at org.apache.hadoop.fs.Path.(Path.java:93) > at org.apache.hadoop.fs.Path.suffix(Path.java:361) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18652) Path.suffix raises NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18652: - Assignee: (was: Viraj Jasani) > Path.suffix raises NullPointerException > --- > > Key: HADOOP-18652 > URL: https://issues.apache.org/jira/browse/HADOOP-18652 > Project: Hadoop Common > Issue Type: Bug > Components: hdfs-client >Reporter: Patrick Grandjean >Priority: Minor > > Calling the Path.suffix method on root raises a NullPointerException. Tested > with hadoop-client-api 3.3.2 > Scenario: > {code:java} > import org.apache.hadoop.fs.* > Path root = new Path("/") > root.getParent == null // true > root.suffix("bar") // NPE is raised > {code} > Stack: > {code:none} > 23/03/03 15:13:18 ERROR Uncaught throwable from user code: > java.lang.NullPointerException > at org.apache.hadoop.fs.Path.(Path.java:104) > at org.apache.hadoop.fs.Path.(Path.java:93) > at org.apache.hadoop.fs.Path.suffix(Path.java:361) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-18652) Path.suffix raises NullPointerException
[ https://issues.apache.org/jira/browse/HADOOP-18652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Viraj Jasani reassigned HADOOP-18652: - Assignee: Viraj Jasani > Path.suffix raises NullPointerException > --- > > Key: HADOOP-18652 > URL: https://issues.apache.org/jira/browse/HADOOP-18652 > Project: Hadoop Common > Issue Type: Bug > Components: hdfs-client >Reporter: Patrick Grandjean >Assignee: Viraj Jasani >Priority: Minor > > Calling the Path.suffix method on root raises a NullPointerException. Tested > with hadoop-client-api 3.3.2 > Scenario: > {code:java} > import org.apache.hadoop.fs.* > Path root = new Path("/") > root.getParent == null // true > root.suffix("bar") // NPE is raised > {code} > Stack: > {code:none} > 23/03/03 15:13:18 ERROR Uncaught throwable from user code: > java.lang.NullPointerException > at org.apache.hadoop.fs.Path.(Path.java:104) > at org.apache.hadoop.fs.Path.(Path.java:93) > at org.apache.hadoop.fs.Path.suffix(Path.java:361) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-18291) SingleFilePerBlockCache does not have a limit
[ https://issues.apache.org/jira/browse/HADOOP-18291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17721927#comment-17721927 ] Viraj Jasani commented on HADOOP-18291: --- Created HADOOP-18740 for cache file access to go through read-write locks. > SingleFilePerBlockCache does not have a limit > - > > Key: HADOOP-18291 > URL: https://issues.apache.org/jira/browse/HADOOP-18291 > Project: Hadoop Common > Issue Type: Sub-task >Affects Versions: 3.4.0 >Reporter: Ahmar Suhail >Assignee: Viraj Jasani >Priority: Major > > Currently there is no limit on the size of the disk cache. This means we could > have a large number of files on disk, especially for access patterns that > are very random and do not always read the block fully. > > e.g.: > in.seek(5); > in.read(); > in.seek(blockSize + 10) // block 0 gets saved to disk as it's not fully read > in.read(); > in.seek(2 * blockSize + 10) // block 1 gets saved to disk > .. and so on > > The in-memory cache is bounded, and by default has a limit of 72MB (9 > blocks). When a block is fully read and a seek is issued, it's released > [here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3CachingInputStream.java#L109]. > We can also delete the on-disk file for the block here if it exists. > > Also, maybe add an upper limit on disk space, and delete the file which stores > data of the block furthest from the current block (similar to the in-memory > cache) when this limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
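To make the proposal at the end of the description concrete, here is a hypothetical sketch with invented names; it is not existing Hadoop code. It shows a disk block cache that tracks its total footprint and, once a configured byte limit is exceeded, deletes the cached file of the block whose index is furthest from the block currently being read.
{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposed eviction policy; not the actual SingleFilePerBlockCache.
class BoundedDiskBlockCacheSketch {
  private final long maxBytes;
  private long usedBytes;
  private final Map<Integer, Path> blockFiles = new HashMap<>();

  BoundedDiskBlockCacheSketch(long maxBytes) {
    this.maxBytes = maxBytes;
  }

  synchronized void put(int blockNumber, Path cacheFile, int currentBlock) throws IOException {
    blockFiles.put(blockNumber, cacheFile);
    usedBytes += Files.size(cacheFile);
    while (usedBytes > maxBytes && blockFiles.size() > 1) {
      // pick the cached block whose index is furthest from the reader's current position
      int victim = blockFiles.keySet().stream()
          .max(Comparator.comparingInt(b -> Math.abs(b - currentBlock)))
          .orElseThrow(IllegalStateException::new);
      Path victimFile = blockFiles.remove(victim);
      usedBytes -= Files.size(victimFile);
      Files.deleteIfExists(victimFile);
    }
  }
}
{code}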