[
https://issues.apache.org/jira/browse/HADOOP-19057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812857#comment-17812857
]
ASF GitHub Bot commented on HADOOP-19057:
-----------------------------------------
steveloughran opened a new pull request, #6515:
URL: https://github.com/apache/hadoop/pull/6515
Moves to a new test file/bucket:
s3a://nyc-tlc/trip data/fhvhv_tripdata_2019-02.parquet
This is actually quite an interesting path, as it contains a space, which breaks
URI parsing in the s3guard tool tests. Fix: those tests now take only the root
scheme/host of the path, not the rest.
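The space-in-path failure can be reproduced with plain `java.net.URI`; a minimal sketch (the class name is hypothetical; this is not the Hadoop code itself):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class SpacePathDemo {
    public static void main(String[] args) throws URISyntaxException {
        String path = "s3a://nyc-tlc/trip data/fhvhv_tripdata_2019-02.parquet";
        try {
            // The single-argument constructor applies strict RFC 2396
            // parsing; a raw space is an illegal character in a URI.
            new URI(path);
            System.out.println("parsed OK");
        } catch (URISyntaxException e) {
            System.out.println("rejected: " + e.getReason());
        }
        // Building a URI from only the scheme and host sidesteps the
        // problem, which is the approach taken for these tests.
        URI root = new URI("s3a", "nyc-tlc", "/", null);
        System.out.println(root);
    }
}
```

The multi-argument `URI` constructor quotes components as needed, which is why taking just the scheme/host avoids the parse failure.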
Renames all the relevant methods to refer to an external file rather than a CSV
file, as we no longer expect it to be CSV.
Leaves the test key name alone: fs.s3a.scale.test.csvfile
### How was this patch tested?
Identification of all test suites using the file, then running them through the IDE.
One failure: ITestS3AAWSCredentialsProvider.testAnonymousProvider(); the
store doesn't accept anonymous credentials.
Full test ongoing.
Because of the failure and the way the space in the path breaks the
ITestS3GuardTool test, I'm not sure this is the right dataset. I'd prefer data
in a bucket which supports anonymous access.
### For code changes:
- [X] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> S3 public test bucket landsat-pds unreadable - needs replacement
> ----------------------------------------------------------------
>
> Key: HADOOP-19057
> URL: https://issues.apache.org/jira/browse/HADOOP-19057
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, test
> Affects Versions: 3.4.0, 3.2.4, 3.3.9, 3.3.6, 3.5.0
> Reporter: Steve Loughran
> Priority: Critical
>
> The s3 test bucket used in hadoop-aws tests of S3 select and large file reads
> is no longer publicly accessible
> {code}
> java.nio.file.AccessDeniedException: landsat-pds: getBucketMetadata() on
> landsat-pds: software.amazon.awssdk.services.s3.model.S3Exception: null
> (Service: S3, Status Code: 403, Request ID: 06QNYQ9GND5STQ2S, Extended
> Request ID:
> O+u2Y1MrCQuuSYGKRAWHj/5LcDLuaFS8owNuXXWSJ0zFXYfuCaTVLEP351S/umti558eKlUqV6U=):null
> {code}
> * Because HADOOP-18830 has cut s3 select, all we need in 3.4.1+ is a large
> file for some reading tests
> * changing the default value disables s3 select tests on older releases
> * if fs.s3a.scale.test.csvfile is set to " " then other tests which need it
> will be skipped
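> As a sketch of that skip mechanism (the key is the existing test key; the
> single-space value is the one described above):
> {code}
> <property>
>   <name>fs.s3a.scale.test.csvfile</name>
>   <value> </value>
> </property>
> {code}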
> Proposed
> * we locate a new large file under the (requester pays) s3a://usgs-landsat/
> bucket. All releases with HADOOP-18168 can use this
> * update 3.4.1 source to use this; document it
> * do something similar for 3.3.9 + maybe even cut s3 select there too.
> * document how to use it on older releases with requester-pays support
> * document how to completely disable it on older releases.
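> For the requester-pays option above, a hedged sketch of the client-side
> setting, assuming fs.s3a.requester.pays.enabled is available in the release
> (check the release's hadoop-aws documentation before relying on it):
> {code}
> <property>
>   <name>fs.s3a.requester.pays.enabled</name>
>   <value>true</value>
> </property>
> {code}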
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]