Daniel Carl Jones created HADOOP-18194:
------------------------------------------
Summary: Public dataset class for S3A integration tests
Key: HADOOP-18194
URL: https://issues.apache.org/jira/browse/HADOOP-18194
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Reporter: Daniel Carl Jones
Introduce {{PublicDatasetTestUtils}}, as proposed previously in some of the
ideas for incrementally refactoring S3A. Some of its responsibilities:
- Act as the source of truth for the URI of each public dataset.
- Maybe keep the methods specific to their purpose where possible? We might
need {{s3a://landsat-pds/scene_list.gz}} specifically for some tests, but other
tests may just need a bucket with a bunch of keys.
How can we make this generic for non-{{aws}} partition S3 or S3API-compatible
object stores?
- Introduce test assumptions about the S3 endpoint or AWS partition. If we’re
not targeting the {{aws}} partition, skip the test.
- Ideally allow for future extension that gives testers an easy way to override
the bucket if they have an alternative source? I see "fs.s3a.scale.test.csvfile"
already does a little bit of this.
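
To make the responsibilities above concrete, here is a rough sketch of what such a utility might look like. It is not the actual class: the name {{PublicDatasetTestUtilsSketch}}, both method names, and the use of a plain {{Map}} in place of a Hadoop {{Configuration}} are assumptions made so the example stands alone. The endpoint check is only a hostname heuristic, not an authoritative partition lookup.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch only; names and behaviour are illustrative assumptions. */
final class PublicDatasetTestUtilsSketch {

    /** Existing option mentioned in this issue for overriding the CSV file. */
    static final String KEY_CSVFILE = "fs.s3a.scale.test.csvfile";

    /** Public dataset object mentioned in this issue, used as the default. */
    static final String DEFAULT_CSVFILE = "s3a://landsat-pds/scene_list.gz";

    private PublicDatasetTestUtilsSketch() {
    }

    /**
     * Single source of truth for the CSV test file URI.
     * A tester with an alternative source overrides it through the config key.
     * The Map stands in for a Hadoop Configuration (assumption).
     */
    static URI csvFile(Map<String, String> conf) {
        return URI.create(conf.getOrDefault(KEY_CSVFILE, DEFAULT_CSVFILE));
    }

    /**
     * Heuristic guess at whether an endpoint is in the standard 'aws' partition.
     * Assumptions: an unset endpoint means the default AWS endpoint; hosts under
     * ".amazonaws.com.cn" or containing "us-gov" belong to other partitions;
     * anything else (for example a third-party store) is not 'aws'.
     */
    static boolean looksLikeAwsPartition(String endpoint) {
        if (endpoint == null || endpoint.trim().isEmpty()) {
            return true;
        }
        String host = endpoint.trim().toLowerCase(Locale.ROOT);
        int scheme = host.indexOf("://");
        if (scheme >= 0) {
            host = host.substring(scheme + 3);   // drop "https://" and similar
        }
        int slash = host.indexOf('/');
        if (slash >= 0) {
            host = host.substring(0, slash);     // drop any path
        }
        int colon = host.indexOf(':');
        if (colon >= 0) {
            host = host.substring(0, colon);     // drop any port
        }
        return host.endsWith(".amazonaws.com") && !host.contains("us-gov");
    }
}
```

A test needing the public dataset could then pass the result of {{looksLikeAwsPartition}} to JUnit's {{Assume.assumeTrue}} so that it is skipped, rather than failed, against other partitions or S3-compatible stores.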