[
https://issues.apache.org/jira/browse/HADOOP-18194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518021#comment-17518021
]
Daniel Carl Jones edited comment on HADOOP-18194 at 4/6/22 10:37 AM:
---------------------------------------------------------------------
Some considerations:
* Some buckets will be different from others. Some may require requester pays
while others may not. Maybe there are other options in the future.
** Do we need to take a Hadoop configuration instance and mutate it? And
if so, do we strip base/bucket overrides?
* Should we always enable requester pays within S3A for public data sets?
Enabling requester pays within S3A only enables acknowledgement - if the bucket
does not have it enabled, it has no effect.
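
One possible answer to the mutate-or-copy question, as a minimal sketch only: build a copy of the settings, drop the per-bucket overrides for the data set's bucket, and set the requester pays flag on the copy. A plain Map stands in for the Hadoop Configuration here so the example is self-contained. The class and method names are hypothetical, and the property name fs.s3a.requester.pays.enabled is an assumption that should be checked against the S3A documentation for the Hadoop version in use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; a Map stands in for a Hadoop Configuration. */
final class RequesterPaysConfigSketch {
  // Assumed property name for the S3A requester pays acknowledgement.
  static final String REQUESTER_PAYS = "fs.s3a.requester.pays.enabled";
  // Prefix used by S3A per-bucket overrides: fs.s3a.bucket.<bucket>.<option>
  static final String BUCKET_PREFIX = "fs.s3a.bucket.";

  private RequesterPaysConfigSketch() {
  }

  /**
   * Return a copy of {@code base} with the overrides for {@code bucket}
   * removed and the requester pays flag set. {@code base} is not mutated.
   */
  static Map<String, String> forPublicBucket(
      Map<String, String> base, String bucket, boolean requesterPays) {
    String overridePrefix = BUCKET_PREFIX + bucket + ".";
    Map<String, String> copy = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : base.entrySet()) {
      // Skip overrides for this bucket only; other buckets are left alone.
      if (!entry.getKey().startsWith(overridePrefix)) {
        copy.put(entry.getKey(), entry.getValue());
      }
    }
    copy.put(REQUESTER_PAYS, Boolean.toString(requesterPays));
    return copy;
  }
}
```

Working on a copy keeps one data set's settings from leaking into tests that share the base configuration. Whether base-level options should also be stripped is left open, as in the question above.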
> Public dataset class for S3A integration tests
> ----------------------------------------------
>
> Key: HADOOP-18194
> URL: https://issues.apache.org/jira/browse/HADOOP-18194
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Daniel Carl Jones
> Priority: Minor
>
> Introduction of PublicDatasetTestUtils as proposed previously in some of the
> ideas for refactoring S3A incrementally. Some of its responsibilities:
> - Source of truth for getting URI based on public data set.
> - Maybe keep the methods specific to their purpose where possible? We might
> need {{s3a://landsat-pds/scene_list.gz}} specifically for some tests, but
> other tests may just need a bucket with a bunch of keys.
> - Introduce test assumptions about the S3 endpoint or AWS partition. If we’re
> not looking at the 'aws' partition, skip the test.
> How might we make this generic for non-{{aws}} partition S3 or
> S3API-compatible object stores?
> - Ideally allow for future extension to provide some easy ways to override
> the bucket if the tester has an alternative source? I see
> "fs.s3a.scale.test.csvfile" already has a little bit of this.
> - We could have something which takes a path to a hadoop XML config file;
> we'd have a default resource but the maven build could be pointed at another
> via a command line property. This file could contain all the settings for a
> test against a partition or internal s3-compatible store.
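
A rough sketch of two of the responsibilities listed in the description: a single place that resolves the data set URI with an optional override, and a check that a test could use to skip itself outside the 'aws' partition. This is not the proposed PublicDatasetTestUtils, and the class and method names are hypothetical. A plain Map stands in for the Hadoop Configuration. Treating s3a://landsat-pds/scene_list.gz as the default value of fs.s3a.scale.test.csvfile is an assumption, and the endpoint check is a crude hostname heuristic.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch; a Map stands in for a Hadoop Configuration. */
final class PublicDatasetSketch {
  // Existing override key mentioned in the issue description.
  static final String KEY_CSVFILE = "fs.s3a.scale.test.csvfile";
  // Assumed default: the landsat file named in the issue description.
  static final String DEFAULT_CSVFILE = "s3a://landsat-pds/scene_list.gz";

  private PublicDatasetSketch() {
  }

  /** Resolve the CSV test file, preferring the tester's override if set. */
  static URI csvFile(Map<String, String> conf) {
    String value = conf.get(KEY_CSVFILE);
    boolean unset = value == null || value.trim().isEmpty();
    return URI.create(unset ? DEFAULT_CSVFILE : value.trim());
  }

  /**
   * Heuristic: true if the endpoint looks like the standard 'aws' partition.
   * An unset endpoint is assumed to mean the default AWS endpoint.
   */
  static boolean isDefaultAwsPartition(String endpoint) {
    if (endpoint == null || endpoint.trim().isEmpty()) {
      return true;
    }
    String host = endpoint.trim().toLowerCase(Locale.ROOT)
        .replaceAll("/+$", "");
    // China endpoints end in .amazonaws.com.cn; GovCloud regions are us-gov-*.
    return host.endsWith(".amazonaws.com") && !host.contains("us-gov-");
  }
}
```

In a JUnit test the boolean could feed an assumption such as Assume.assumeTrue, so the test is skipped rather than failed against other partitions or S3-compatible stores.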