[
https://issues.apache.org/jira/browse/HADOOP-16950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18045040#comment-18045040
]
ASF GitHub Bot commented on HADOOP-16950:
-----------------------------------------
github-actions[bot] closed pull request #1928: HADOOP-16950.Extend Hadoop S3a
access from single endpoint to multipl…
URL: https://github.com/apache/hadoop/pull/1928
> Extend Hadoop S3a access from single endpoint to multiple endpoints
> -------------------------------------------------------------------
>
> Key: HADOOP-16950
> URL: https://issues.apache.org/jira/browse/HADOOP-16950
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 3.1.3
> Reporter: Ocean Lua
> Priority: Major
> Labels: Endpoint, ceph, pull-request-available
> Attachments: HADOOP-16950-001.patch
>
>
> The Hadoop AWS client API supports access through only a single endpoint.
> However, object stores such as Ceph expose multiple endpoints, so with a
> single endpoint the storage resources cannot be fully used. To address this
> issue, we created a new implementation of S3AFileSystem that supports
> multi-endpoint access. With this change, system performance is expected to
> increase significantly.
> Usage:
> 1. Ensure the hadoop-aws API is available.
> 2. Copy hadoop-aws-3.1.3.jar and aws-java-sdk-bundle-1.11.271.jar to the
> directory share/hadoop/common/lib in the Hadoop installation (both jars are
> normally located in the directory share/hadoop/tools/lib).
> 3. In the file etc/hadoop/hadoop-env.sh, add the following:
> export HADOOP_CLASSPATH=/(hadoop root directory)/share/hadoop/common/lib/hadoop-aws-3.1.3.jar:/(hadoop root directory)/share/hadoop/common/lib/aws-java-sdk-bundle-1.11.271.jar:$HADOOP_CLASSPATH
> 4. Edit the configuration file "core-site.xml" and set the properties below:
> <property>
> <name>fs.s3a.s3.client.factory.impl</name>
> <value>org.apache.hadoop.fs.s3a.MultiAddrS3ClientFactory</value>
> </property>
> <property>
> <name>fs.s3a.endpoint</name>
> <value>http://addr1:port1,http://addr2:port2,...</value>
> </property>
> 5. Optional configuration in "core-site.xml":
> <property>
> <name>fs.s3a.S3ClientSelector.class</name>
> <value>org.apache.hadoop.fs.s3a.RandomS3ClientSelector</value>
> </property>
> This property sets the s3a service selection policy. The default value is
> org.apache.hadoop.fs.s3a.RandomS3ClientSelector, which selects completely
> at random. The property can instead be set to
> org.apache.hadoop.fs.s3a.PathS3ClientSelector, which selects according to
> the file path.
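
To illustrate the two selection policies described above, here is a minimal, self-contained sketch. It is not the code from the patch: the class EndpointSelectionSketch and its methods are hypothetical names, and hashing the path is only one assumed way to select "according to the file path". It splits a comma-separated fs.s3a.endpoint value and then picks an endpoint either at random or deterministically per path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the classes shipped in the patch.
final class EndpointSelectionSketch {

    /** Splits a comma-separated endpoint value into trimmed, non-empty entries. */
    public static List<String> parseEndpoints(String value) {
        List<String> endpoints = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                endpoints.add(trimmed);
            }
        }
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("no endpoints configured");
        }
        return endpoints;
    }

    /** Random policy: each call may return a different endpoint. */
    public static String selectRandom(List<String> endpoints, Random random) {
        return endpoints.get(random.nextInt(endpoints.size()));
    }

    /**
     * Path policy (assumed to be hash-based here): the same path always
     * maps to the same endpoint. floorMod keeps the index non-negative
     * even when hashCode() is negative.
     */
    public static String selectByPath(List<String> endpoints, String path) {
        int index = Math.floorMod(path.hashCode(), endpoints.size());
        return endpoints.get(index);
    }

    public static void main(String[] args) {
        List<String> endpoints =
            parseEndpoints("http://addr1:port1, http://addr2:port2");
        System.out.println(selectRandom(endpoints, new Random()));
        System.out.println(selectByPath(endpoints, "/data/part-00000"));
    }
}
```

A random policy spreads load without any coordination, while a path-based policy sends repeated requests for the same object to the same endpoint. Which one suits a given Ceph deployment is a judgment call this sketch does not settle.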
--
This message was sent by Atlassian Jira
(v8.20.10#820010)