[
https://issues.apache.org/jira/browse/HADOOP-19236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17899090#comment-17899090
]
Jinglun commented on HADOOP-19236:
----------------------------------
Thanks to [~openinx], [~Ddupg], [~xinxianyin], [~stayrascal], [~fangbo_worker] and
@yuanzhihuan for your great work. They are the contributors to the hadoop-tos
module. I have reviewed all the PRs and merged them into branch HADOOP-19236.
Here is a brief introduction to the hadoop-tos integration work.
*Hadoop-tos module*
A new hadoop-tos module is added to hadoop-cloud-storage-project, with two
sub-modules: hadoop-tos-core and hadoop-tos-shade. hadoop-tos-shade shades
all TOS SDK dependencies to avoid potential conflicts, and hadoop-tos-core
contains the TOS filesystem implementation.
The final output is a bundle jar under hadoop-tos-core named
hadoop-tos-core-{version}.jar, with the tos-sdk dependencies packaged into
it. Put the jar under $HADOOP_HOME/share/hadoop/hdfs and Hadoop is then
able to access TOS. See the documents in hadoop-tos for more details.
*Dependencies*
Hadoop-tos introduces a new dependency, `com.volcengine:ve-tos-java-sdk:2.8.6`. It
is an open source project under the Apache 2.0 license
(https://github.com/volcengine/ve-tos-java-sdk/blob/main/LICENSE).
Here are the transitive dependencies pulled in by `com.volcengine:ve-tos-java-sdk:2.8.6`.
They (okhttp, okio, kotlin, jackson) are open source under Apache 2.0 too.
```
[INFO] org.apache.hadoop:hadoop-tos-shade:jar:3.5.0-SNAPSHOT
[INFO] \- com.volcengine:ve-tos-java-sdk:jar:2.8.6:compile
[INFO]    +- com.squareup.okhttp3:okhttp:jar:4.10.0:compile
[INFO]    |  +- com.squareup.okio:okio-jvm:jar:3.0.0:compile
[INFO]    |  |  +- org.jetbrains.kotlin:kotlin-stdlib-jdk8:jar:1.6.20:test
[INFO]    |  |  |  \- org.jetbrains.kotlin:kotlin-stdlib-jdk7:jar:1.6.20:test
[INFO]    |  |  \- org.jetbrains.kotlin:kotlin-stdlib-common:jar:1.6.20:compile
[INFO]    |  \- org.jetbrains.kotlin:kotlin-stdlib:jar:1.6.20:compile
[INFO]    |     \- org.jetbrains:annotations:jar:13.0:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-annotations:jar:2.12.7:compile
[INFO]    +- com.fasterxml.jackson.core:jackson-databind:jar:2.12.7.1:compile
[INFO]    |  \- com.fasterxml.jackson.core:jackson-core:jar:2.12.7:compile
[INFO]    \- org.slf4j:slf4j-api:jar:1.7.36:compile
```
All the dependencies (excluding slf4j) are shaded to avoid potential conflicts.
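For readers unfamiliar with shading, here is a minimal sketch of what package relocation does to class names. It is an illustration only, not the actual shade configuration: the prefix `org.example.shaded.` is a hypothetical placeholder (the real relocation prefix is whatever the hadoop-tos-shade pom defines), and the package roots listed are assumed from the dependency names above.

```java
/** Illustrative only: mimics the effect of package relocation on class names. */
final class ShadeRelocationSketch {
  // Hypothetical prefix, chosen for this example; not taken from the hadoop-tos-shade pom.
  static final String SHADED_PREFIX = "org.example.shaded.";

  // Assumed package roots of okhttp, okio, kotlin and jackson.
  // slf4j is deliberately absent because it is not shaded.
  static final String[] RELOCATED = {
    "okhttp3.", "okio.", "kotlin.", "com.fasterxml.jackson."
  };

  /** Returns the relocated class name, or the name unchanged if it is not shaded. */
  static String relocate(String className) {
    for (String root : RELOCATED) {
      if (className.startsWith(root)) {
        return SHADED_PREFIX + className;
      }
    }
    return className;
  }

  public static void main(String[] args) {
    String[] samples = {
      "okhttp3.OkHttpClient",
      "com.fasterxml.jackson.databind.ObjectMapper",
      "org.slf4j.Logger"
    };
    for (String name : samples) {
      System.out.println(name + " -> " + relocate(name));
    }
  }
}
```

Because the relocated classes live under a different package, an application that ships its own okhttp or jackson version does not clash with the copies bundled in the jar, while `org.slf4j.Logger` is left alone and resolves to whatever slf4j is already on the classpath.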
*How to run unit tests*
To run the hadoop-tos unit tests, you need a server that can connect to TOS. See
the documents in hadoop-tos for more details. I can provide a test
environment; please let me know if you need to test hadoop-tos ([email protected]).
*Documents*
The doc is placed under the hadoop-tos-core module. Find it at
`src/site/markdown/cloudstorage/index.md`.
*Future work*
# FileSystem#createBulkDelete is a useful interface; it would be nice to
implement it.
# Consider adding the jars from hadoop-cloud-storage-project to hadoop-dist.
Currently they are not included in the final tar file.
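As a rough illustration of the first item, the sketch below shows the page-based batching that a bulk delete implementation typically needs: the caller hands over a list of keys, and the store deletes them in requests of at most one page each. It uses no Hadoop or TOS classes; `PagedDeleteSketch`, its in-memory key set, and the page size of 2 are all hypothetical stand-ins, not the hadoop-tos design.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative only: page-based batch deletion against an in-memory key set. */
final class PagedDeleteSketch {
  // In-memory stand-in for the keys held by an object store bucket.
  private final Set<String> keys = new TreeSet<>();
  // Maximum number of keys sent in one (simulated) batch request.
  private final int pageSize;

  PagedDeleteSketch(int pageSize, List<String> initialKeys) {
    if (pageSize < 1) {
      throw new IllegalArgumentException("pageSize must be >= 1");
    }
    this.pageSize = pageSize;
    this.keys.addAll(initialKeys);
  }

  // Stand-in for a single multi-object delete request.
  private void deleteBatch(List<String> batch) {
    keys.removeAll(batch);
  }

  /** Deletes the given keys one page at a time and returns the number of requests issued. */
  int deleteAll(List<String> toDelete) {
    int requests = 0;
    for (int start = 0; start < toDelete.size(); start += pageSize) {
      int end = Math.min(start + pageSize, toDelete.size());
      deleteBatch(toDelete.subList(start, end));
      requests++;
    }
    return requests;
  }

  int remaining() {
    return keys.size();
  }

  public static void main(String[] args) {
    List<String> all = Arrays.asList("a", "b", "c", "d", "e");
    PagedDeleteSketch store = new PagedDeleteSketch(2, all);
    int requests = store.deleteAll(all);
    System.out.println("requests=" + requests + ", remaining=" + store.remaining());
  }
}
```

A real implementation would map each page to one multi-object delete call on the store and report per-key failures back to the caller; both are left out here to keep the batching logic visible.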
I think it is ready for public review now. Hi [[email protected]]
[~hexiaoqiao] [~leosun], could you kindly take a look at this? Thanks very
much!
> Integration of Volcano Engine TOS in Hadoop.
> --------------------------------------------
>
> Key: HADOOP-19236
> URL: https://issues.apache.org/jira/browse/HADOOP-19236
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, tools
> Affects Versions: 3.4.0
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Major
> Attachments: Integration of Volcano Engine TOS in Hadoop.pdf
>
>
> Volcano Engine is a fast-growing cloud vendor launched by ByteDance, and TOS
> is the object storage service of Volcano Engine. A common pattern is to store
> data in TOS and run Hadoop/Spark/Flink applications that access it. But
> there is no native support for TOS in Hadoop, so it is not easy for users
> to build their big data systems on TOS.
>
> This work aims to integrate TOS with Hadoop to help users run their
> applications on TOS. Users only need to do some simple configuration, then
> their applications can read/write TOS without any code change. This work is
> similar to AWS S3, AzureBlob, AliyunOSS, Tencent COS and HuaweiCloud Object
> Storage in Hadoop.
>
> Please see the attached document "Integration of Volcano Engine TOS in
> Hadoop" for more details.