This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new f8e5445  Add Baidu BOS storage support for hudi (#3057)
f8e5445 is described below

commit f8e5445d4abefa9583ccc605a54b3bc79ca4c517
Author: JunZhang <zhangjunem...@126.com>
AuthorDate: Wed Jun 9 20:34:29 2021 +0800

    Add Baidu BOS storage support for hudi (#3057)

    Co-authored-by: zhangjun30 <zhangju...@baidu.com>
---
 docs/_docs/0_9_bos_filesystem.cn.md | 59 +++++++++++++++++++++++++++++++++++++
 docs/_docs/0_9_bos_filesystem.md    | 58 ++++++++++++++++++++++++++++++++++++
 docs/_docs/2_7_cloud.cn.md          |  3 ++
 docs/_docs/2_7_cloud.md             |  4 ++-
 4 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/docs/_docs/0_9_bos_filesystem.cn.md b/docs/_docs/0_9_bos_filesystem.cn.md
new file mode 100644
index 0000000..0fd44b4
--- /dev/null
+++ b/docs/_docs/0_9_bos_filesystem.cn.md
@@ -0,0 +1,59 @@
+---
+title: BOS Filesystem
+keywords: hudi, hive, baidu, bos, spark, presto
+permalink: /docs/bos_hoodie.html
+summary: In this page, we go over how to configure Hudi with BOS filesystem.
+last_modified_at: 2021-06-09T11:38:24-10:00
+language: cn
+---
+This page describes how to get your Hudi job to use Baidu BOS storage.
+
+## Baidu BOS deployment
+
+To let Hudi use BOS, two pieces of configuration need to be added:
+
+- Adding the Baidu BOS configs for Hudi
+- Adding the Jar packages to the classpath
+
+### Baidu BOS configs
+
+Add the configs below to a core-site.xml file that your Hudi can access. Replace `fs.defaultFS` with your BOS bucket name, replace `fs.bos.endpoint` with the BOS endpoint address, and replace `fs.bos.access.key` and `fs.bos.secret.access.key` with the BOS key and secret respectively, so that Hudi can read and write the corresponding bucket.
+
+```xml
+<property>
+  <name>fs.defaultFS</name>
+  <value>bos://bucketname/</value>
+</property>
+
+<property>
+  <name>fs.bos.endpoint</name>
+  <value>bos-endpoint-address</value>
+  <description>Baidu bos endpoint to connect to,for example : http://bj.bcebos.com</description>
+</property>
+
+<property>
+  <name>fs.bos.access.key</name>
+  <value>bos-key</value>
+  <description>Baidu access key</description>
+</property>
+
+<property>
+  <name>fs.bos.secret.access.key</name>
+  <value>bos-secret-key</value>
+  <description>Baidu secret key.</description>
+</property>
+
+<property>
+  <name>fs.bos.impl</name>
+  <value>org.apache.hadoop.fs.bos.BaiduBosFileSystem</value>
+</property>
+```
+
+### Baidu BOS Libs
+
+Add the Baidu hadoop jar packages to the classpath.
+
+- com.baidubce:bce-java-sdk:0.10.165
+- bos-hdfs-sdk-1.0.2-community.jar
+
+You can download the bos-hdfs-sdk jar package from [here](https://sdk.bce.baidu.com/console-sdk/bos-hdfs-sdk-1.0.2-community.jar.zip) and then unzip it.
diff --git a/docs/_docs/0_9_bos_filesystem.md b/docs/_docs/0_9_bos_filesystem.md
new file mode 100644
index 0000000..f05d180
--- /dev/null
+++ b/docs/_docs/0_9_bos_filesystem.md
@@ -0,0 +1,58 @@
+---
+title: BOS Filesystem
+keywords: hudi, hive, baidu, bos, spark, presto
+permalink: /docs/bos_hoodie.html
+summary: In this page, we go over how to configure Hudi with bos filesystem.
+last_modified_at: 2021-06-09T11:38:24-10:00
+---
+In this page, we explain how to get your Hudi job to store into Baidu BOS.
+
+## Baidu BOS configs
+
+There are two configurations required for Hudi-BOS compatibility:
+
+- Adding Baidu BOS Credentials for Hudi
+- Adding required Jars to classpath
+
+### Baidu BOS Credentials
+
+Add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your BOS bucket name, replace `fs.bos.endpoint` with your bos endpoint, replace `fs.bos.access.key` with your bos key, replace `fs.bos.secret.access.key` with your bos secret key. Hudi should be able to read/write from the bucket.
+
+```xml
+<property>
+  <name>fs.defaultFS</name>
+  <value>bos://bucketname/</value>
+</property>
+
+<property>
+  <name>fs.bos.endpoint</name>
+  <value>bos-endpoint-address</value>
+  <description>Baidu bos endpoint to connect to,for example : http://bj.bcebos.com</description>
+</property>
+
+<property>
+  <name>fs.bos.access.key</name>
+  <value>bos-key</value>
+  <description>Baidu access key</description>
+</property>
+
+<property>
+  <name>fs.bos.secret.access.key</name>
+  <value>bos-secret-key</value>
+  <description>Baidu secret key.</description>
+</property>
+
+<property>
+  <name>fs.bos.impl</name>
+  <value>org.apache.hadoop.fs.bos.BaiduBosFileSystem</value>
+</property>
+```
+
+### Baidu bos Libs
+
+Baidu hadoop libraries jars to add to our classpath
+
+- com.baidubce:bce-java-sdk:0.10.165
+- bos-hdfs-sdk-1.0.2-community.jar
+
+You can download the bos-hdfs-sdk jar from [here](https://sdk.bce.baidu.com/console-sdk/bos-hdfs-sdk-1.0.2-community.jar.zip) , and then unzip it.
\ No newline at end of file
diff --git a/docs/_docs/2_7_cloud.cn.md b/docs/_docs/2_7_cloud.cn.md
index 73b74b9..84d4d50 100644
--- a/docs/_docs/2_7_cloud.cn.md
+++ b/docs/_docs/2_7_cloud.cn.md
@@ -24,3 +24,6 @@ language: cn
   Configurations required for COS and Hudi co-operability.
 * [IBM Cloud Object Storage](/cn/docs/ibm_cos_hoodie.html) <br/>
   Configurations required for IBM Cloud Object Storage and Hudi co-operability.
+* [Baidu Cloud Object Storage](/docs/bos_hoodie.html) <br/>
+  Configurations required for Baidu BOS and Hudi co-operability.
+
diff --git a/docs/_docs/2_7_cloud.md b/docs/_docs/2_7_cloud.md
index 6b82437..cb9c217 100644
--- a/docs/_docs/2_7_cloud.md
+++ b/docs/_docs/2_7_cloud.md
@@ -23,4 +23,6 @@ to cloud stores.
 * [Tencent Cloud Object Storage](/docs/cos_hoodie.html) <br/>
   Configurations required for COS and Hudi co-operability.
 * [IBM Cloud Object Storage](/docs/ibm_cos_hoodie.html) <br/>
-  Configurations required for IBM Cloud Object Storage and Hudi co-operability.
+  Configurations required for IBM Cloud Object Storage and Hudi co-operability.
+* [Baidu Cloud Object Storage](/docs/bos_hoodie.html) <br/>
+  Configurations required for BOS and Hudi co-operability.
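
The pages added in this commit list five properties that go into core-site.xml. As a rough illustration only, and not part of the commit, the sketch below renders those properties into a complete file body using Python's standard library. The function name `render_bos_core_site` and its parameters are hypothetical, the `<configuration>` root element is the usual Hadoop wrapper rather than something shown in the diff, and the values passed in the usage line are the same placeholders the docs use.

```python
import xml.etree.ElementTree as ET


def render_bos_core_site(bucket, endpoint, access_key, secret_key):
    """Return core-site.xml text holding the five BOS properties from the docs.

    Hypothetical helper for illustration; the property names and the
    filesystem class come from the docs, everything else is an assumption.
    """
    properties = {
        "fs.defaultFS": f"bos://{bucket}/",
        "fs.bos.endpoint": endpoint,
        "fs.bos.access.key": access_key,
        "fs.bos.secret.access.key": secret_key,
        "fs.bos.impl": "org.apache.hadoop.fs.bos.BaiduBosFileSystem",
    }
    # Hadoop site files wrap their <property> entries in <configuration>.
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent exists from Python 3.9; skip pretty-printing on older versions.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, as in the docs; substitute real ones locally.
    print(render_bos_core_site("bucketname", "http://bj.bcebos.com",
                               "bos-key", "bos-secret-key"))
```

Generating the file this way keeps the access key and secret out of version control if the real values are read from the environment at render time; that is a design suggestion, not something the commit prescribes.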