[hudi] branch asf-site updated: [HUDI-804] Add Azure support to doc (#1668)

leesf Wed, 27 May 2020 07:56:24 -0700

This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git



The following commit(s) were added to refs/heads/asf-site by this push:
     new b4f9ddc  [HUDI-804] Add Azure support to doc (#1668)
b4f9ddc is described below

commit b4f9ddc3dbe1404f2f58dc9aac2279f3fb86e5d1
Author: Gary Li <[email protected]>
AuthorDate: Wed May 27 07:56:06 2020 -0700

    [HUDI-804] Add Azure support to doc (#1668)
    
    Co-authored-by: lamber-ken <[email protected]>
---
 docs/_docs/0_5_oss_filesystem.cn.md   |  2 +-
 docs/_docs/0_5_oss_filesystem.md      | 64 +++++++++++++++++------------------
 docs/_docs/0_6_azure_filesystem.cn.md | 52 ++++++++++++++++++++++++++++
 docs/_docs/0_6_azure_filesystem.md    | 51 ++++++++++++++++++++++++++++
 docs/_docs/2_4_configurations.cn.md   |  4 +++
 docs/_docs/2_4_configurations.md      |  4 +++
 6 files changed, 144 insertions(+), 33 deletions(-)

diff --git a/docs/_docs/0_5_oss_filesystem.cn.md b/docs/_docs/0_5_oss_filesystem.cn.md
index a598f34..baaa984 100644
--- a/docs/_docs/0_5_oss_filesystem.cn.md
+++ b/docs/_docs/0_5_oss_filesystem.cn.md
@@ -1,7 +1,7 @@
 ---
 title: OSS Filesystem
 keywords: hudi, hive, aliyun, oss, spark, presto
-permalink: /docs/oss_hoodie.html
+permalink: /cn/docs/oss_hoodie.html
 summary: In this page, we go over how to configure Hudi with OSS filesystem.
 last_modified_at: 2020-04-21T12:50:50-10:00
 language: cn
diff --git a/docs/_docs/0_5_oss_filesystem.md b/docs/_docs/0_5_oss_filesystem.md
index afd114c..766cffa 100644
--- a/docs/_docs/0_5_oss_filesystem.md
+++ b/docs/_docs/0_5_oss_filesystem.md
@@ -19,33 +19,33 @@ There are two configurations required for Hudi-OSS compatibility:
 Add the required configs in your core-site.xml from where Hudi can fetch them. Replace the `fs.defaultFS` with your OSS bucket name, replace `fs.oss.endpoint` with your OSS endpoint, replace `fs.oss.accessKeyId` with your OSS key, replace `fs.oss.accessKeySecret` with your OSS secret. Hudi should be able to read/write from the bucket.
 
 ```xml
-    <property>
-        <name>fs.defaultFS</name>
-        <value>oss://bucketname/</value>
-    </property>
+<property>
+  <name>fs.defaultFS</name>
+  <value>oss://bucketname/</value>
+</property>
 
-    <property>
-      <name>fs.oss.endpoint</name>
-      <value>oss-endpoint-address</value>
-      <description>Aliyun OSS endpoint to connect to.</description>
-    </property>
+<property>
+  <name>fs.oss.endpoint</name>
+  <value>oss-endpoint-address</value>
+  <description>Aliyun OSS endpoint to connect to.</description>
+</property>
 
-    <property>
-      <name>fs.oss.accessKeyId</name>
-      <value>oss_key</value>
-      <description>Aliyun access key ID</description>
-    </property>
+<property>
+  <name>fs.oss.accessKeyId</name>
+  <value>oss_key</value>
+  <description>Aliyun access key ID</description>
+</property>
 
-    <property>
-      <name>fs.oss.accessKeySecret</name>
-      <value>oss-secret</value>
-      <description>Aliyun access key secret</description>
-    </property>
+<property>
+  <name>fs.oss.accessKeySecret</name>
+  <value>oss-secret</value>
+  <description>Aliyun access key secret</description>
+</property>
 
-    <property>
-      <name>fs.oss.impl</name>
-      <value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
-    </property>
+<property>
+  <name>fs.oss.impl</name>
+  <value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
+</property>
 ```
 
 ### Aliyun OSS Libs
@@ -54,18 +54,18 @@ Aliyun hadoop libraries jars to add to our pom.xml. Since hadoop-aliyun depends
 
 ```xml
 <dependency>
-    <groupId>org.apache.hadoop</groupId>
-    <artifactId>hadoop-aliyun</artifactId>
-    <version>3.2.1</version>
+  <groupId>org.apache.hadoop</groupId>
+  <artifactId>hadoop-aliyun</artifactId>
+  <version>3.2.1</version>
 </dependency>
 <dependency>
-    <groupId>com.aliyun.oss</groupId>
-    <artifactId>aliyun-sdk-oss</artifactId>
-    <version>3.8.1</version>
+  <groupId>com.aliyun.oss</groupId>
+  <artifactId>aliyun-sdk-oss</artifactId>
+  <version>3.8.1</version>
 </dependency>
 <dependency>
-    <groupId>org.jdom</groupId>
-    <artifactId>jdom</artifactId>
-    <version>1.1</version>
+  <groupId>org.jdom</groupId>
+  <artifactId>jdom</artifactId>
+  <version>1.1</version>
 </dependency>
 ```
diff --git a/docs/_docs/0_6_azure_filesystem.cn.md b/docs/_docs/0_6_azure_filesystem.cn.md
new file mode 100644
index 0000000..af1e290
--- /dev/null
+++ b/docs/_docs/0_6_azure_filesystem.cn.md
@@ -0,0 +1,52 @@
+---
+title: Azure Filesystem
+keywords: hudi, hive, azure, spark, presto
+permalink: /cn/docs/azure_hoodie.html
+summary: In this page, we go over how to configure Hudi with Azure filesystem.
+last_modified_at: 2020-05-25T19:00:57-04:00
+language: cn
+---
+In this page, we explain how to use Hudi on Microsoft Azure.
+
+## Disclaimer
+
+This page is maintained by the Hudi community.
+If any information is inaccurate or you have more to add,
+please feel free to create a JIRA ticket. Contributions are highly appreciated.
+
+## Supported Storage System
+
+Two Azure storage systems support Hudi:
+
+- Azure Blob Storage
+- Azure Data Lake Gen 2
+
+## Verified Combinations of Spark and Storage System
+
+#### HDInsight Spark2.4 on Azure Data Lake Storage Gen 2
+This combination works out of the box. No extra config needed.
+
+#### Databricks Spark2.4 on Azure Data Lake Storage Gen 2
+- Import the Hudi jar into the Databricks workspace.
+
+- Mount the file system with `dbutils.fs.mount`.
+  ```scala
+  dbutils.fs.mount(
+    source = "abfss://[email protected]",
+    mountPoint = "/mountpoint",
+    extraConfigs = configs)
+  ```
+- When writing a Hudi dataset, use the abfss URL.
+  ```scala
+  inputDF.write
+    .format("org.apache.hudi")
+    .options(opts)
+    .mode(SaveMode.Append)
+    .save("abfss://<<storage-account>>.dfs.core.windows.net/hudi-tables/customer")
+  ```
+- When reading a Hudi dataset, use the mount point.
+  ```scala
+  spark.read
+    .format("org.apache.hudi")
+    .load("/mountpoint/hudi-tables/customer")
+  ```
diff --git a/docs/_docs/0_6_azure_filesystem.md b/docs/_docs/0_6_azure_filesystem.md
new file mode 100644
index 0000000..7421496
--- /dev/null
+++ b/docs/_docs/0_6_azure_filesystem.md
@@ -0,0 +1,51 @@
+---
+title: Azure Filesystem
+keywords: hudi, hive, azure, spark, presto
+permalink: /docs/azure_hoodie.html
+summary: In this page, we go over how to configure Hudi with Azure filesystem.
+last_modified_at: 2020-05-25T19:00:57-04:00
+---
+In this page, we explain how to use Hudi on Microsoft Azure.
+
+## Disclaimer
+
+This page is maintained by the Hudi community.
+If any information is inaccurate or you have more to add,
+please feel free to create a JIRA ticket. Contributions are highly appreciated.
+
+## Supported Storage System
+
+Two Azure storage systems support Hudi:
+
+- Azure Blob Storage
+- Azure Data Lake Gen 2
+
+## Verified Combinations of Spark and Storage System
+
+#### HDInsight Spark2.4 on Azure Data Lake Storage Gen 2
+This combination works out of the box. No extra config needed.
+
+#### Databricks Spark2.4 on Azure Data Lake Storage Gen 2
+- Import the Hudi jar into the Databricks workspace.
+
+- Mount the file system with `dbutils.fs.mount`.
+  ```scala
+  dbutils.fs.mount(
+    source = "abfss://[email protected]",
+    mountPoint = "/mountpoint",
+    extraConfigs = configs)
+  ```
+- When writing a Hudi dataset, use the abfss URL.
+  ```scala
+  inputDF.write
+    .format("org.apache.hudi")
+    .options(opts)
+    .mode(SaveMode.Append)
+    .save("abfss://<<storage-account>>.dfs.core.windows.net/hudi-tables/customer")
+  ```
+- When reading a Hudi dataset, use the mount point.
+  ```scala
+  spark.read
+    .format("org.apache.hudi")
+    .load("/mountpoint/hudi-tables/customer")
+  ```
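
Editorial note (not part of the commit): the Azure page added above passes `configs` to `dbutils.fs.mount` and `opts` to the Hudi writer without defining either. The sketch below shows one way they might look. It assumes OAuth service-principal authentication for the ABFS driver; the object name `AzureHudiConfigSketch`, the `<<...>>` placeholders, and the column names `id`, `ts`, and `region` are hypothetical and not taken from the commit.

```scala
// Hypothetical helper object; it is not part of Hudi or of this commit.
object AzureHudiConfigSketch {

  // Assumption: the storage account is accessed through an Azure AD service
  // principal (OAuth client credentials). Replace every <<...>> placeholder.
  val configs: Map[String, String] = Map(
    "fs.azure.account.auth.type" -> "OAuth",
    "fs.azure.account.oauth.provider.type" ->
      "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id" -> "<<application-id>>",
    "fs.azure.account.oauth2.client.secret" -> "<<client-secret>>",
    "fs.azure.account.oauth2.client.endpoint" ->
      "https://login.microsoftonline.com/<<tenant-id>>/oauth2/token"
  )

  // Assumption: the input DataFrame has columns named id, ts, and region.
  // The table name matches the "customer" path used in the page's examples.
  val opts: Map[String, String] = Map(
    "hoodie.table.name" -> "customer",
    "hoodie.datasource.write.recordkey.field" -> "id",
    "hoodie.datasource.write.precombine.field" -> "ts",
    "hoodie.datasource.write.partitionpath.field" -> "region"
  )
}
```

With these in scope, `AzureHudiConfigSketch.configs` could be passed as `extraConfigs` in the mount call and `AzureHudiConfigSketch.opts` as the argument to `.options(...)` in the write example. Other authentication schemes would need a different set of keys.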
diff --git a/docs/_docs/2_4_configurations.cn.md b/docs/_docs/2_4_configurations.cn.md
index 1b1af05..a61ba9a 100644
--- a/docs/_docs/2_4_configurations.cn.md
+++ b/docs/_docs/2_4_configurations.cn.md
@@ -29,6 +29,10 @@ language: cn
    S3和Hudi协同工作所需的配置。
  * [Google Cloud Storage](/cn/docs/gcs_hoodie.html) <br/>
    GCS和Hudi协同工作所需的配置。
+ * [Alibaba Cloud OSS](/cn/docs/oss_hoodie.html) <br/>
+   阿里云和Hudi协同工作所需的配置。
+ * [Microsoft Azure](/cn/docs/azure_hoodie.html) <br/>
+   Azure和Hudi协同工作所需的配置。
 
 ## Spark数据源配置 {#spark-datasource}
 
diff --git a/docs/_docs/2_4_configurations.md b/docs/_docs/2_4_configurations.md
index abb102b..b7ff7d1 100644
--- a/docs/_docs/2_4_configurations.md
+++ b/docs/_docs/2_4_configurations.md
@@ -26,6 +26,10 @@ to cloud stores.
    Configurations required for S3 and Hudi co-operability.
  * [Google Cloud Storage](/docs/gcs_hoodie) <br/>
    Configurations required for GCS and Hudi co-operability.
+ * [Alibaba Cloud OSS](/docs/oss_hoodie.html) <br/>
+   Configurations required for OSS and Hudi co-operability.
+ * [Microsoft Azure](/docs/azure_hoodie.html) <br/>
+   Configurations required for Azure and Hudi co-operability.
 
 ## Spark Datasource Configs {#spark-datasource}
