[GitHub] [hudi] pengzhiwei2018 commented on a change in pull request #3140: [HUDI-2063] Add Doc For Spark Sql Integrates With Hudi

GitBox Wed, 11 Aug 2021 04:59:28 -0700


pengzhiwei2018 commented on a change in pull request #3140:
URL: https://github.com/apache/hudi/pull/3140#discussion_r686760654




##########
File path: website/docs/quick-start-guide.md
##########
@@ -119,6 +142,85 @@ The 
[DataGenerator](https://github.com/apache/hudi/blob/master/hudi-spark/src/ma
 can generate sample inserts and updates based on the the sample trip schema 
[here](https://github.com/apache/hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L57)
 :::
 
+## Create Table
+
+Hudi support create table using spark-sql.
+
+**Create Non-Partitioned Table**
+```sql
+-- create a managed cow table
+create table if not exists h0(
+  id int, 
+  name string, 
+  price double
+) using hudi
+options (
+  type = 'cow',
+  primaryKey = 'id'
+);
+
+-- creae an external mor table
+create table if not exists h1(
+  id int, 
+  name string, 
+  price double,
+  ts bigint
+) using hudi
+location '/tmp/hudi/h0'  
+options (
+  type = 'mor',
+  primaryKey = 'id,name',
+  preCombineField = 'ts' 
+)
+;
+
+-- create a non-primary key table
+create table if not exists h2(
+  id int, 
+  name string, 
+  price double
+) using hudi
+options (
+  type = 'cow'
+);
+```
+**Create Partitioned Table**
+```sql
+create table if not exists h_p0 (
+id bigint,
+name string,
+dt string，
+hh string  
+) using hudi
+location '/tmp/hudi/h_p0'
+options (
+  type = 'cow',
+  primaryKey = 'id',
+  preCombineField = 'ts'
+ ) 
+partitioned by (dt, hh)
+;
+```
+**Create Table On The Exists Table Path**
+
+We can create a table on an exists hudi table path. This is useful to read/write from a non-sql hudi table by spark-sql.
+```sql
+ create table h_p1 using hudi 
+ options (
+    primaryKey = 'id',
+    preCombineField = 'ts'
+ )
+ partitioned by (dt)
+ location '/path/to/hudi'
+```
+
+**Create Table Options**
+
+| Parameter Name | Introduction |
+|------------|--------|
+| primaryKey | The primary key names of the table, multiple fields separated by commas. |
+| type       | The table type to create. type = 'cow' means a COPY-ON-WRITE table,while type = 'mor' means a MERGE-ON-READ table. Default value is 'cow' without specified this option.|
+| preCombineField | The Pre-Combine field of the table. |

Review comment:
       Well, I think making the Hudi config settings a separate section is more
reasonable. Users who want to know how to set Hudi configs through Spark SQL
can then easily find the information there. It is hard to associate this
topic with Create Table, although we can also set configs in the table options.
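
A minimal sketch of the two approaches contrasted in this comment, for illustration only: a session-level `set` command versus passing a config through the table options. The table name `h_cfg` is hypothetical, and the config key and value are examples chosen for this sketch rather than anything taken from the PR.

```sql
-- Approach 1: set a Hudi config for the current session with the SET command.
-- The key and value here are illustrative.
set hoodie.upsert.shuffle.parallelism = 100;

-- Approach 2: attach the config to a table through its options at creation time.
-- h_cfg is a hypothetical table name used only for this sketch.
create table if not exists h_cfg (
  id int,
  name string,
  price double
) using hudi
options (
  type = 'cow',
  primaryKey = 'id',
  hoodie.upsert.shuffle.parallelism = '100'
);
```

The first form applies to every statement in the session, while the second is tied to one table, which is why a reader looking for general config guidance may not think to look under Create Table.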

##########
File path: website/docs/quick-start-guide.md
##########
@@ -119,6 +142,85 @@ The 
[DataGenerator](https://github.com/apache/hudi/blob/master/hudi-spark/src/ma
 can generate sample inserts and updates based on the the sample trip schema 
[here](https://github.com/apache/hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/QuickstartUtils.java#L57)
 :::
 
+## Create Table
+
+Hudi support create table using spark-sql.
+
+**Create Non-Partitioned Table**

Review comment:
       Makes sense.




