wangxianghu commented on a change in pull request #3488:
URL: https://github.com/apache/hudi/pull/3488#discussion_r690429555
##########
File path:
hudi-examples/src/main/scala/org/apache/hudi/examples/spark/HoodieDataSourceExample.scala
##########
@@ -129,6 +130,43 @@ object HoodieDataSourceExample {
save(tablePath)
}
+ /**
+ * Delete data based on the given record information
+ */
+ def delete(spark: SparkSession, tablePath: String, tableName: String): Unit = {
+
+ val roViewDF = spark.read.format("org.apache.hudi").load(tablePath + "/*/*/*/*")
+ roViewDF.createOrReplaceTempView("hudi_ro_table")
+ val df = spark.sql("select uuid, partitionpath, ts from hudi_ro_table limit 2")
+
+ df.write.format("org.apache.hudi").
+ options(getQuickstartWriteConfigs).
+ option(PRECOMBINE_FIELD.key, "ts").
+ option(RECORDKEY_FIELD.key, "uuid").
+ option(PARTITIONPATH_FIELD.key, "partitionpath").
+ option(TABLE_NAME.key, tableName).
+ option(OPERATION.key, DELETE_OPERATION_OPT_VAL).
+ mode(Append).
+ save(tablePath)
+ }
+
+ /**
+ * Delete the data of a single or multiple partitions
Review comment:
same here
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]