vinothchandar commented on a change in pull request #1248: Adding delete docs to QuickStart
URL: https://github.com/apache/incubator-hudi/pull/1248#discussion_r368248651
 
 

 ##########
 File path: docs/quickstart.md
 ##########
 @@ -109,6 +109,57 @@ Notice that the save mode is now `Append`. In general, always use append mode un
 [Querying](#query) the data again will now show updated trips. Each write operation generates a new [commit](http://hudi.incubator.apache.org/concepts.html) 
 denoted by the timestamp. Look for changes in `_hoodie_commit_time`, `rider`, `driver` fields for the same `_hoodie_record_key`s in previous commit. 
 
+## Delete data {#deletes}
+Delete records for the HoodieKeys passed in. Let's first generate a new batch of inserts and then delete them. Query to verify
+that all records are deleted.
+
+```
+val inserts = convertToStringList(dataGen.generateInserts(10))
+val df = spark.read.json(spark.sparkContext.parallelize(inserts, 2))
+df.write.format("org.apache.hudi").
+    options(getQuickstartWriteConfigs).
+    option(PRECOMBINE_FIELD_OPT_KEY, "ts").
+    option(RECORDKEY_FIELD_OPT_KEY, "uuid").
+    option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
+    option(TABLE_NAME, tableName).
+    mode(Overwrite).
+    save(basePath);
+
+// Fetch the rider value for the batch of records inserted just now
+val roDeleteViewDF = spark.
+    read.
+    format("org.apache.hudi").
+    load(basePath + "/*/*/*/*")
+roDeleteViewDF.registerTempTable("hudi_ro_table")
+spark.sql("select distinct rider from hudi_ro_table").show()
+
+// replace the rider value in the query below with a value from the output above. "rider-213" is from the first batch and "rider-284" is from the second batch.
+val ds = spark.sql("select uuid, partitionPath from hudi_ro_table where rider = 'rider-284'")
+
+// issue deletes
 
 Review comment:
   Let's have it after the incremental query. Deletes will conclude the flow of writing and reading nicely.

