nsivabalan commented on a change in pull request #1004: [HUDI-328] Adding delete api to HoodieWriteClient
URL: https://github.com/apache/incubator-hudi/pull/1004#discussion_r346684533
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/HoodieWriteClient.java
 ##########
 @@ -325,6 +326,31 @@ public static SparkConf registerClasses(SparkConf conf) {
     }
   }
 
+  /**
+   * Deletes a bunch of keys from the Hoodie table, at the supplied commitTime
+   */
 +  public JavaRDD<WriteStatus> delete(JavaRDD<HoodieKey> keys, final String commitTime) {
+    HoodieTable<T> table = getTableAndInitCtx();
+    try {
+      // De-dupe/merge if needed
+      JavaRDD<HoodieKey> dedupedKeys =
 +          combineKeysOnCondition(config.shouldCombineBeforeUpsert(), keys, config.getUpsertShuffleParallelism());
+
 +      JavaRDD<HoodieRecord<T>> dedupedRecords = generateHoodieRecordsToDeleteFromKeys(dedupedKeys);
+      indexTimer = metrics.getIndexCtx();
 +      // perform index lookup to get existing location of records
 +      JavaRDD<HoodieRecord<T>> taggedRecords = index.tagLocation(dedupedRecords, jsc, table);
 
 Review comment:
   Yes, my bad. I realized that this returns all records, hence the next line, where I filter for records that have a non-null location.
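
   To illustrate the point, here is a minimal sketch in plain Java (no Spark or Hudi dependencies) of that filtering step: tagging returns every input record, and only those with a known location are kept for deletion. `TaggedRecord` and `keepExisting` are hypothetical names used for illustration and are not part of this PR.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class TaggedRecordFilter {

    // Hypothetical stand-in for a record after index tagging.
    // location is null when the index found no existing entry for the key.
    static final class TaggedRecord {
        final String key;
        final String location;

        TaggedRecord(String key, String location) {
            this.key = key;
            this.location = location;
        }
    }

    // Keep only records whose location is known, i.e. keys that exist in the table.
    static List<TaggedRecord> keepExisting(List<TaggedRecord> tagged) {
        return tagged.stream()
            .filter(r -> r.location != null)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Tagging returns all input records, matched or not.
        List<TaggedRecord> tagged = Arrays.asList(
            new TaggedRecord("key1", "file-001"),
            new TaggedRecord("key2", null),
            new TaggedRecord("key3", "file-007"));

        for (TaggedRecord r : keepExisting(tagged)) {
            System.out.println(r.key + " -> " + r.location);
        }
    }
}
```

   Keys without a location are dropped here because there is nothing in the table to delete for them.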
