lw309637554 commented on a change in pull request #2196:
URL: https://github.com/apache/hudi/pull/2196#discussion_r510838979
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala
##########
@@ -93,6 +93,11 @@ private[hudi] object HoodieSparkSqlWriter {
operation = WriteOperationType.INSERT
}
+ // If the mode is Overwrite, should use INSERT_OVERWRITE operation
Review comment:
@satishkotha thanks,
1. I think the method is OK. In HoodieSparkSqlWriter.scala, if we set the operation to the replace action, it still ends up calling `client.insertOverwrite(hoodieRecords, instantTime)`, so it behaves as "insert overwrite only updates partitions":
```java
public static HoodieWriteResult doWriteOperation(SparkRDDWriteClient client, JavaRDD<HoodieRecord> hoodieRecords,
                                                 String instantTime, WriteOperationType operation) throws HoodieException {
  switch (operation) {
    case BULK_INSERT:
      Option<BulkInsertPartitioner> userDefinedBulkInsertPartitioner =
          createUserDefinedBulkInsertPartitioner(client.getConfig());
      return new HoodieWriteResult(client.bulkInsert(hoodieRecords, instantTime, userDefinedBulkInsertPartitioner));
    case INSERT:
      return new HoodieWriteResult(client.insert(hoodieRecords, instantTime));
    case UPSERT:
      return new HoodieWriteResult(client.upsert(hoodieRecords, instantTime));
    case INSERT_OVERWRITE:
      return client.insertOverwrite(hoodieRecords, instantTime);
    default:
```
2. HUDI-1350 (https://issues.apache.org/jira/browse/HUDI-1350) is a separate task.
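To make the dispatch above concrete, here is a minimal, self-contained Java sketch of the same switch. The enum and method names only mirror Hudi's `WriteOperationType` / `doWriteOperation`; the string return values stand in for the real client calls, so nothing here depends on Hudi itself:

```java
public class WriteDispatchSketch {
    // Simplified stand-in for Hudi's WriteOperationType enum (illustrative only).
    enum WriteOperationType { BULK_INSERT, INSERT, UPSERT, INSERT_OVERWRITE }

    // Mirrors the switch in doWriteOperation: each operation routes to one
    // client method, and INSERT_OVERWRITE routes to insertOverwrite, which
    // replaces only the partitions touched by the incoming records.
    static String dispatch(WriteOperationType op) {
        switch (op) {
            case BULK_INSERT:
                return "client.bulkInsert";
            case INSERT:
                return "client.insert";
            case UPSERT:
                return "client.upsert";
            case INSERT_OVERWRITE:
                return "client.insertOverwrite";
            default:
                throw new IllegalArgumentException("Unsupported operation: " + op);
        }
    }

    public static void main(String[] args) {
        // Overwrite mode mapped to INSERT_OVERWRITE lands on insertOverwrite.
        System.out.println(dispatch(WriteOperationType.INSERT_OVERWRITE));
    }
}
```

The point of the sketch: once the mode is mapped to `INSERT_OVERWRITE`, the rest of the write path needs no special-casing, because the existing switch already funnels it to the partition-level overwrite call.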
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]