parisni commented on issue #5751:
URL: https://github.com/apache/hudi/issues/5751#issuecomment-1146897023

   As a workaround, I am able to use both bulk-insert and OCC by setting:
   ```
   hoodie.datasource.write.row.writer.enable=false
   ```
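
   For reference, a minimal sketch of how that workaround would look from the Spark datasource (table name, path, and lock-provider settings are illustrative placeholders, not from the issue):
   ```scala
   // Hypothetical bulk-insert with OCC, row writer disabled as a workaround.
   // Real OCC setups also need a lock provider configured
   // (hoodie.write.lock.provider); omitted here for brevity.
   df.write.format("hudi")
     .option("hoodie.table.name", "my_table")                 // placeholder
     .option("hoodie.datasource.write.operation", "bulk_insert")
     .option("hoodie.datasource.write.row.writer.enable", "false")
     .option("hoodie.write.concurrency.mode", "optimistic_concurrency_control")
     .mode("append")
     .save("/tmp/hudi/my_table")                              // placeholder
   ```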
   
   
   From the code, there are two ways of bulk-inserting: `bulkInsertAsRow`, which leads to the NPE with OCC, and `bulkInsert`, which handles OCC correctly.
   
   ```scala
   if (hoodieConfig.getBoolean(ENABLE_ROW_WRITER) &&
     operation == WriteOperationType.BULK_INSERT) {
     val (success, commitTime: common.util.Option[String]) = bulkInsertAsRow(sqlContext, parameters, df, tblName,
       basePath, path, instantTime, partitionColumns)
   ```
   
   
   ```java
   public JavaRDD<WriteStatus> bulkInsert(JavaRDD<HoodieRecord<T>> records, String instantTime,
       Option<BulkInsertPartitioner> userDefinedBulkInsertPartitioner) {
     HoodieTable<T, HoodieData<HoodieRecord<T>>, HoodieData<HoodieKey>, HoodieData<WriteStatus>> table =
         initTable(WriteOperationType.BULK_INSERT, Option.ofNullable(instantTime));
     table.validateInsertSchema();
     preWrite(instantTime, WriteOperationType.BULK_INSERT, table.getMetaClient());
     HoodieWriteMetadata<HoodieData<WriteStatus>> result =
         table.bulkInsert(context, instantTime, HoodieJavaRDD.of(records), userDefinedBulkInsertPartitioner);
     HoodieWriteMetadata<JavaRDD<WriteStatus>> resultRDD =
         result.clone(HoodieJavaRDD.getJavaRDD(result.getWriteStatuses()));
     return postWrite(resultRDD, instantTime, table);
   }
   ```
   
   We should make the former (`bulkInsertAsRow`) aware of OCC to fix this bug.

