KnightChess opened a new issue, #5248:
URL: https://github.com/apache/hudi/issues/5248
When I use Spark SQL to create a table and set
**hoodie.datasource.write.operation**=upsert, a delete SQL statement
(like pr #5215), an insert overwrite SQL statement, etc. will still use
**hoodie.datasource.write.operation** to upsert records, not delete,
insert_overwrite, etc.

eg:
create a table and set hoodie.datasource.write.operation to upsert.
When I then run a SQL delete, the operation key set by the delete command,
**OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL**,
is overwritten by hoodie.datasource.write.operation from the table or
environment, so it takes no effect and the operation becomes **upsert**.
```scala
withSparkConf(sparkSession, hoodieCatalogTable.catalogProperties) {
  Map(
    "path" -> path,
    RECORDKEY_FIELD.key -> hoodieCatalogTable.primaryKeys.mkString(","),
    TBL_NAME.key -> tableConfig.getTableName,
    HIVE_STYLE_PARTITIONING.key -> tableConfig.getHiveStylePartitioningEnable,
    URL_ENCODE_PARTITIONING.key -> tableConfig.getUrlEncodePartitioning,
    KEYGENERATOR_CLASS_NAME.key -> classOf[SqlKeyGenerator].getCanonicalName,
    SqlKeyGenerator.ORIGIN_KEYGEN_CLASS_NAME -> tableConfig.getKeyGeneratorClassName,
    OPERATION.key -> DataSourceWriteOptions.DELETE_OPERATION_OPT_VAL,
    PARTITIONPATH_FIELD.key -> tableConfig.getPartitionFieldProp,
    HiveSyncConfig.HIVE_SYNC_MODE.key -> HiveSyncMode.HMS.name(),
    HiveSyncConfig.HIVE_SUPPORT_TIMESTAMP_TYPE.key -> "true",
    HoodieWriteConfig.DELETE_PARALLELISM_VALUE.key -> "200",
    SqlKeyGenerator.PARTITION_SCHEMA -> partitionSchema.toDDL
  )
}
```
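The root cause is a precedence problem: when the final write options are assembled, the `hoodie.datasource.write.operation` value from the table properties or environment is applied on top of the command-generated options. A minimal sketch (hypothetical names, not Hudi's actual code) of the precedence a fix could enforce, where command-generated entries always win on key collision:

```scala
// Hypothetical sketch: merge write options so that values generated by the
// SQL command (e.g. OPERATION -> delete) take precedence over the same keys
// coming from hoodie.properties / table config / environment.
object OperationPrecedence {
  val OPERATION_KEY = "hoodie.datasource.write.operation"

  // In `m1 ++ m2`, entries from the right-hand map win on duplicate keys,
  // so the command options override the table/env defaults.
  def mergeWriteOptions(
      tableOrEnvProps: Map[String, String],
      commandOptions: Map[String, String]): Map[String, String] =
    tableOrEnvProps ++ commandOptions

  def main(args: Array[String]): Unit = {
    val fromTable = Map(OPERATION_KEY -> "upsert")          // set at CREATE TABLE
    val fromDeleteCommand = Map(OPERATION_KEY -> "delete")  // set by DELETE command
    val merged = mergeWriteOptions(fromTable, fromDeleteCommand)
    println(merged(OPERATION_KEY)) // prints "delete"
  }
}
```

With the current behavior the merge effectively goes the other way, so the upsert value from the table shadows the delete the command asked for.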
So, when using SQL, what about not writing hoodie.datasource.write.operation
to hoodie.properties at all? It could be rejected during SQL validation,
since each SQL command generates the operation itself at runtime.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]