ad1happy2go commented on issue #7503:
URL: https://github.com/apache/hudi/issues/7503#issuecomment-1547810419

   @gaoshihang I was not able to reproduce this error. I tried multiple
scenarios with both Hudi 0.13.0 and 0.10.1.
   Are you still facing this issue? If so, can you try reproducing it with
mock data and share the steps with us?
   
   Here is the code I used, for reference. Please let me know if you run into any issues.
   
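   ```
   from pyspark.sql import Row
   from pyspark.sql.functions import to_date
   from pyspark.sql.types import IntegerType, StringType, StructField, StructType
   
   # Assumes `spark`, `table_name`, `db_name` and `path` are already defined.
   common_config = {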
       "hoodie.datasource.write.table.type": "COPY_ON_WRITE",
       "hoodie.datasource.write.recordkey.field": "id",
       "hoodie.datasource.write.precombine.field": "timestamp",
       "hoodie.datasource.write.partitionpath.field": "timestamp__date_",
       'hoodie.table.name': table_name,
       "hoodie.payload.ordering.field": "timestamp",
       'hoodie.datasource.hive_sync.mode': 'hms',
       'hoodie.datasource.hive_sync.use_jdbc': 'false',
   }
   
   schema = StructType([
       StructField("id", IntegerType(), True),
       StructField("b", IntegerType(), True),
       StructField("c", StringType(), True),
       StructField("d", StringType(), True),
       StructField("timestamp", StringType(), True),
       StructField("f", StringType(), True)
   ])
   
   # Alternative multi-row test data:
   # df = spark.createDataFrame([
   #     Row(id=3, b=2, c='string1', d="2000-01-01", timestamp="2000-01-01 12:00:00", f="2000-01-01 12:00:00"),
   #     Row(id=5, b=3, c='string2', d="2000-02-01", timestamp="2000-01-01 12:00:00", f="2000-01-01 12:21:20"),
   #     Row(id=6, b=5, c='string3', d="2000-03-01", timestamp="2000-01-01 12:00:00", f="2000-01-03 12:00:00")
   # ], schema)
   
   df = spark.createDataFrame([
       Row(id=5, b=3, c='string2', d="2000-02-01", timestamp="2000-01-01 12:00:00", f="2000-01-01 12:21:20")
   ], schema)
   
   df = df.withColumn("timestamp__date_", to_date(df["timestamp"]))
   # First write: upsert the record into the table.
   df.write.format("org.apache.hudi") \
       .options(**common_config) \
       .option("hoodie.datasource.hive_sync.database", db_name) \
       .option("hoodie.datasource.hive_sync.table", table_name) \
       .option("hoodie.datasource.hive_sync.partition_fields", "timestamp__date_") \
       .option("hoodie.datasource.write.operation", "upsert") \
       .mode("append") \
       .save(path)
   
   # Second write: run delete_partition against the same table.
   df.write.format("org.apache.hudi") \
       .options(**common_config) \
       .option("hoodie.datasource.hive_sync.database", db_name) \
       .option("hoodie.datasource.hive_sync.table", table_name) \
       .option("hoodie.datasource.hive_sync.partition_fields", "timestamp__date_") \
       .option("hoodie.datasource.write.operation", "delete_partition") \
       .mode("append") \
       .save(path)
   ```

