[GitHub] [hudi] Priyanka128 opened a new issue, #7242: [SUPPORT] Partition field value lost in table column

GitBox Fri, 18 Nov 2022 04:24:24 -0800


Priyanka128 opened a new issue, #7242:
URL: https://github.com/apache/hudi/issues/7242


   I am running a spark-submit job to populate data into Hudi tables from Kafka 
topics. I have the following properties set in my table-config.properties file:
   hoodie.datasource.write.partitionpath.field=partitionFieldColumn
   hoodie.datasource.hive_sync.table=tabledata
   hoodie.datasource.hive_sync.partition_fields=partitionFieldColumn
   hoodie.deltastreamer.keygen.timebased.timestamp.type=EPOCHMILLISECONDS
   hoodie.deltastreamer.keygen.timebased.output.dateformat=yyyy/MM/dd
   
   I am using "partitionFieldColumn", which is of datetime type, as the 
partition field. From it I want three levels of partitioning (year -> month -> 
day). To keep the time of day out of the partition path, the 
"hoodie.deltastreamer.keygen.timebased.output.dateformat" property is set to a 
date-only format. This produces the correct partition levels, but the 
"partitionFieldColumn" column created in the "tabledata" table also has its 
time component truncated, which is data loss.
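   
   For illustration, here is a minimal Python sketch of the behaviour described 
above. It is not Hudi code: the function name `partition_path`, the sample 
timestamp, and the use of UTC are assumptions made for the example. It only 
shows that formatting an epoch-milliseconds value with a date-only pattern 
(the Python equivalent of yyyy/MM/dd) gives the three partition levels and 
drops the time of day, so the time survives only if the original value is kept 
alongside the formatted one.

```python
from datetime import datetime, timezone


def partition_path(epoch_millis: int, fmt: str = "%Y/%m/%d") -> str:
    """Format an epoch-milliseconds value as a year/month/day path.

    Hypothetical helper for illustration; UTC is an assumed time zone.
    """
    ts = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    return ts.strftime(fmt)


# Sample value (assumed): 2022-11-18 12:24:24 UTC in epoch milliseconds.
event_millis = 1668774264000

# The formatted value keeps only the date, so the time of day is gone.
path = partition_path(event_millis)

# The original value still carries the full timestamp.
full = datetime.fromtimestamp(event_millis / 1000, tz=timezone.utc)

print(path)              # date-only partition path
print(full.isoformat())  # full timestamp, including the time of day
```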
   
   Is there any way to retain the complete value of the "partitionFieldColumn" 
without truncating the time field?


