[GitHub] [hudi] brandon-stanley opened a new issue #2213: How to drop `hoodie.datasource.write.partitionpath.field` fields from a Hudi Dataset?

GitBox Wed, 28 Oct 2020 17:54:53 -0700


brandon-stanley opened a new issue #2213:
URL: https://github.com/apache/hudi/issues/2213



   Hi Hudi Team,
   
   Is it possible to change how Hudi handles the columns named in the 
`hoodie.datasource.write.partitionpath.field` configuration for a table? The 
data is partitioned as expected, but the written dataset still contains the 
columns specified in `hoodie.datasource.write.partitionpath.field`. This 
differs from Spark's native `DataFrameWriter.partitionBy` (for example, 
`df.write.partitionBy(...)`), which partitions the data by the specified 
columns and removes those columns from the written data files, keeping their 
values only in the directory paths. Is there a way to make Hudi match this 
behaviour?
   
   Here is an example of the behaviour I am referring to: 
https://stackoverflow.com/questions/36164914/prevent-dataframe-partitionby-from-removing-partitioned-columns-from-schema/47104251
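
To make the comparison concrete, here is a minimal PySpark sketch of the two writes being contrasted. The table name `trips`, the column names `uuid`, `region` and `ts`, and the helper function names are all hypothetical and chosen only for illustration; the option keys are the standard Hudi datasource write options. It assumes a Spark session with the Hudi bundle on the classpath and is not a definitive reproduction.

```python
# Hypothetical table and column names, for illustration only.
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build the Hudi datasource write options discussed in this issue."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


options = hudi_write_options("trips", "uuid", "region", "ts")


def write_with_hudi(df, base_path):
    # Hudi write: files are laid out by the value of `region`, and (the
    # behaviour described above) the data files still carry a `region` column.
    df.write.format("hudi").options(**options).mode("overwrite").save(base_path)


def write_with_spark(df, base_path):
    # Plain Spark write: files land under <base_path>/region=<value>/ and the
    # `region` column is not stored inside the parquet files themselves.
    df.write.partitionBy("region").mode("overwrite").parquet(base_path)
```

Calling `write_with_hudi(df, path)` and `write_with_spark(df, other_path)` on the same DataFrame and then inspecting the schema of an individual parquet file in each location should show the difference I am asking about.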
   
   Cheers,
   
   Brandon Stanley


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
