rajgowtham24 commented on issue #2075:
URL: https://github.com/apache/hudi/issues/2075#issuecomment-688806064
Hi @tooptoop4 ,
While reading the CSV file, I used the inferschema option as shown below:
input_df = spark.read.format("csv").option("header","true").option("inferschema","true").load("mybucket/sample.csv")
After reading the file, I validated the data types using input_df.dtypes,
and I can confirm that the VERSION column is of type int:
[('NAME', 'string'), ('VERSION', 'int'), ('CHANGED_BY', 'string')]
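The same check can be written as an assertion so that it fails loudly if schema inference ever produces a string column. This is only a minimal sketch: the helper name `column_type` is hypothetical, and the list literal is copied from the dtypes output above rather than read from a live DataFrame.

```python
def column_type(dtypes, name):
    """Return the type string for `name` from a DataFrame.dtypes-style list."""
    for col, typ in dtypes:
        if col == name:
            return typ
    raise KeyError(name)


# Literal copied from the input_df.dtypes output above (not a live DataFrame).
dtypes = [('NAME', 'string'), ('VERSION', 'int'), ('CHANGED_BY', 'string')]

# Fails with AssertionError if VERSION was not inferred as an integer.
assert column_type(dtypes, 'VERSION') == 'int'
```

With a real DataFrame, `input_df.dtypes` would be passed in place of the literal list.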
After writing this to the Hudi table, I can still see that Version 9 is
getting inserted, whereas the Hive table metadata is now synced with the
same data types as input_df.dtypes.
Let me know if you need any additional details.