Hi Rahul,
Dropping a column would not be backwards compatible. As a workaround for now,
you can keep the column in the dataframe as an empty (dummy) column. I have
filed a JIRA (https://issues.apache.org/jira/browse/HUDI-1440) to provide an
option to override the schema.
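A minimal sketch of that workaround in PySpark, assuming the writer knows the table's existing columns and their Spark types, and that the dropped column is nullable in the table schema. The helper names `missing_columns` and `pad_dropped_columns` are hypothetical and not part of Hudi:

```python
def missing_columns(table_columns, df_columns):
    """Return the table columns absent from the dataframe, in table order."""
    present = set(df_columns)
    return [c for c in table_columns if c not in present]


def pad_dropped_columns(df, table_schema):
    """Add back any dropped column as an all-null column before writing.

    df           -- a pyspark.sql.DataFrame
    table_schema -- list of (column_name, spark_type_string) for the existing
                    table, e.g. [("id", "long"), ("name", "string")]
    """
    # Imported lazily so missing_columns() can be used without Spark installed.
    from pyspark.sql import functions as F

    types = dict(table_schema)
    table_columns = [name for name, _ in table_schema]
    for name in missing_columns(table_columns, df.columns):
        # A typed null literal stands in for the dropped column.
        df = df.withColumn(name, F.lit(None).cast(types[name]))
    # Existing columns first, in their original order; new ones at the end.
    new_columns = [c for c in df.columns if c not in types]
    return df.select(table_columns + new_columns)
```

For example, `missing_columns(["id", "name", "email"], ["id", "name"])` returns `["email"]`, so `email` would be written as an all-null column.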
Balaji.V
On Monday, December 7, 2020, 08:04:20 PM PST, Rahul Narayanan
<[email protected]> wrote:
Hi Balaji,
Adding a new column works, but when I try to remove a column by inserting a
new data frame with that column removed, it does not work.
On Mon, Dec 7, 2020 at 2:51 PM Balaji Varadarajan <[email protected]>
wrote:
Hi Rahul,
With a Spark data frame, the schema is deduced automatically. If you write a
dataframe with a schema that is backwards compatible (e.g. with a new column
added at the end), it should work seamlessly.
Are you seeing any problems with this approach?
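As a rough illustration of "new column added at the end", the sketch below checks that a new schema keeps every existing column unchanged and only appends columns. It is a deliberately strict simplification with a hypothetical helper name; Hudi's own compatibility check may accept more than this. In PySpark, the ordered (name, type) pairs can be taken from `df.dtypes`.

```python
def is_append_only_evolution(old_fields, new_fields):
    """Check that new_fields keeps every old field and only appends new ones.

    Both arguments are ordered lists of (column_name, type_name) tuples.
    """
    if len(new_fields) < len(old_fields):
        return False  # at least one column was dropped
    # The old columns must appear unchanged as a prefix of the new schema.
    return list(new_fields[:len(old_fields)]) == list(old_fields)


old = [("id", "bigint"), ("name", "string")]
added = old + [("email", "string")]   # new column appended at the end
dropped = [("id", "bigint")]          # "name" removed

print(is_append_only_evolution(old, added))
print(is_append_only_evolution(old, dropped))
```

The first call covers the case described as working seamlessly; the second covers the column removal that fails.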
Thanks,
Balaji.V
On Wednesday, December 2, 2020, 10:16:52 PM PST, Rahul Narayanan
<[email protected]> wrote:
Hi Team,
We are interested in adding new columns, and maybe removing some columns in
the future, in our dataset. I have read that Hudi supports schema evolution if
it is backward compatible. As a PoC, I tried writing a Spark data frame to
Hudi with an explicit schema, but it’s failing. How do I write a Spark data
frame to Hudi while specifying the schema explicitly?
Thanks in advance
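For reference, a minimal sketch of a Spark data frame write to Hudi in which the schema comes from the data frame itself rather than being passed separately. The helper `hudi_write_options`, the table name, the field names, and the path are all hypothetical; the option keys are standard Hudi datasource write configs, but the right set may vary by Hudi version.

```python
def hudi_write_options(table_name, record_key, precombine_field, partition_field):
    """Build a basic set of Hudi datasource options for an upsert write."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": "upsert",
    }


# Hypothetical table and field names, for illustration only.
opts = hudi_write_options("users", "id", "updated_at", "country")

# With an existing DataFrame `df` and Spark started with the Hudi bundle:
# df.write.format("hudi").options(**opts).mode("append").save("/tmp/hudi/users")
```

No schema is passed in the options; the write uses whatever schema `df` carries.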