Hi Rahul,

On the specific scenario, if you could raise a GitHub support issue with the
steps/stacktrace, we can certainly help out.

On the first part, we have relied on Avro schema evolution/compatibility
thus far: rather than dropping old columns, you null them out (which is
very cheap in Parquet's columnar storage anyway).
For tools like DeltaStreamer, compatibility is enforced by the external
schema registries. However, you are right that the Spark DataFrame path may
need some more work.
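For illustration, the "null out the old columns" approach above can be sketched in plain Python (no Avro library; the schemas and field names here are made-up examples, not anything from the actual dataset):

```python
# Hypothetical schemas illustrating backward-compatible evolution: instead of
# dropping a column, the writer keeps it in the schema and writes null, so
# readers with the older schema can still resolve every field they expect.

OLD_SCHEMA = ["id", "name", "legacy_score"]           # original table schema
NEW_SCHEMA = ["id", "name", "legacy_score", "email"]  # column added, none removed

def write_record(record: dict, schema: list) -> dict:
    """Project a record onto the schema, nulling out columns the producer
    no longer populates (cheap in Parquet's columnar storage)."""
    return {field: record.get(field) for field in schema}

# New writer no longer populates legacy_score; it is carried as null.
new_row = write_record({"id": 1, "name": "a", "email": "a@x"}, NEW_SCHEMA)
assert new_row == {"id": 1, "name": "a", "legacy_score": None, "email": "a@x"}

# An old reader that only knows OLD_SCHEMA can still read the new row.
old_view = {field: new_row[field] for field in OLD_SCHEMA}
assert old_view == {"id": 1, "name": "a", "legacy_score": None}
```

This is the compatibility contract a schema registry enforces for the DeltaStreamer path; the Spark DataFrame path has to uphold the same contract.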

Happy to work through this with you on a ticket as well.

thanks
vinoth

On Mon, Dec 7, 2020 at 12:50 PM Rahul Narayanan <[email protected]>
wrote:

> ---------- Forwarded message ---------
> From: Rahul Narayanan <[email protected]>
> Date: Thu, Dec 3, 2020 at 11:46 AM
> Subject: Schema evolution in hudi
> To: [email protected] <[email protected]>
>
>
> Hi Team,
>
> We are interested in adding new columns, and maybe removing some columns, in
> our dataset in the future. I have read that Hudi supports schema evolution if
> it is backward compatible. To do a PoC, I tried writing a Spark DataFrame to
> Hudi with an explicit schema, but it's failing. How do I write a Spark
> DataFrame to Hudi specifying the schema explicitly?
>
> Thanks in advance
>