Hi Igor,

This happens because Spark's ParquetFileFormat infers the schema from the
parquet file under the 20200205 dir, and that file does not contain the
added column (direction). You could try `val hudiDF2 =
spark.read.format("org.apache.hudi").option("mergeSchema",
"true").load("/tmp/hudi/drivers/*")` to get the schema merged from 20200205
and 20200206; it then shows the added column. I do not know whether this is
a common solution, but it solves the problem.
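In case it helps, here is a minimal sketch of the workaround end to end.
The paths are the ones from your gist; the SparkSession setup and app name
are illustrative, not something specific to Hudi:

// In spark-shell a SparkSession named `spark` already exists; the
// builder below is only needed in a standalone app.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hudi-merge-schema")
  .master("local[*]")
  .getOrCreate()

// Default read: Spark infers the schema from a single parquet footer
// (here the 20200205 partition), so `direction` is missing.
val hudiDF = spark.read
  .format("org.apache.hudi")
  .load("/tmp/hudi/drivers/*")
hudiDF.printSchema()

// With mergeSchema, Spark unions the schemas of all parquet files it
// reads, so the `direction` column added on 20200206 shows up.
val hudiDF2 = spark.read
  .format("org.apache.hudi")
  .option("mergeSchema", "true")
  .load("/tmp/hudi/drivers/*")
hudiDF2.printSchema()

Note that rows written before the column existed come back as null rather
than a default value.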

Best,
Leesf

Igor Basko <[email protected]> wrote on Wed, Feb 5, 2020 at 3:33 PM:

> Hi All,
> I've tried to write data with some schema changes using the Datasource
> Writer.
> The procedure was:
> First I wrote an event with a specific schema.
> After that I wrote a different event with the same schema but with one more
> added field.
>
> When I read from the Hudi table, I get both the events, with the original
> schema.
> I was expecting to get both events with the newer schema with some default
> value in the new
> field for the first event.
>
> I've created a gist that describes my experience:
> https://gist.github.com/igorbasko01/4a1d0cf7c06a5b216382260efaa1f333
>
> I would like to know if schema evolution is supported using the Datasource
> Writer, or whether I'm doing something wrong.
>
> Thanks a lot.
>
