umehrot2 commented on issue #1406: [HUDI-713] Fix conversion of Spark array of 
struct type to Avro schema
URL: https://github.com/apache/incubator-hudi/pull/1406#issuecomment-604137304
 
 
   > Sorry did not mean to hijack this fix.. Just trying to understand how 
it'll break compatibility while we are here.. All this schema namespace business 
is only before writing parquet files right... Once you are able to write 
parquet, it should be readable by parquet-avro for merging? (which has nothing 
to do with apache-spark-avro or databricks-spark-avro)... what causes the 
breakage?
   
   The only cause I can think of is that the old namespace is stored under 
`parquet.avro.schema` in the actual parquet file, so it might conflict with a 
new schema that has a different namespace. 
   @zhedoubushishi is looking into this.
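
A minimal sketch of the concern above, not taken from Hudi or the Avro library: it only computes Avro full names (namespace plus `.` plus name, with nested records inheriting the enclosing namespace) for two otherwise identical schemas. The schema contents, the `old.ns` / `new.ns` namespaces, and the helper `record_full_names` are all hypothetical. Whether a given reader actually rejects the mismatch depends on the Avro version and on where the record appears (for example inside a union), which this sketch does not model.

```python
import copy

def record_full_names(schema, enclosing_ns=None, out=None):
    """Collect the Avro full name of every record in a parsed schema (dict/list form)."""
    if out is None:
        out = []
    if isinstance(schema, list):            # union: visit each branch
        for branch in schema:
            record_full_names(branch, enclosing_ns, out)
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "record":
            name = schema["name"]
            if "." in name:                 # name is already fully qualified
                ns = name.rsplit(".", 1)[0]
                full = name
            else:                           # explicit namespace, else inherit
                ns = schema.get("namespace", enclosing_ns)
                full = f"{ns}.{name}" if ns else name
            out.append(full)
            for field in schema.get("fields", []):
                record_full_names(field["type"], ns, out)
        elif kind == "array":
            record_full_names(schema["items"], enclosing_ns, out)
        elif kind == "map":
            record_full_names(schema["values"], enclosing_ns, out)
        elif isinstance(kind, (dict, list)):  # nested type definition
            record_full_names(kind, enclosing_ns, out)
    return out

# Hypothetical schema as it might have been embedded in an existing parquet file:
# an array of nullable structs, similar in shape to the case this PR touches.
old_schema = {
    "type": "record", "name": "trip", "namespace": "old.ns",
    "fields": [{
        "name": "stops",
        "type": {"type": "array", "items": ["null", {
            "type": "record", "name": "stop", "namespace": "old.ns.trip",
            "fields": [{"name": "city", "type": "string"}],
        }]},
    }],
}

# Same structure, but generated under a different (hypothetical) namespace.
new_schema = copy.deepcopy(old_schema)
new_schema["namespace"] = "new.ns"
new_schema["fields"][0]["type"]["items"][1]["namespace"] = "new.ns.trip"

old_names = record_full_names(old_schema)
new_names = record_full_names(new_schema)
mismatched = [(o, n) for o, n in zip(old_names, new_names) if o != n]
print(mismatched)
```

Every record in the two schemas has the same fields, yet none of the full names agree, which is the kind of difference that could surface when a file written with the old namespace is read against the new schema.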
   
   One good thing is that it should at least not affect users of 
`FilebasedSchemaProvider` or `SchemaRegistryProvider` with `DeltaStreamer`, in 
which case, from what I see, we directly use the schema that the user has passed.
