rdblue commented on issue #244: Detect new fields from Spark StructType 
compared to a reference Iceberg table and provide update schema commit
URL: https://github.com/apache/incubator-iceberg/pull/244#issuecomment-509023025
 
 
   Thanks for working on this, @fbocse. I think it is a good idea to have a way 
to detects changes and prepare an update. But I think we need to be careful 
about some of the choices made in this commit:
   * New features should be done in the core Java library so any client can use 
them, not just Spark
   * It looks like this only detects new fields. What about fields that are 
removed? Also, using a Spark schema for detection means that this can't use 
schema IDs, so it can't tell the difference between a rename and a drop 
followed by add. That's probably another reason to build change detection 
between Iceberg schemas.
   * Usually, we try to keep recursion logic and business logic separate by 
using the visitor pattern. I think that adding a visitor that traverses two 
Iceberg schemas at the same time, like the Parquet `TypeWithSchemaVisitor`, 
would help simplify the code and would be useful for other tasks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to