[GitHub] [iceberg] aokolnychyi commented on issue #2068: Procedure for adding files to a Table

GitBox Fri, 05 Feb 2021 10:29:47 -0800


aokolnychyi commented on issue #2068:
URL: https://github.com/apache/iceberg/issues/2068#issuecomment-774208295



   I finally found time to get through recent comments. I believe the 
discussion is moving in the right direction. 
   
   > 1. Add name mapping to the Iceberg spec so that it is well-defined and we 
have test cases to validate
   > 2. Document how name mappings change when a schema evolves (allows adding 
aliases)
   
   +1. We have tested how name mappings behave when a schema is evolved (e.g. 
simple cases like adding columns in the middle). That seems to work fine even 
when the imported files don't have column ids.
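
For illustration, the behaviour described above can be sketched as a name-based lookup. This is a simplified model, not Iceberg's implementation: the `resolve_columns` helper, the column names, and the flat (top-level only) mapping are assumptions made for the example. The mapping entries follow the `field-id` / `names` layout of the name mapping JSON.

```python
def resolve_columns(file_columns, name_mapping):
    """Map column names found in a data file to table field ids.

    Hypothetical helper: handles top-level fields only and returns None
    for any file column that the name mapping does not cover.
    """
    by_name = {}
    for entry in name_mapping:
        for name in entry["names"]:
            by_name[name] = entry["field-id"]
    return {col: by_name.get(col) for col in file_columns}


# Mapping after two schema changes: "category" (id 3) was added in the
# middle, and "data" (id 2) was renamed to "payload", keeping the old
# name as an alias.
name_mapping = [
    {"field-id": 1, "names": ["id"]},
    {"field-id": 3, "names": ["category"]},
    {"field-id": 2, "names": ["data", "payload"]},
]

# A file imported before the changes, written without column ids.
old_file = resolve_columns(["id", "data"], name_mapping)

# A file written after the changes.
new_file = resolve_columns(["id", "category", "payload"], name_mapping)
```

Because the lookup is by name rather than by position, the older file still resolves `id` and `data` to field ids 1 and 2 even though a column was added between them.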
   
   > 3. Make sure that when we import files, there is a name mapping set for 
the table.
   
   I think we already do this in the snapshot and migrate procedures for 
Spark. If we add `add_files`, it can assign a default name mapping too.
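
As a rough sketch of what assigning a default name mapping could look like, the snippet below derives one entry per field from a schema and serializes it into the `schema.name-mapping.default` table property. The `default_name_mapping` helper and the dict-based schema description are assumptions made for the example; they are not the code used by the Spark procedures.

```python
import json


def default_name_mapping(fields):
    """Build a name mapping with one entry per field, using current names.

    `fields` is a hypothetical schema description: a list of dicts with
    "id", "name", and an optional nested "fields" list for structs.
    """
    mapping = []
    for field in fields:
        entry = {"field-id": field["id"], "names": [field["name"]]}
        if field.get("fields"):
            entry["fields"] = default_name_mapping(field["fields"])
        mapping.append(entry)
    return mapping


schema = [
    {"id": 1, "name": "id"},
    {"id": 2, "name": "location", "fields": [
        {"id": 3, "name": "lat"},
        {"id": 4, "name": "long"},
    ]},
]

mapping = default_name_mapping(schema)

# The serialized mapping is stored as a table property.
table_properties = {"schema.name-mapping.default": json.dumps(mapping)}
```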
   
   > 4. Build correct metadata from imported files based on the name mapping.
   
   We try to respect name mappings while importing files in Spark (I am not 
sure about Trino), but we trust that the name mapping is correct.
   
   > 5. Identify problems with the name mapping, like files with no readable / 
mapped fields or incompatible data types
   
   It sounds like we should enhance our import code, which currently reads 
file footers to import stats, so that it also performs schema checks?
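
A schema check of that kind might look roughly like the sketch below. It is only an illustration under stated assumptions: `check_file_schema`, the string type names, and the flat mapping are hypothetical, and the promotion set lists just the int-to-long and float-to-double widenings rather than every rule Iceberg allows.

```python
# Widening promotions treated as compatible in this sketch (not exhaustive).
ALLOWED_PROMOTIONS = {("int", "long"), ("float", "double")}


def check_file_schema(file_columns, name_mapping, table_types):
    """Return a list of problems found for one imported file.

    file_columns: {column name: type name} as read from the file footer.
    name_mapping: list of {"field-id": int, "names": [str, ...]} entries.
    table_types:  {field id: type name} from the table schema.
    """
    by_name = {n: e["field-id"] for e in name_mapping for n in e["names"]}
    mapped = {c: by_name[c] for c in file_columns if c in by_name}

    problems = []
    if not mapped:
        problems.append("no columns in the file are covered by the name mapping")

    for column, field_id in mapped.items():
        file_type = file_columns[column]
        table_type = table_types[field_id]
        compatible = (
            file_type == table_type
            or (file_type, table_type) in ALLOWED_PROMOTIONS
        )
        if not compatible:
            problems.append(
                f"column '{column}' has type {file_type}, "
                f"but field {field_id} expects {table_type}"
            )
    return problems


name_mapping = [
    {"field-id": 1, "names": ["id"]},
    {"field-id": 2, "names": ["data"]},
]
table_types = {1: "long", 2: "string"}

ok = check_file_schema({"id": "int", "data": "string"}, name_mapping, table_types)
bad_type = check_file_schema({"id": "string"}, name_mapping, table_types)
unmapped = check_file_schema({"foo": "int"}, name_mapping, table_types)
```

Running the check while the footer is already open for stats collection would avoid a second pass over each file.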
    


