Re: [I] [Spec] Linking Schema ID to Data & Delete Files [iceberg]

via GitHub Tue, 19 Aug 2025 10:03:08 -0700


RussellSpitzer commented on issue #13855:
URL: https://github.com/apache/iceberg/issues/13855#issuecomment-3201518962


   I probably would not link schema ID, because the schema alone does not 
indicate whether a field is actually present in a file (optional fields may be 
absent). But we probably should have some way in the metrics to tell a "missing 
metric" apart from a missing field. Maybe we should store "columns written" as 
a metric for each data file so that the planner can skip a file. This would 
require adding a set to each data file entry, but it should compress very well 
since it should be mostly identical across all data files in a manifest. 
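
   Roughly, something like the sketch below. All names here (DataFileEntry, 
columns_written, column_state) are made up for illustration and are not part of 
the Iceberg spec or API. It also assumes the field has no default value, so a 
column that was never written reads back as all nulls.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


# Hypothetical manifest entry, for illustration only (not the Iceberg API).
@dataclass
class DataFileEntry:
    path: str
    # Field IDs that were physically written to this file.
    columns_written: FrozenSet[int]
    # Per-field null counts; a field may be written but have no metric.
    null_counts: Dict[int, int] = field(default_factory=dict)


def column_state(entry: DataFileEntry, field_id: int) -> str:
    """Distinguish a missing field from a missing metric."""
    if field_id not in entry.columns_written:
        return "missing_field"
    if field_id not in entry.null_counts:
        return "missing_metric"
    return "has_metric"


def can_skip_for_not_null(entry: DataFileEntry, field_id: int) -> bool:
    """For a predicate like `col IS NOT NULL`, a file that never wrote the
    column cannot match (assumption: no default value for the field)."""
    return column_state(entry, field_id) == "missing_field"
```

   With only the metrics map, the "missing_field" and "missing_metric" cases 
look the same to the planner, so it has to keep the file in both.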
   
   If the goal is determining which files have which columns, we should 
probably start with that problem statement before jumping to linking to the 
schema. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

