rdblue commented on a change in pull request #947:
URL: https://github.com/apache/iceberg/pull/947#discussion_r438353365



##########
File path: site/docs/spec.md
##########
@@ -433,8 +433,17 @@ The rows in the delete file must be sorted by `file_path` then `position` to opt
 *  Sorting by `file_path` allows filter pushdown by file in columnar storage formats.
 *  Sorting by `position` allows filtering rows while scanning, to avoid keeping deletes in memory.
 
-Though the delete files can be written using any supported data file format in Iceberg, it is recommended to write delete files with same file format as the table's file format.
+Position-based delete files can be written using any supported data file format in Iceberg, but it is recommended to write delete files with same file format as the table's default file format.
 
+#### Equality Delete Files
+
+Equality delete files identify rows in a collection of data files that have been deleted by encoding equality predicates. Rows may be identified by more than one column.

Review comment:
       Good point. The column cannot be ignored: the delete should still be
applied to any data file with a sequence number before the delete file's,
using the same projection logic. That is, if a column ID is missing from a
data file, that column is assumed to be all nulls for the file. I think some
examples would definitely help clarify this as well.
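
As one possible illustration of the behavior described in this comment, here is a minimal Python sketch. It is not taken from the Iceberg codebase: the names `project`, `apply_equality_deletes`, `equality_ids`, `data_seq` and `delete_seq` are hypothetical, rows are modeled as plain dicts keyed by column ID, and it assumes that a null in a delete row matches a null (or missing) value in a data row.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical model: a row maps column ID -> value.
Row = Dict[int, Any]


def project(row: Row, equality_ids: List[int]) -> Tuple[Any, ...]:
    """Read the delete columns from a row.

    A column ID that is missing from the row (for example, because the data
    file was written before the column was added) reads as None, i.e. null.
    """
    return tuple(row.get(field_id) for field_id in equality_ids)


def apply_equality_deletes(
    data_rows: List[Row],
    data_seq: int,
    delete_rows: List[Row],
    delete_seq: int,
    equality_ids: List[int],
) -> List[Row]:
    """Return the rows of a data file that survive an equality delete file."""
    # Assumption: the delete applies only to data files whose sequence
    # number is strictly before the delete file's sequence number.
    if data_seq >= delete_seq:
        return list(data_rows)
    # Assumption: None == None here, so a null delete value matches a null
    # (or missing) data value.
    deleted = {project(row, equality_ids) for row in delete_rows}
    return [row for row in data_rows if project(row, equality_ids) not in deleted]


# Data file written before column 2 existed, so column 2 reads as null.
old_file = [{1: "a"}, {1: "b"}]
# Delete rows keyed on columns 1 and 2.
deletes = [{1: "a", 2: None}, {1: "b", 2: "x"}]

remaining = apply_equality_deletes(
    old_file, data_seq=1, delete_rows=deletes, delete_seq=2, equality_ids=[1, 2]
)
print(remaining)  # [{1: 'b'}]
```

In this sketch the first delete row removes `{1: "a"}` because the missing column 2 projects to null, while the second delete row removes nothing because the data row's column 2 is null rather than `"x"`.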



