Hi Vivekanand,

You're right that a manifest containing a deleted file is rewritten and replaced. Scan planning ignores any data file that is marked as deleted in a manifest. Whether a file is deleted is determined by the manifest entry's status, which can be added, existing, or deleted. Using those three values, we can easily recover the changes in a given manifest, as well as the changes from a snapshot, because we know which manifests were created for that snapshot.

And yes, this is part of the specification.
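
As a rough illustration of the status handling described above, here is a minimal Python sketch. It is not Iceberg's actual code: `Status`, `ManifestEntry`, `live_files`, and `manifest_changes` are hypothetical names made up for this example, the file paths are invented, and the numeric status codes are assumed to match the manifest entry `status` field in the table spec.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Tuple


class Status(Enum):
    # Assumed to mirror the spec's manifest entry status codes.
    EXISTING = 0
    ADDED = 1
    DELETED = 2


@dataclass(frozen=True)
class ManifestEntry:
    # Hypothetical, simplified entry: a real entry carries much more metadata.
    status: Status
    file_path: str


def live_files(entries: Iterable[ManifestEntry]) -> List[str]:
    """Files a scan would consider: everything not marked DELETED."""
    return [e.file_path for e in entries if e.status is not Status.DELETED]


def manifest_changes(entries: Iterable[ManifestEntry]) -> Tuple[List[str], List[str]]:
    """Recover the changes recorded in one manifest as (added, deleted)."""
    entries = list(entries)
    added = [e.file_path for e in entries if e.status is Status.ADDED]
    deleted = [e.file_path for e in entries if e.status is Status.DELETED]
    return added, deleted


# Invented example manifest, as it might look after a commit that
# added one file and deleted another.
manifest = [
    ManifestEntry(Status.EXISTING, "data/a.parquet"),
    ManifestEntry(Status.ADDED, "data/b.parquet"),
    ManifestEntry(Status.DELETED, "data/c.parquet"),
]

planned = live_files(manifest)
added, deleted = manifest_changes(manifest)
```

Here `planned` holds the two files that are still live, while `added` and `deleted` give the changes recorded in this one manifest. Repeating the second step over the manifests created for a snapshot would give that snapshot's changes.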

On Thu, Nov 26, 2020 at 4:55 AM Vivekanand Vellanki <vi...@dremio.com> wrote:

> Hi,
>
> I am trying to understand how data file deletion is handled when the
> transaction commits.
>
> From this line of code
> <https://github.com/apache/iceberg/blob/e69e52146d27956221ea4df4ad0baf2af7c827cd/core/src/main/java/org/apache/iceberg/ManifestFilterManager.java#L372>,
> it looks like the manifest file containing the deleted data file is
> rewritten and a new manifest file is created as part of the transaction
> that deletes data files.
>
> This indicates that data-files that are deleted can be ignored while
> planning.
>
> Is the above statement true by Specification? or is this an implementation
> detail that can be changed in the future?
>
> Thanks
> Vivek
>
>

--
Ryan Blue
Software Engineer
Netflix