Hi Vivekanand,

You're right that when a data file is deleted, the manifest that contains
it is rewritten and replaced. Scan planning ignores any data file that is
marked as deleted in a manifest. Whether a file is deleted is determined by
the manifest entry's status, which is one of added, existing, or deleted.
Using those three values, we can easily recover the changes in a given
manifest, and also the changes in a snapshot, because we know which
manifests were created for that snapshot. And yes, this is part of the
specification.
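
To make that concrete, here is a minimal Python sketch of how the entry
status can drive both scan planning and change recovery. The class and
function names are hypothetical and are not Iceberg's API, and the file
paths are made up; only the three status values and their integer codes
come from the manifest entry definition in the spec.

```python
from dataclasses import dataclass
from enum import IntEnum


class Status(IntEnum):
    # Integer codes as listed for manifest entry status in the table spec.
    EXISTING = 0
    ADDED = 1
    DELETED = 2


@dataclass(frozen=True)
class ManifestEntry:
    # Hypothetical, simplified entry: a real entry carries more fields.
    status: Status
    file_path: str


def files_to_scan(entries):
    """Scan planning: skip every entry whose status is DELETED."""
    return [e.file_path for e in entries if e.status != Status.DELETED]


def manifest_changes(entries):
    """Recover the changes recorded in one manifest as (added, deleted)."""
    added = [e.file_path for e in entries if e.status == Status.ADDED]
    deleted = [e.file_path for e in entries if e.status == Status.DELETED]
    return added, deleted


# A rewritten manifest after a commit that deleted b and added c.
manifest = [
    ManifestEntry(Status.EXISTING, "data/a.parquet"),
    ManifestEntry(Status.DELETED, "data/b.parquet"),
    ManifestEntry(Status.ADDED, "data/c.parquet"),
]

print(files_to_scan(manifest))
print(manifest_changes(manifest))
```

Applying manifest_changes to each manifest written for a snapshot, and
combining the results, would give that snapshot's changes in the same way.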

On Thu, Nov 26, 2020 at 4:55 AM Vivekanand Vellanki <vi...@dremio.com>
wrote:

> Hi,
>
> I am trying to understand how data file deletion is handled when the
> transaction commits.
>
> From this line of code
> <https://github.com/apache/iceberg/blob/e69e52146d27956221ea4df4ad0baf2af7c827cd/core/src/main/java/org/apache/iceberg/ManifestFilterManager.java#L372>,
> it looks like the manifest file containing the deleted data file is
> rewritten and a new manifest file is created as part of the transaction
> that deletes data files.
>
> This indicates that data-files that are deleted can be ignored while
> planning.
>
> Is the above statement true by Specification? or is this an implementation
> detail that can be changed in the future?
>
> Thanks
> Vivek
>
>

-- 
Ryan Blue
Software Engineer
Netflix
