Following up on this thread,

> I don't think this is a bad idea from a theoretical perspective. Do we
> have any actual numbers to back up the change?

There are no numbers yet. Changing the default is largely driven by the
fact that the previous downside of file-scoped deletes (many delete files
on disk) is now mitigated by Spark sync maintenance.
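To be clear, the PR only changes the default for newly created V2 tables;
any table can already opt into file granularity explicitly. A minimal
Spark SQL sketch (the catalog and table names are illustrative; the
setting is the write.delete.granularity table property):

    ALTER TABLE catalog.db.sample
    SET TBLPROPERTIES ('write.delete.granularity' = 'file');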
To get some more numbers, I think we'll first need to update some of our
benchmarks to be more representative of typical MoR workloads. For
example, DeleteFileIndexBenchmark currently tests tables with a small
number of partitions, each with many data files and very few delete
files. That setup exaggerates the file-count gap between partition-scoped
and file-scoped deletes, so the benchmark doesn't show the benefits of
moving to file-scoped deletes by default. I'll look into updating some of
these benchmarks.

Thanks,
Amogh Jahagirdar

On Tue, Nov 12, 2024 at 12:36 PM Anton Okolnychyi <aokolnyc...@gmail.com>
wrote:

> I support the idea of switching to file-scoped deletes for new tables.
> The absence of sync maintenance prevented us from doing that earlier.
> Given that Amogh recently merged that functionality into main, we
> should be fine. Flink connectors have been using file-scoped deletes
> for a while now to solve issues with concurrent compactions.
>
> File-scoped deletes are closer to DVs and even share some code, such as
> assignment by path and sync maintenance in Spark. This migration will
> allow us to test many concepts that are needed for DVs prior to
> completing the V3 spec. It is also easier to migrate file-scoped
> deletes to DVs in V3: unlike partition-scoped deletes, they can be
> discarded from the table state once a DV is produced for that data
> file.
>
> - Anton
>
> On Mon, Nov 11, 2024 at 9:01 PM Russell Spitzer
> <russell.spit...@gmail.com> wrote:
>
>> I don't think this is a bad idea from a theoretical perspective. Do
>> we have any actual numbers to back up the change? I would think for
>> most folks we would recommend just going to V3 rather than changing
>> granularity for their new tables. It would only affect new tables,
>> though, so I'm not opposed to the change except on the grounds that
>> we are changing a default.
>>
>> On Mon, Nov 11, 2024 at 1:56 PM Amogh Jahagirdar <2am...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I wanted to discuss changing the default position delete file
>>> granularity for Spark from partition to file level for any newly
>>> created V2 tables. See this PR [1].
>>>
>>> Context on delete file granularity:
>>>
>>> - Partition granularity: Writers group delete files for multiple
>>> data files from the same partition into the same delete file. This
>>> leads to fewer files on disk, but higher read amplification because
>>> a scan reads delete information for data files that are irrelevant
>>> to it.
>>> - File granularity: Writers write a new delete file for every
>>> changed data file. Reads touch only the relevant delete
>>> information, but this can lead to more files on disk.
>>>
>>> With the recent merge of synchronous position delete maintenance on
>>> write in Spark [2], file granularity as a default is more
>>> compelling since reads would be more targeted *and* files would be
>>> maintained on disk. I also recommend folks go through the deletion
>>> vector design doc for more details [3].
>>>
>>> Note that for existing tables with high delete-to-data file ratios,
>>> Iceberg's rewrite position deletes procedure can compact the table,
>>> and every subsequent write will continuously maintain the position
>>> deletes (see the sketch below). Additionally, note that in V3 at
>>> most one Puffin position delete file is allowed per data file;
>>> what's being discussed here is changing the default granularity for
>>> new V2 tables, since it should generally be better after the sync
>>> maintenance addition.
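>>>
>>> A minimal Spark SQL sketch of that flow (the catalog and table
>>> names are illustrative):
>>>
>>>     -- one-time compaction of accumulated position deletes
>>>     CALL catalog.system.rewrite_position_delete_files(table => 'db.sample');
>>>
>>> After that call, each subsequent write maintains the position
>>> deletes synchronously, so the delete file count should stay
>>> bounded.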
>>> What are folks' thoughts on this?
>>>
>>> [1] https://github.com/apache/iceberg/pull/11478
>>> [2] https://github.com/apache/iceberg/pull/11273
>>> [3] https://docs.google.com/document/d/18Bqhr-vnzFfQk1S4AgRISkA_5_m5m32Nnc2Cw0zn2XM/edit?tab=t.0#heading=h.193fl7s89tcg
>>>
>>> Thanks,
>>>
>>> Amogh Jahagirdar