Hi,

Yes, I know that you cannot use recursive deletes for
incremental checkpoints, and I didn't suggest it anywhere. I just pointed
out that I would expect multi/bulk deletes to supersede the recursive
deletes feature, assuming a good underlying implementation.
Also, I'm not surprised that multi deletes can be faster. I would
expect/hope for that. I've just raised the point that they don't have to be;
it depends on the underlying file system. However, in contrast to
recursive deletes, I wouldn't expect multi deletes to be potentially
slower than single deletes.

Re Dawid's PoC: I'm not sure (I don't remember) why he proposed
`BulkDeletingFileSystem` over adding a default method to the FileSystem
interface, but it seems to me like a minor point. The majority of Dawid's
PR is about the `BulkFileDeleter` interface, not `BulkDeletingFileSystem`,
so it is about how to use bulk deletes inside Flink, not how to implement
them on the FileSystem side. Do you maybe have a concrete design proposal
for this feature?
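
For reference, the "default method" option could look roughly like the
sketch below. This is only a minimal illustration under my own assumptions:
`SketchFileSystem`, `deleteAll`, `InMemoryFileSystem` and
`BatchingFileSystem` are hypothetical names, not Flink's actual FileSystem
API and not what Dawid's PR does. The in-memory classes merely simulate a
store and count simulated remote calls.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a file system abstraction (not Flink's FileSystem). */
interface SketchFileSystem {

    /** Deletes a single path; returns true if something was deleted. */
    boolean delete(String path) throws IOException;

    /**
     * Hypothetical multi-delete entry point. The default falls back to a
     * for-loop of single deletes, so existing implementations keep working.
     * Returns the paths that could not be deleted.
     */
    default List<String> deleteAll(Collection<String> paths) throws IOException {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            if (!delete(path)) {
                failed.add(path);
            }
        }
        return failed;
    }
}

/** Simulated store without bulk support: one "remote call" per deleted path. */
class InMemoryFileSystem implements SketchFileSystem {
    protected final Set<String> files = new HashSet<>();
    protected int remoteCalls = 0;

    InMemoryFileSystem(Collection<String> initialFiles) {
        files.addAll(initialFiles);
    }

    @Override
    public boolean delete(String path) {
        remoteCalls++;
        return files.remove(path);
    }

    int remoteCalls() {
        return remoteCalls;
    }
}

/** Simulated object store that overrides the default with one batched call. */
class BatchingFileSystem extends InMemoryFileSystem {
    BatchingFileSystem(Collection<String> initialFiles) {
        super(initialFiles);
    }

    @Override
    public List<String> deleteAll(Collection<String> paths) {
        remoteCalls++; // the whole batch counts as a single simulated request
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            if (!files.remove(path)) {
                failed.add(path);
            }
        }
        return failed;
    }
}
```

With such a shape, callers would always go through `deleteAll`, and each
file system could decide on its own (ideally after benchmarking) whether
overriding the default is worth it.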

Best,
Piotrek

On Thu, 30 Jun 2022 at 15:12, Yun Tang <myas...@live.com> wrote:

> Hi Piotr,
>
> As I said in the original email, you cannot delete folders recursively for
> incremental checkpoints. And if you take a close look at the original
> email, I shared experimental results there, which showed a 29x improvement:
> "A simple experiment shows that deleting 1000 objects with each 5MB size,
> will cost 39494ms with for-loop single delete operations, and the result
> will drop to 1347ms if using multi-delete API in Tencent Cloud."
>
> I think I can leverage some ideas from Dawid's work. And as I said, I
> would introduce the multi-delete API to the original FileSystem class
> instead of introducing another BulkDeletingFileSystem, which makes the file
> system abstraction closer to the modern cloud-based environment.
>
> Best
> Yun Tang
> ________________________________
> From: Piotr Nowojski <pnowoj...@apache.org>
> Sent: Thursday, June 30, 2022 18:25
> To: dev <dev@flink.apache.org>; Dawid Wysakowicz <dwysakow...@apache.org>
> Subject: Re: [DISCUSS] Introduce multi delete API to Flink's FileSystem
> class
>
> Hi,
>
> I presume this would mostly supersede the recursive deletes [1]? I remember
> an argument that the recursive deletes were not obviously better, even if
> the underlying FS was supporting it. I'm not saying that this would have
> been a counter argument against this effort, since every FileSystem could
> decide on its own whether to use the multi delete call or not. But I think
> at the very least it should be benchmarked/compared whether implementing it
> for a particular FS makes sense or not.
>
> Also, there seems to be a similar (abandoned?) effort from Dawid, named
> bulk deletes, with a "BulkDeletingFileSystem"? [2] Isn't this basically
> the same thing that you are proposing, Yun Tang?
>
> Best,
> Piotrek
>
> [1] https://issues.apache.org/jira/browse/FLINK-13856
> [2] https://issues.apache.org/jira/browse/FLINK-13856?focusedCommentId=17481712&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17481712
>
> On Thu, 30 Jun 2022 at 11:45, Zakelly Lan <zakelly....@gmail.com> wrote:
>
> > Hi Yun,
> >
> > Thanks for bringing this into discussion.
> > I'm +1 to this idea.
> > And IIUC, Flink implements the OSS and S3 filesystems on top of the Hadoop
> > filesystem interface, which does not provide a multi-delete API, so it
> > may take some effort to implement this.
> >
> > Best,
> > Zakelly
> >
> > On Thu, Jun 30, 2022 at 5:36 PM Martijn Visser <martijnvis...@apache.org>
> > wrote:
> >
> > > Hi Yun Tang,
> > >
> > > +1 for addressing this problem and your approach.
> > >
> > > Best regards,
> > >
> > > Martijn
> > >
> > > On Thu, 30 Jun 2022 at 11:12, Feifan Wang <zoltar9...@163.com> wrote:
> > >
> > > > Thanks a lot for the proposal, @Yun Tang! It sounds great and I can't
> > > > find any reason not to make this improvement.
> > > >
> > > >
> > > > ——————————————
> > > > Name: Feifan Wang
> > > > Email: zoltar9...@163.com
> > > >
> > > >
> > > > ---- Replied Message ----
> > > > | From | Yun Tang<myas...@live.com> |
> > > > | Date | 06/30/2022 16:56 |
> > > > | To | dev@flink.apache.org<dev@flink.apache.org> |
> > > > | Subject | [DISCUSS] Introduce multi delete API to Flink's FileSystem class |
> > > > Hi guys,
> > > >
> > > > As more and more teams move to cloud-based environments, cloud object
> > > > storage has become the de facto technical standard for big data
> > > > ecosystems. From our experience, the performance of writing/deleting
> > > > objects in object storage can vary from call to call. The FLIP for the
> > > > changelog state backend ran experiments to measure the performance of
> > > > writing the same data multiple times [1], and they show that the p999
> > > > latency can be 8x the p50 latency. This is also true for delete
> > > > operations.
> > > >
> > > > Currently, after the introduction of the checkpoint backpressure
> > > > mechanism [2], a newly triggered checkpoint can be delayed because
> > > > old checkpoints are not cleaned up fast enough [3].
> > > > Moreover, Flink's checkpoint cleanup mechanism cannot leverage the
> > > > folder-delete API to speed up the procedure with incremental
> > > > checkpoints [4].
> > > > This is especially noticeable in cloud object storage, and almost all
> > > > object storage SDKs have a multi-delete API to improve performance,
> > > > e.g. AWS S3 [5], Aliyun OSS [6], and Tencentyun COS [7].
> > > > A simple experiment shows that deleting 1000 objects with each 5MB
> > > > size, will cost 39494ms with for-loop single delete operations, and
> > > > the result will drop to 1347ms if using multi-delete API in Tencent
> > > > Cloud.
> > > >
> > > > However, Flink's FileSystem API is modeled on HDFS's FileSystem API and
> > > > lacks such a multi-delete API, which makes it somewhat outdated in
> > > > cloud-based environments.
> > > > Thus I suggest adding such a multi-delete API to Flink's FileSystem [8]
> > > > class; file systems that do not support a multi-delete feature will
> > > > fall back to a for-loop of single deletes.
> > > > By doing so, we can at least speed up discarding checkpoints in
> > > > cloud environments.
> > > >
> > > > WDYT?
> > > >
> > > >
> > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints#FLIP158:Generalizedincrementalcheckpoints-DFSwritelatency
> > > > [2] https://issues.apache.org/jira/browse/FLINK-17073
> > > > [3] https://issues.apache.org/jira/browse/FLINK-26590
> > > > [4] https://github.com/apache/flink/blob/1486fee1acd9cd1e340f6d2007f723abd20294e5/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CompletedCheckpoint.java#L315
> > > > [5] https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-multiple-objects.html
> > > > [6] https://www.alibabacloud.com/help/en/object-storage-service/latest/delete-objects-8#section-v6n-zym-tax
> > > > [7] https://intl.cloud.tencent.com/document/product/436/44018#delete-objects-in-batch
> > > > [8] https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/fs/FileSystem.java
> > > >
> > > >
> > > > Best
> > > > Yun Tang
> > > >
> > > >
> > >
> >
>
