[
https://issues.apache.org/jira/browse/FLINK-35739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Piotr Nowojski updated FLINK-35739:
-----------------------------------
Component/s: Runtime / State Backends
> FLIP-444: Native file copy support
> ----------------------------------
>
> Key: FLINK-35739
> URL: https://issues.apache.org/jira/browse/FLINK-35739
> Project: Flink
> Issue Type: New Feature
> Components: Connectors / FileSystem, Runtime / State Backends
> Reporter: Piotr Nowojski
> Priority: Major
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-444%3A+Native+file+copy+support
> State downloading in Flink can be a time and CPU consuming operation, which
> is especially visible if CPU resources per task slot are strictly restricted
> to for example a single CPU. Downloading 1GB of state size can take
> significant amount of time, while the code doing so is quite inefficient.
> Currently when downloading state files, Flink is creating an
> FSDataInputStream from the remote file, and copies its bytes, to an
> OutputStream pointing to a local file (in the
> RocksDBStateDownloader#downloadDataForStateHandle method). FSDataInputStream
> internally is being wrapped by many layers of abstractions and indirections
> and what’s worse, every file is being copied individually, which leads to
> quite high overheads for small files. Download times and download process CPU
> efficiency can be significantly improved if we introduced an API to allow
> org.apache.flink.core.fs.FileSystem to copy many files natively and all at
> once.
> For S3, there are at least two potential implementations. The first one is
> using AWS SDKv2 directly (Flink currently is using AWS SDKv1 wrapped by
> hadoop/presto) and Amazon S3 Transfer Manager. Second option is to use a 3rd
> party tool called s5cmd. It is claimed to be a faster alternative to the
> official AWS clients, which was confirmed by our benchmarks.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)