Hi,

Thanks for starting this discussion. Heavy +1 from me.
The AWS SDK v1 is EOL at the end of 2025, so the Hadoop S3 FS has to be updated pretty soon as well. But that's not really news for you guys [1] :)
Personally, I don't think that makes this proposal even a tiny bit less important. What I see nowadays is more and more Hadoop-less use cases, so eliminating Hadoop bloat where it is not a must is, IMO, a net gain, period.
One thing that comes to mind that will need some non-trivial changes as part of this effort is the delegation token framework. Currently it is also tied to the Hadoop stack and has some abstract classes in the base S3 FS module.
Another funny thing I personally experienced, which also points out problems with the current setup: if you use Iceberg with an AWS Glue catalog, you must also bundle the AWS SDK v2, because Iceberg depends on that version. So anyone who wants to do that today cannot really escape bloating their classpath with both AWS SDKs.

Best,
Ferenc

[1] https://issues.apache.org/jira/browse/FLINK-30975

On Friday, October 17th, 2025 at 17:12, Tom Cooper <[email protected]> wrote:

> Hi Samrat,
>
> +1 from me. I think this would be a brilliant contribution. The Hadoop
> libraries are often full of CVEs and updating them can be, IMHO, one of the
> hardest chores in the Flink code base.
> So from a purely maintenance POV I think this work would be valuable. Also,
> having the most up-to-date AWS Java SDK means we keep up with all the auth
> requirements and opens up more options for using advanced features in the future.
>
> Frankly, I think in the long term Flink would be better off moving away from
> Hadoop altogether (but that is a much bigger discussion).
>
> Thanks,
>
> Tom Cooper
> @tomcooper.dev | https://tomcooper.dev
>
> On Tuesday, 14 October 2025 at 19:19, Samrat Deb [email protected] wrote:
>
> > Hi All,
> >
> > Poorvank (cc'ed) and I are writing to start a discussion about a potential
> > improvement for Flink: creating a new, native S3 filesystem independent of
> > Hadoop/Presto.
> >
> > The goal of this proposal is to address several challenges related to
> > Flink's S3 integration and to simplify flink-s3-filesystem. If this discussion
> > gains positive traction, the next step would be to move forward with a
> > formalised FLIP.
> >
> > The Challenges with the Current S3 Connectors
> >
> > Currently, Flink offers two primary S3 filesystems, flink-s3-fs-hadoop [1]
> > and flink-s3-fs-presto [2]. While functional, this dual-connector approach
> > has a few issues:
> >
> > 1. The flink-s3-fs-hadoop connector adds an additional dependency to
> > manage. Upgrades such as AWS SDK v2 first have to be supported by
> > Hadoop/Presto before they can be leveraged in flink-s3-filesystem, which
> > makes it restrictive to use features directly from the AWS SDK.
> >
> > 2. The flink-s3-fs-presto connector was introduced to mitigate the
> > performance issues of the Hadoop connector, especially for checkpointing.
> > However, it lacks a RecoverableWriter implementation.
> > This split is often confusing for Flink users, highlighting the need for a
> > single, unified solution.
> >
> > Proposed Solution: A Native, Hadoop-Free S3 Filesystem
> >
> > I propose we develop a new filesystem, let's call it flink-s3-fs-native,
> > built directly on the modern AWS SDK for Java v2. This approach would be
> > free of any Hadoop or Presto dependencies. I have done a small prototype to
> > validate this [3].
> >
> > This is motivated by trino<>s3 [4].
> > The Trino project successfully undertook a similar migration, moving from
> > Hadoop-based object storage clients to their own native implementations.
> >
> > The new Flink S3 filesystem would:
> >
> > 1. Provide a single, unified connector for all S3 interactions, from state
> > backends to sinks.
> >
> > 2. Implement a high-performance S3RecoverableWriter using S3's Multipart
> > Upload feature, ensuring exactly-once sink semantics.
> >
> > 3. Offer a clean, self-contained dependency, drastically simplifying setup
> > and eliminating external dependencies.
> >
> > A Phased Migration Path
> >
> > To ensure a smooth transition, we could adopt a phased approach, at a very
> > high level:
> >
> > Phase 1:
> > Introduce the new native S3 filesystem as an optional, parallel plugin.
> > This would allow for community testing and adoption without breaking
> > existing setups.
> >
> > Phase 2:
> > Once the native connector achieves feature parity and proven stability, we
> > would update the documentation to recommend it as the default choice for all
> > S3 use cases.
> >
> > Phase 3:
> > In a future major release, the legacy flink-s3-fs-hadoop and
> > flink-s3-fs-presto connectors could be formally deprecated, with clear
> > migration guides provided for users.
> >
> > I would love to hear the community's thoughts on this.
> >
> > A few questions to start the discussion:
> >
> > 1. What are the biggest pain points with the current S3 filesystems?
> >
> > 2. Are there any critical features from the Hadoop S3A client that are
> > essential to replicate in a native implementation?
> >
> > 3. Would a simplified, dependency-free S3 experience be a valuable
> > improvement for Flink use cases?
> >
> > Cheers,
> > Samrat
> >
> > [1] https://github.com/apache/flink/tree/master/flink-filesystems/flink-s3-fs-hadoop
> > [2] https://github.com/apache/flink/tree/master/flink-filesystems/flink-s3-fs-presto
> > [3] https://github.com/Samrat002/flink/pull/4
> > [4] https://github.com/trinodb/trino/tree/master/lib/trino-filesystem-s3
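
To make point 2 of the proposal above a bit more concrete, here is a minimal sketch of how a native S3RecoverableWriter could drive S3 multipart uploads with the AWS SDK for Java v2. The class and method names below are made up for illustration and are not taken from the prototype in [3]; only the SDK calls (createMultipartUpload, uploadPart, completeMultipartUpload, abortMultipartUpload) are the actual v2 API.

import java.util.ArrayList;
import java.util.List;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.CompletedMultipartUpload;
import software.amazon.awssdk.services.s3.model.CompletedPart;

// Illustrative sketch only, not the proposed implementation: upload parts as data
// arrives, remember the uploadId and part ETags (a RecoverableWriter would persist
// them in checkpoint state), and make the object visible only on commit.
public class NativeS3MultipartWriterSketch {

    private final S3Client s3;
    private final String bucket;
    private final String key;
    private final List<CompletedPart> parts = new ArrayList<>();
    private String uploadId;
    private int nextPartNumber = 1;

    public NativeS3MultipartWriterSketch(S3Client s3, String bucket, String key) {
        this.s3 = s3;
        this.bucket = bucket;
        this.key = key;
    }

    // Start the multipart upload; the returned uploadId must survive failover.
    public void open() {
        uploadId = s3.createMultipartUpload(b -> b.bucket(bucket).key(key)).uploadId();
    }

    // Upload one buffered part (every part except the last must be at least 5 MiB).
    public void writePart(byte[] buffer) {
        int partNumber = nextPartNumber++;
        String eTag = s3.uploadPart(
                b -> b.bucket(bucket).key(key).uploadId(uploadId).partNumber(partNumber),
                RequestBody.fromBytes(buffer))
            .eTag();
        parts.add(CompletedPart.builder().partNumber(partNumber).eTag(eTag).build());
    }

    // Commit: completing the upload is the single step that makes the object visible.
    public void commit() {
        s3.completeMultipartUpload(b -> b.bucket(bucket).key(key).uploadId(uploadId)
                .multipartUpload(CompletedMultipartUpload.builder().parts(parts).build()));
    }

    // Abort discards all uploaded parts, e.g. when a pending transaction is rolled back.
    public void abort() {
        s3.abortMultipartUpload(b -> b.bucket(bucket).key(key).uploadId(uploadId));
    }
}

The exactly-once argument rests on the fact that uploaded parts stay invisible until completeMultipartUpload is called, so parts can be written speculatively between checkpoints and the object is published only on commit; persisting the uploadId and part ETags is what would make such a writer recoverable after a failure.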
