Re: [Discuss] FIP-25: Support Multi-Location for Remote Storage

Liebing Yu Fri, 27 Feb 2026 01:53:38 -0800

Hi Lorenzo, sorry for the late reply.

Thanks for the AWS example! This further solidifies the case for multi-path
support.


Regarding your question about multi-cloud support:
Our current design naturally supports multi-cloud object storage systems.
Since the implementation is built upon a multi-scheme filesystem
abstraction (supporting schemes like s3://, oss://, abfs://, etc.), the
system is inherently "cloud-agnostic."
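
To make the scheme-based idea concrete, here is a minimal sketch (not the
Fluss implementation) of how a comma-separated `remote.data.dirs` value
could be split and grouped by URI scheme, so that each scheme maps to its
own filesystem plugin. The function name `group_dirs_by_scheme` is
hypothetical, and the sample value is the one Yuxia used earlier in this
thread.

```python
from urllib.parse import urlparse


def group_dirs_by_scheme(remote_data_dirs: str) -> dict[str, list[str]]:
    """Split a comma-separated list of remote dirs and group them by URI scheme.

    Hypothetical helper for illustration only; Fluss's real parsing may differ.
    """
    grouped: dict[str, list[str]] = {}
    for raw in remote_data_dirs.split(","):
        path = raw.strip()
        if not path:
            continue  # tolerate trailing commas or extra whitespace
        scheme = urlparse(path).scheme
        if not scheme:
            # A path without a scheme cannot be routed to a filesystem plugin.
            raise ValueError(f"remote dir has no scheme: {path!r}")
        grouped.setdefault(scheme, []).append(path)
    return grouped


dirs = group_dirs_by_scheme("oss://bucket1/fluss-data, s3://bucket2/fluss-data")
for scheme, paths in dirs.items():
    print(scheme, paths)
```

Grouping by scheme is only one way to route paths; the point is that nothing
in the path list itself ties the cluster to a single cloud provider.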

Best regards,
Liebing Yu


On Wed, 4 Feb 2026 at 23:37, Lorenzo Affetti via dev <[email protected]>
wrote:

> This is quite an interesting FIP and I think it is a significant
> enhancement, especially for large-scale clusters.
>
> I think you can also add the AWS case in your motivation:
>
> https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance-design-patterns.html#optimizing-performance-high-request-rate
> AWS automatically scales when the request rate for a single prefix exceeds
> roughly 5,500 GET/HEAD requests per second, and transient 503 errors can
> occur while it scales.
> Your approach would eliminate this problem by providing another bucket.
>
> I was wondering if it might also provide the possibility of configuring the
> same Fluss cluster for multi-cloud object storage systems.
> From a design perspective, nothing should prevent me from storing remote
> data on both Azure and AWS at the same time, probably resulting in
> different performance numbers for different partitions/tables.
> Should the design force the use of only 1 filesystem implementation?
>
> Thank you again!
>
> On Fri, Jan 30, 2026 at 7:59 AM Liebing Yu <[email protected]> wrote:
>
> > Hi Yuxia, thanks for the thoughtful response. Let me go through your
> > questions one by one.
> >
> > 1. I think after we support `remote.data.dirs`, different schemes will be
> > supported naturally.
> > 2. Yes, I think we should change from `PbTablePath` to
> > `PbPhysicalTablePath`.
> > 3. Thanks for the reminder. I'll build a PoC for authentication in
> > https://github.com/apache/fluss/issues/2518. But it doesn't block the
> > multiple-paths implementation in the Fluss server in
> > https://github.com/apache/fluss/issues/2517.
> > 4. For a partitioned table, the table itself has a remote data dir for
> > metadata (such as lake offset), and each partition has its own remote dir
> > for table data (e.g. kv or log data).
> > 5. Legacy clients can access data in the new cluster.
> >
> >    - If the permissions of the paths specified in `remote.data.dirs` on
> >    the new cluster match those configured in `remote.data.dir`, seamless
> >    access is achievable.
> >    - If the permissions are inconsistent, access permissions must be
> >    explicitly configured. For example, when using OSS, a policy granting
> >    access permissions to the account identified by `fs.oss.roleArn` must
> >    be configured for each bucket specified in `remote.data.dirs`.
> >
> >
> > Best regards,
> > Liebing Yu
> >
> >
> > On Thu, 29 Jan 2026 at 10:07, Yuxia Luo <[email protected]> wrote:
> >
> > > Hi, Liebing
> > >
> > > Thanks for the detailed FIP. I have a few questions:
> > > 1. Does `remote.data.dirs` support paths with different schemes? For
> > > example:
> > > ```
> > > remote.data.dirs: oss://bucket1/fluss-data, s3://bucket2/fluss-data
> > > ```
> > >
> > > 2. Should `GetFileSystemSecurityTokenRequest` include partition?
> > > The FIP adds `table_path` to the request, but since different partitions
> > > may reside on different remote paths (and require different tokens),
> > > should the request also include partition information?
> > >
> > > 3. Just a reminder that `DefaultSecurityTokenManager` will become more
> > > complex...
> > > This is not a blocker, but it is worth a PoC to identify any complexity.
> > >
> > > 4. I want to confirm my understanding: For a partitioned table, does the
> > > table itself have a remote dir, AND each partition also has its own
> > > remote dir?
> > >
> > > Or is it:
> > > - Non-partitioned table → table-level remote dir
> > > - Partitioned table → only partition-level remote dirs (no table-level)?
> > >
> > > 5. Can old clients (without table path in token request) still read data
> > > from new clusters?
> > > One possible solution: for RPCs without table information, the server
> > > returns a token for the first dir in `remote.data.dirs`. Alternatively,
> > > there could be other ways that let users configure the cluster to keep
> > > compatibility.
> > >
> > >
> > >
> > > On 2026/01/21 03:52:29 Zhe Wang wrote:
> > > > Thanks for your response, now it looks good to me.
> > > >
> > > > Best regards,
> > > > Zhe Wang
> > > >
> > > > Liebing Yu <[email protected]> wrote on Tue, 20 Jan 2026 at 14:29:
> > > >
> > > > > Hi Zhe, sorry for the late reply.
> > > > >
> > > > > The primary focus of this FIP is not to address read/write issues at
> > > > > the table or partition level, but rather to overcome limitations at
> > > > > the cluster level. Given the current capabilities of object storage,
> > > > > read/write performance for a single table or partition is unlikely to
> > > > > be a bottleneck; however, for a large-scale Fluss cluster, it can
> > > > > easily become one. Therefore, the core objective here is to distribute
> > > > > the cluster-wide read/write traffic across multiple remote storage
> > > > > systems.
> > > > >
> > > > > Best regards,
> > > > > Liebing Yu
> > > > >
> > > > >
> > > > > On Wed, 14 Jan 2026 at 16:07, Zhe Wang <[email protected]> wrote:
> > > > >
> > > > > > Hi Liebing, Thanks for the clarification.
> > > > > > > 1. To clarify, the data is currently split at the partition level
> > > > > > > for partitioned tables and at the table level for non-partitioned
> > > > > > > tables.
> > > > > >
> > > > > > So the main aim of this FIP is to speed up reading data from
> > > > > > different partitions, while write speed may still be limited by a
> > > > > > single storage system?
> > > > > >
> > > > > > Best,
> > > > > > Zhe Wang
> > > > > >
> > > > > > Liebing Yu <[email protected]> wrote on Tue, 13 Jan 2026 at 19:11:
> > > > > >
> > > > > > > Hi Zhe, Thanks for the questions!
> > > > > > >
> > > > > > > 1. To clarify, the data is currently split at the partition level
> > > > > > > for partitioned tables and at the table level for non-partitioned
> > > > > > > tables.
> > > > > > >
> > > > > > > 2. Regarding RemoteStorageCleaner, you are absolutely right.
> > > > > > > Supporting remote.data.dirs there is necessary for a complete
> > > > > > > cleanup when a table is dropped.
> > > > > > >
> > > > > > > Thanks for pointing that out!
> > > > > > >
> > > > > > > Best regards,
> > > > > > > Liebing Yu
> > > > > > >
> > > > > > >
> > > > > > > On Mon, 12 Jan 2026 at 17:02, Zhe Wang <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Liebing,
> > > > > > > >
> > > > > > > > Thanks for driving this, I think it's a really useful feature.
> > > > > > > > I have two small questions:
> > > > > > > > 1. What's the scope for splitting data across dirs? I see there's
> > > > > > > > a partitionId in the ZK data, so will the data be split across
> > > > > > > > directories by partition, or by bucket?
> > > > > > > > 2. Maybe RemoteStorageCleaner needs to support remote.data.dirs,
> > > > > > > > so that we can delete all remote storage when a table is deleted.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Zhe Wang
> > > > > > > >
> > > > > > > > Liebing Yu <[email protected]> wrote on Thu, 8 Jan 2026 at 20:10:
> > > > > > > >
> > > > > > > > > Hi devs,
> > > > > > > > >
> > > > > > > > > I propose initiating discussion on FIP-25[1]. Fluss leverages
> > > > > > > > > remote storage systems (such as Amazon S3, HDFS, and Alibaba
> > > > > > > > > Cloud OSS) to deliver a cost-efficient, highly available, and
> > > > > > > > > fault-tolerant storage solution compared to local disk.
> > > > > > > > > *However, in production environments, we often find that the
> > > > > > > > > bandwidth of a single remote storage system becomes a
> > > > > > > > > bottleneck.* Taking OSS[2] as an example, the typical upload
> > > > > > > > > bandwidth limit for a single account is 20 Gbit/s (Internal)
> > > > > > > > > and 10 Gbit/s (Public). So I initiated this FIP, which aims to
> > > > > > > > > introduce support for multiple remote storage paths and to
> > > > > > > > > enable the dynamic addition of new storage paths without
> > > > > > > > > service interruption.
> > > > > > > > >
> > > > > > > > > Any feedback and suggestions on this proposal are welcome!
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > > https://cwiki.apache.org/confluence/display/FLUSS/FIP-25%3A+Support+Multi-Location+for+Remote+Storage
> > > > > > > > > [2]
> > > > > > > > > https://www.alibabacloud.com/help/en/oss/user-guide/limits?spm=a2c63.l28256.help-menu-31815.d_0_0_5.2ac34d06oZYFvK
> > > > > > > > >
> > > > > > > > > Best regards,
> > > > > > > > > Liebing Yu
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> Lorenzo Affetti
> Senior Software Engineer @ Flink Team
> Ververica <http://www.ververica.com>
>
