Hi Zhe, sorry for the late reply.

The primary focus of this FIP is not to address read/write issues at the
table or partition level, but rather to overcome limitations at the cluster
level. Given the current capabilities of object storage, read/write
performance for a single table or partition is unlikely to be a bottleneck;
however, for a large-scale Fluss cluster, it can easily become one.
Therefore, the core objective here is to distribute the cluster-wide
read/write traffic across multiple remote storage systems.
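
To make that fan-out idea concrete, here is a minimal sketch in Java. This is hypothetical illustration code, not the actual Fluss implementation: the class name, the example OSS paths, and the specific scheme of hashing a table/partition id onto one of several configured root directories are all my own assumptions, not the FIP design.

```java
import java.util.List;

// Hypothetical sketch: spread tables/partitions across several configured
// remote storage roots so that no single storage account's bandwidth limit
// caps the whole cluster. Not actual Fluss code.
public class RemotePathSelector {
    private final List<String> remoteDataDirs;

    public RemotePathSelector(List<String> remoteDataDirs) {
        this.remoteDataDirs = remoteDataDirs;
    }

    // Deterministically map a table or partition id to one root directory,
    // so all buckets of that unit share a location while different units
    // fan out across the configured dirs.
    public String dirFor(long tableOrPartitionId) {
        int idx = Math.floorMod(tableOrPartitionId, remoteDataDirs.size());
        return remoteDataDirs.get(idx);
    }

    public static void main(String[] args) {
        RemotePathSelector selector = new RemotePathSelector(
                List.of("oss://bucket-a/fluss", "oss://bucket-b/fluss"));
        System.out.println(selector.dirFor(1001L));
        System.out.println(selector.dirFor(1002L));
    }
}
```

Any stable mapping would work here; the point is only that different tables or partitions land on different storage accounts, so aggregate cluster traffic is no longer bounded by one account's limit.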

Best regards,
Liebing Yu


On Wed, 14 Jan 2026 at 16:07, Zhe Wang <[email protected]> wrote:

> Hi Liebing, Thanks for the clarification.
> >1. To clarify, the data is currently split by partition level for
> partitioned tables and by table for non-partitioned tables.
>
> So is the main aim of this FIP to improve the speed of reading data from
> different partitions, while the speed of storing data may still be limited
> by a single system?
>
> Best,
> Zhe Wang
>
> Liebing Yu <[email protected]> wrote on Tue, 13 Jan 2026 at 19:11:
>
> > Hi Zhe, Thanks for the questions!
> >
> > 1. To clarify, the data is currently split by partition level for
> > partitioned tables and by table for non-partitioned tables.
> >
> > 2. Regarding RemoteStorageCleaner, you are absolutely right. Supporting
> > remote.data.dirs there is necessary for a complete cleanup when a table
> is
> > dropped.
> >
> > Thanks for pointing that out!
> >
> > Best regards,
> > Liebing Yu
> >
> >
> > On Mon, 12 Jan 2026 at 17:02, Zhe Wang <[email protected]> wrote:
> >
> > > Hi Liebing,
> > >
> > > Thanks for driving this, I think it's a really useful feature.
> > > I have two small questions:
> > > 1. What is the scope for splitting data across dirs? I see there is a
> > > partitionId in the ZK data, so will the data be split by partition into
> > > different directories, or by bucket?
> > > 2. Maybe RemoteStorageCleaner also needs to support remote.data.dirs,
> > > so that all remote storage can be deleted when a table is dropped?
> > >
> > > Best,
> > > Zhe Wang
> > >
> > > Liebing Yu <[email protected]> wrote on Thu, 8 Jan 2026 at 20:10:
> > >
> > > > Hi devs,
> > > >
> > > > I propose initiating a discussion on FIP-25[1]. Fluss leverages
> > > > remote storage systems such as Amazon S3, HDFS, and Alibaba Cloud
> > > > OSS to deliver a cost-efficient, highly available, and fault-tolerant
> > > > storage solution compared to local disk. However, in production
> > > > environments, we often find that the bandwidth of a single remote
> > > > storage system becomes a bottleneck. Taking OSS[2] as an example, the
> > > > typical upload bandwidth limit for a single account is 20 Gbit/s
> > > > (internal) and 10 Gbit/s (public). This FIP therefore aims to
> > > > introduce support for multiple remote storage paths and to enable the
> > > > dynamic addition of new storage paths without service interruption.
> > > >
> > > > Any feedback and suggestions on this proposal are welcome!
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLUSS/FIP-25%3A+Support+Multi-Location+for+Remote+Storage
> > > > [2]
> > > > https://www.alibabacloud.com/help/en/oss/user-guide/limits?spm=a2c63.l28256.help-menu-31815.d_0_0_5.2ac34d06oZYFvK
> > > >
> > > > Best regards,
> > > > Liebing Yu
> > > >
> > >
> >
>
