Hi Keith,

Thank you for the thoughtful questions! These are great points that help clarify the scope and motivation of FIP-30. Here are my thoughts on those points from my perspective.
*1.* That’s a really good question. In our current design, the tiering service is a long-running job that handles many tables at the same time. *If tiering fails for just one table, we usually don't want to fail the entire Flink job.* Instead, the system handles the failure at the table level and *retries only the failed tables later*. Because of this, the Flink dashboard will still show the job as *RUNNING*, which makes it hard for users to know that a specific table is silently failing. FIP-30 is meant to fill this visibility gap by showing which table is having trouble and why, without needing to dig through the logs.

*2.* I see what you mean about the different personas. Our proposal was to support both workflows:
- Flink SQL: This is mainly for users who want to build quick dashboards in tools like Grafana. Since most monitoring tools speak SQL, it’s a very easy way to get a high-level view without writing extra code.
- Admin API: This is for engineers who prefer to write scripts or use CLI tools for more automated operations.

By supporting both, I thought we could make the status information useful for everyone, regardless of their preferred tools.

*3.* I really like the idea of expanding *SHOW JOBS* in Flink; that would be a great improvement for Flink's own observability. For this specific case, though, I think there are some technical hurdles. The source of truth for tiering metadata (like `last_tiered_time` and table-specific errors) is managed by the Fluss Coordinator. From Flink's perspective, it’s just running a generic job and isn't aware of Fluss-specific details like the table path. Also, since the tiering job handles these errors internally to keep running, the Flink engine doesn't capture them as job-level exceptions. *To show this in Flink SQL, Flink would need to understand Fluss’s internal state*, which might break the clean separation between the execution engine and the metadata layer.
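To make the failure handling described in point 1 concrete, here is a minimal, self-contained sketch (all class, method, and table names are illustrative, not the actual tiering service code) of how a long-running loop can isolate per-table failures and record them for later exposure, instead of failing the whole Flink job:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TieringLoopSketch {

    // Last failure message per table path; absence means the table is healthy.
    static final Map<String, String> lastFailMessage = new HashMap<>();

    // One round of the long-running tiering loop: a failure in one table is
    // recorded and skipped, so the surrounding job keeps showing RUNNING.
    static void tierOneRound(List<String> tablePaths) {
        for (String table : tablePaths) {
            try {
                tierTable(table);
                lastFailMessage.remove(table); // success clears any old error
            } catch (Exception e) {
                // Do not rethrow: remember the error so a status table or
                // Admin API could surface it later, then move on.
                lastFailMessage.put(table, e.getMessage());
            }
        }
    }

    // Stand-in for the real tiering work; fails for one table on purpose.
    static void tierTable(String table) {
        if (table.contains("bad")) {
            throw new RuntimeException("tiering failed for " + table);
        }
    }

    public static void main(String[] args) {
        tierOneRound(List.of("db.good_table", "db.bad_table"));
        System.out.println(lastFailMessage);
    }
}
```

This is exactly why the job status alone is not a useful signal here: the error never propagates to the Flink runtime, so it has to be surfaced from the recorded state instead.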
Keeping it in Fluss feels a bit more natural since that’s where the table state is already managed.

Thanks again for these questions; they were really helpful in re-examining the motivation behind this proposal!

Best regards,
SeungMin Lee

On Mon, Mar 2, 2026 at 12:18 AM, Keith Lee <[email protected]> wrote:

> Hello SeungMin,
>
> Thank you for the detailed and well-laid-out proposal. I am not familiar with tiering failure and its visibility issues, so I have a few questions for you that will hopefully help my understanding.
>
> 1. On the premise `there's no way for users to check the status of lake tiering`: my understanding is that tiering is in itself a Flink job, so would the status of the Flink tiering job be a good signal for the status of tiering? I assume that if there are tiering issues, they would surface as Flink job failures, and the job would retry with exception logs captured against the job and visible through the Flink dashboard? Can you clarify if this is true and, additionally, what additional information the proposal captures?
> 2. I really like the idea of exposing job metadata in Flink SQL. However, thinking about user personas, there are two groups here that I can identify: a. Data Scientists (or similar roles) b. System/Software Engineers (reliability, operations). The information that the proposal seeks to expose serves the second group and not the first. Is Flink SQL therefore the correct channel to expose this information?
> 3. I think I mentioned I really like the idea of exposing job metadata in Flink SQL. 😄 Have we considered whether Fluss is the best place to implement Flink SQL support for job metadata queries? I can see where such a feature is useful in Flink in general. If job health, failure reason, etc. are queryable in Flink, it can be used in a much broader set of use cases. Perhaps we can engage the Flink community on expanding SHOW JOBS [1] to include exceptions, last failure reason, etc.?
>
> Best regards
> Keith Lee
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/table/sql/job/
>
> On Mon, Feb 23, 2026 at 3:29 PM SeungMin Lee <[email protected]> wrote:
>
> > Hi dev,
> >
> > Hope you had a refreshing break.
> >
> > Touching base on FIP-30. I'm aiming to wrap up the feedback process by the week the 0.9 release vote <https://lists.apache.org/thread/3c8w6ofrssjxrpvz85pkm2n2kx1gyzxd> ends, so we can stay aligned with the project timeline. Also, I hope the 0.9 release vote <https://lists.apache.org/thread/3c8w6ofrssjxrpvz85pkm2n2kx1gyzxd> gets plenty of interest as well.
> >
> > Looking forward to your thoughts.
> >
> > Best regards,
> > SeungMin Lee
> >
> > On Sun, Feb 15, 2026 at 12:43 AM, SeungMin Lee <[email protected]> wrote:
> >
> > > Hi Mehul Batra,
> > >
> > > First of all, thank you very much for the detailed review and valuable suggestions. I really appreciate your insights.
> > >
> > > *1. Per-Table System Table vs Global System Table*
> > > I think the use case for the global view is to easily integrate with monitoring tools like Grafana. Without a SQL interface, users have to build a custom exporter using the Admin API to monitor the tiering status of all tables. I do share your concerns regarding the performance impact when querying thousands of tables. While I acknowledge the potential performance risks in massive clusters, I believe it’s better to provide full visibility first. We can monitor real-world performance data and, if necessary, implement safeguards like implicit limits or forced LIMIT clauses as a follow-up optimization.
> > >
> > > *2. Error Message Truncation Strategy*
> > > It is a great point. Simply truncating the head of the error message might indeed cut off some important information.
> > > I agree with your suggested "smart extraction" that prioritizes the phrases near words like "*Caused by*". To keep the initial FIP-30 scope focused, I plan to implement basic truncation first. However, I would be very grateful if you could help with the smart extraction as a follow-up PR if you have the capacity.
> > >
> > > *3. Consolidating State Maps in LakeTableTieringManager*
> > > I also fully agree with consolidating the maps in LakeTableTieringManager. Looking at the code again, managing 7 separate maps (and soon 9) for each table is getting a bit complicated. It’s quite easy to miss one map when registering or removing tables, which could lead to bugs or small memory leaks over time. Grouping everything into a single TableTieringInfo object will make the logic much easier to follow and help keep all the metadata consistent. Plus, it should be a bit more memory-efficient by reducing the number of internal map nodes. I’ll definitely include this refactoring as part of the FIP-30 implementation.
> > >
> > > Thanks again for helping refine the design!
> > >
> > > Best Regards,
> > > SeungMin Lee
> > >
> > > On Sat, Feb 14, 2026 at 2:22 AM, Mehul Batra <[email protected]> wrote:
> > >
> > >> Hi SeungMin Lee,
> > >>
> > >> First of all, thank you for putting together FIP-30. Tracking tiering status is a much-needed feature, and I appreciate the thorough design work that went into this proposal.
> > >>
> > >> After reviewing the FIP, I have a few thoughts and questions I'd like to raise for discussion. These are suggestions based on my understanding - I may be missing context, so please feel free to correct me if any of these points have already been considered.
> > >>
> > >> 1. Per-Table System Table vs Global System Table
> > >>
> > >> The proposal introduces both:
> > >> - Global view: `fluss_catalog.sys.lake_tiering_status`
> > >> - Per-table view: `my_db.my_table$tiering_status`
> > >>
> > >> I was wondering if we could simplify the initial implementation by focusing on the per-table `$tiering_status` virtual table for SQL access, while relying on the `listTieringStatuses()` Admin API for bulk/system-wide queries.
> > >>
> > >> My reasoning:
> > >> - Consistency: The per-table pattern (`$tiering_status`) aligns with Fluss's existing virtual table conventions and is similar to the virtual table approach with `$changelog`, `$binlog`, etc.
> > >> - Scalability: A global SQL table querying thousands of tables could have performance implications. The Admin API seems better suited for bulk operations with potential pagination support.
> > >>
> > >> A phased approach (Phase 1: per-table SQL, Phase 2: Admin API) could ship value to users faster with reduced initial scope. That said, I may be underestimating the need for the global SQL table. Are there specific use cases that would be difficult to serve with just the Admin API?
> > >>
> > >> 2. Error Message Truncation Strategy
> > >>
> > >> The proposal mentions truncating error messages to 2-4KB before sending to the Coordinator. I have a concern about simple head truncation potentially removing the most useful diagnostic information.
> > >>
> > >> Are we considering an extraction strategy to deal with it? In my mind, something like this:
> > >> - Smart extraction: Parse and extract all "Caused by:" lines, which typically contain the most actionable information.
> > >>
> > >> I understand this adds complexity, so it's a trade-off. Curious to hear others' thoughts on whether this is worth addressing.
> > >>
> > >> 3. Consolidating State Maps in LakeTableTieringManager
> > >>
> > >> The proposal adds `tieringFailMessages` and `tieringFailTimes` maps to `LakeTableTieringManager`. Looking at the current implementation, the manager already maintains 6+ separate maps keyed by `tableId`:
> > >>
> > >> ```java
> > >> Map<Long, TieringState> tieringStates;
> > >> Map<Long, TablePath> tablePaths;
> > >> Map<Long, Long> tableLakeFreshness;
> > >> Map<Long, Long> tableTierEpoch;
> > >> Map<Long, Long> tableLastTieredTime;
> > >> Map<Long, Long> liveTieringTableIds;
> > >> // Proposed additions:
> > >> Map<Long, String> tieringFailMessages;
> > >> Map<Long, Long> tieringFailTimes;
> > >> ```
> > >>
> > >> One thought: would it be cleaner to consolidate these into a single TableTieringInfo object?
> > >>
> > >> ```java
> > >> Map<Long, TableTieringInfo> tableInfos;
> > >>
> > >> class TableTieringInfo {
> > >>     TablePath tablePath;
> > >>     long lakeFreshness;
> > >>     TieringState state;
> > >>     long tieringEpoch;
> > >>     long lastTieredTime;
> > >>     @Nullable String lastError;
> > >>     @Nullable Long lastErrorTime;
> > >> }
> > >> ```
> > >>
> > >> Potential benefits:
> > >> - Single map lookup instead of multiple
> > >> - Related state updated together naturally
> > >> - Cleaner cleanup in removeLakeTable() (one removal vs. 8)
> > >>
> > >> This could be a separate preparatory refactoring PR or part of FIP-30. However, I understand this might be out of scope for this FIP, and I don't want to expand the scope unnecessarily. Just raising it as a thought for the authors to consider.
> > >>
> > >> These are just suggestions based on my reading of the proposal. I'm happy to be corrected if I've misunderstood anything. Also happy to help with implementation or further discussion if useful.
> > >>
> > >> Thanks again for driving this important feature!
> > >>
> > >> Best regards,
> > >> Mehul Batra
> > >>
> > >> On Thu, Feb 12, 2026 at 5:53 PM SeungMin Lee <[email protected]> wrote:
> > >>
> > >> > Hi dev,
> > >> >
> > >> > Just a quick update.
> > >> >
> > >> > I have migrated the design Google Doc to the cwiki and registered it as *FIP-30*. Please refer to the link below for the formal proposal:
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/FLUSS/FIP-30%3A+Support+tracking+the+tiering+status+of+a+tiering+table
> > >> >
> > >> > The content remains consistent with the previous Google Doc.
> > >> >
> > >> > Best regards,
> > >> > SeungMin Lee
> > >> >
> > >> > On Thu, Feb 12, 2026 at 5:37 PM, SeungMin Lee <[email protected]> wrote:
> > >> >
> > >> > > Hi, dev
> > >> > >
> > >> > > Currently, there is no way for users to check the status of lake tiering. Users cannot be aware if tiering fails, and they have to manually parse the Tiering Service logs to identify the cause.
> > >> > >
> > >> > > So, I'd like to propose Issue-2362: Allow users to track the tiering status of a tiering table to address this visibility issue.
> > >> > >
> > >> > > I have drafted a design doc [2]. Please feel free to review and share your feedback.
> > >> > > Considering the upcoming holidays in some regions, I'll wait for feedback and give a ping on this thread around Feb 23rd.
> > >> > >
> > >> > > Looking forward to your thoughts.
> > >> > >
> > >> > > Best regards,
> > >> > > SeungMin Lee
> > >> > >
> > >> > > [1] https://github.com/apache/fluss/issues/2362
> > >> > > [2] https://docs.google.com/document/d/1eJbRCwzAbeJLA97zQQ0I3JM1jerBXXhq69Dn8r4xWV0/edit?usp=sharing
