Hi dev,

Hope you had a refreshing break.

Touching base on FIP-30. I'm aiming to wrap up the feedback process by the
week the 0.9 release vote
<https://lists.apache.org/thread/3c8w6ofrssjxrpvz85pkm2n2kx1gyzxd> ends, so
we can stay aligned with the project timeline. I also hope the 0.9 release
vote gets plenty of interest.

Looking forward to your thoughts.

Best regards,
SeungMin Lee

On Sun, Feb 15, 2026 at 12:43 AM, SeungMin Lee <[email protected]> wrote:

> Hi Mehul Batra,
>
> First of all, thank you very much for the detailed review and valuable
> suggestions. I really appreciate your insights.
>
> *1. Per-Table System Table vs Global System Table*
> I think the main use case for the global view is easy integration with
> monitoring tools like Grafana. Without a SQL interface, users have to
> build a custom exporter on top of the Admin API to monitor the tiering
> status of all tables. I do share your concerns about the performance
> impact of querying thousands of tables. While I acknowledge the potential
> performance risks in massive clusters, I believe it’s better to provide
> full visibility first. We can monitor real-world performance data and, if
> necessary, implement safeguards like implicit limits or forced LIMIT
> clauses as a follow-up optimization.
>
> *2. Error Message Truncation Strategy*
> That is a great point. Simply keeping the head of the error message and
> cutting the rest might indeed drop important information. I agree with
> your "smart extraction" suggestion, which prioritizes the text near
> markers such as "*Caused by*". To keep the initial FIP-30 scope focused,
> I plan to implement basic truncation first. However, I would be very
> grateful if you could help with the smart extraction in a follow-up PR,
> if you have the capacity.
>
> *3. Consolidating State Maps in LakeTableTieringManager*
> I also fully agree with consolidating the maps in LakeTableTieringManager.
> Looking at the code again, managing 7 separate maps (and soon 9) for each
> table is getting a bit complicated. It’s quite easy to miss one map when
> registering or removing tables, which could lead to bugs or small memory
> leaks over time. Grouping everything into a single TableTieringInfo object
> will make the logic much easier to follow and help keep all the metadata
> consistent. Plus, it should be a bit more memory-efficient by reducing the
> number of internal map nodes. I’ll definitely include this refactoring as
> part of the FIP-30 implementation.
>
> Thanks again for helping refine the design!
>
> Best Regards,
> SeungMin Lee
>
> On Sat, Feb 14, 2026 at 2:22 AM, Mehul Batra <[email protected]> wrote:
>
>> Hi SeungMin Lee,
>>
>> First of all, thank you for putting together FIP-30. The ability to track
>> tiering status is a much-needed feature, and I appreciate the thorough
>> design work that went into this proposal.
>>
>> After reviewing the FIP, I have a few thoughts and questions I'd like to
>> raise for discussion. These are suggestions based on my understanding. I
>> may be missing context, so please feel free to correct me if any of these
>> points have already been considered.
>>
>> 1. Per-Table System Table vs Global System Table
>>
>> The proposal introduces both:
>>
>> - Global view: `fluss_catalog.sys.lake_tiering_status`
>> - Per-table view: `my_db.my_table$tiering_status`
>>
>> I was wondering if we could simplify the initial implementation by
>> focusing on the per-table `$tiering_status` virtual table for SQL access,
>> while relying on the `listTieringStatuses()` Admin API for
>> bulk/system-wide queries.
>>
>> My reasoning:
>>
>> - Consistency: The per-table pattern (`$tiering_status`) aligns with
>>   Fluss's existing virtual table conventions, such as `$changelog`,
>>   `$binlog`, etc.
>> - Scalability: A global SQL table querying thousands of tables could have
>>   performance implications. The Admin API seems better suited for bulk
>>   operations with potential pagination support.
>>
>> A phased approach (Phase 1: per-table SQL, Phase 2: Admin API) could ship
>> value to users faster with reduced initial scope.
>>
>> That said, I may be underestimating the need for the global SQL table. Are
>> there specific use cases that would be difficult to serve with just the
>> Admin API?
>>
>> 2. Error Message Truncation Strategy
>>
>> The proposal mentions truncating error messages to 2-4KB before sending
>> them to the Coordinator. I have a concern about simple head truncation
>> potentially removing the most useful diagnostic information.
>>
>> Are we considering an extraction strategy to deal with this? I have
>> something like the following in mind:
>>
>> - Smart extraction: Parse and extract all "Caused by:" lines, which
>>   typically contain the most actionable information
>>
>> I understand this adds complexity, so it's a trade-off. Curious to hear
>> others' thoughts on whether this is worth addressing.
>>
>> 3. Consolidating State Maps in LakeTableTieringManager
>>
>> The proposal adds `tieringFailMessages` and `tieringFailTimes` maps to
>> `LakeTableTieringManager`. Looking at the current implementation, the
>> manager already maintains 6+ separate maps keyed by `tableId`:
>>
>> ```java
>> Map<Long, TieringState> tieringStates;
>> Map<Long, TablePath> tablePaths;
>> Map<Long, Long> tableLakeFreshness;
>> Map<Long, Long> tableTierEpoch;
>> Map<Long, Long> tableLastTieredTime;
>> Map<Long, Long> liveTieringTableIds;
>> // Proposed additions:
>> Map<Long, String> tieringFailMessages;
>> Map<Long, Long> tieringFailTimes;
>> ```
>>
>> One thought: would it be cleaner to consolidate these into a single
>> TableTieringInfo object?
>>
>>     Map<Long, TableTieringInfo> tableInfos;
>>
>>     class TableTieringInfo {
>>         TablePath tablePath;
>>         long lakeFreshness;
>>         TieringState state;
>>         long tieringEpoch;
>>         long lastTieredTime;
>>         @Nullable String lastError;
>>         @Nullable Long lastErrorTime;
>>     }
>>
>> Potential benefits:
>>
>> - Single map lookup instead of multiple
>> - Related state updated together naturally
>> - Cleaner cleanup in removeLakeTable() (one removal vs. 8)
>>
>> This could be a separate preparatory refactoring PR or part of FIP-30.
>> However, I understand this might be out of scope for this FIP, and I
>> don't want to expand the scope unnecessarily. Just raising it as a
>> thought for the authors to consider.
>>
>> These are just suggestions based on my reading of the proposal. I'm happy
>> to be corrected if I've misunderstood anything. Also happy to help with
>> implementation or further discussion if useful.
>>
>> Thanks again for driving this important feature!
>>
>> Best regards,
>> Mehul Batra
>>
>> On Thu, Feb 12, 2026 at 5:53 PM SeungMin Lee <[email protected]> wrote:
>>
>> > Hi dev,
>> >
>> > Just a quick update.
>> >
>> > I have migrated the design Google Doc to the cwiki and registered it as
>> > *FIP-30*. Please refer to the link below for the formal proposal:
>> >
>> > https://cwiki.apache.org/confluence/display/FLUSS/FIP-30%3A+Support+tracking+the+tiering+status+of+a+tiering+table
>> >
>> > The content remains consistent with the previous Google Doc.
>> >
>> > Best regards,
>> > SeungMin Lee
>> >
>> > On Thu, Feb 12, 2026 at 5:37 PM, SeungMin Lee <[email protected]> wrote:
>> > >
>> > > Hi, dev
>> > >
>> > > Currently, there is no way for users to check the status of lake
>> > > tiering. Users have no way of knowing when tiering fails, and they
>> > > have to manually parse the Tiering Service logs to identify the cause.
>> > >
>> > > So, I'd like to propose Issue-2362 [1], "Allow users to track the
>> > > tiering status of a tiering table", to address this visibility issue.
>> > >
>> > > I have drafted a design doc [2]. Please feel free to review it and
>> > > share your feedback.
>> > >
>> > > Considering the upcoming holidays in some regions, I'll wait for
>> > > feedback and give a ping on this thread around Feb 23rd.
>> > >
>> > > Looking forward to your thoughts.
>> > >
>> > > Best regards,
>> > > SeungMin Lee
>> > >
>> > > [1] https://github.com/apache/fluss/issues/2362
>> > > [2] https://docs.google.com/document/d/1eJbRCwzAbeJLA97zQQ0I3JM1jerBXXhq69Dn8r4xWV0/edit?usp=sharing
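
For reference, the "smart extraction" idea from point 2 of this thread (keep the "Caused by:" lines instead of blindly truncating) could be sketched roughly as follows. This is only a minimal illustration under assumptions: the class name `CausedByExtractor`, the method `extract`, and the `maxChars` parameter are hypothetical and not part of Fluss or FIP-30, and the input is assumed to be a standard Java stack trace rendered as text.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of Fluss): summarizes a textual stack trace
 * by keeping its first line plus every "Caused by:" line.
 */
final class CausedByExtractor {

    private static final String CAUSED_BY = "Caused by:";

    private CausedByExtractor() {}

    /**
     * Returns the first line and all "Caused by:" lines joined by '\n'.
     * If the summary is still longer than maxChars (an assumed size limit),
     * it falls back to plain head truncation.
     */
    static String extract(String stackTrace, int maxChars) {
        if (stackTrace == null || stackTrace.isEmpty() || maxChars <= 0) {
            return "";
        }
        List<String> kept = new ArrayList<>();
        // "\\R" matches any line break (\n, \r\n, ...).
        String[] lines = stackTrace.split("\\R");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i].trim();
            // The first line holds the top-level exception; the
            // "Caused by:" lines hold the chained root causes.
            if (i == 0 || line.startsWith(CAUSED_BY)) {
                kept.add(line);
            }
        }
        String summary = String.join("\n", kept);
        return summary.length() <= maxChars
                ? summary
                : summary.substring(0, maxChars);
    }
}
```

With a limit in the 2-4KB range mentioned in the proposal, a call such as `CausedByExtractor.extract(trace, 4096)` would drop the `at ...` frames and keep only the exception chain. Whether frames near each cause should also be retained is a design choice this sketch leaves open.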
