Hi Anton, thanks for the review. These are great thoughts worth thinking through carefully. I want to engage with each point directly, but also raise a concern about what I think the bar for blocking this implementation should be, given where we are in the voting process.
*On the session lifecycle machinery being a burden*

The ScannerManager with TTL eviction, per-bucket/per-server limits, and leadership-change cleanup is straightforward. Fluss already carries analogous lifecycle machinery for log fetch sessions and snapshot resource leases, so I don't think the operational cost here is qualitatively different from what we already manage. It is well understood, it has tests, and the configuration surface is minimal (kv.scanner.ttl, kv.scanner.max-per-bucket, kv.scanner.max-per-server). I would not call the presence of this machinery a design flaw.

*On the continuation-token / Cassandra-style alternative*

This is a great suggestion and I want to give it a fair hearing. You are correct that an opaque server-produced token shifts the client-side contract in a nice way: the client never constructs or interprets position bytes, it just echoes them back. That is a potential ergonomic improvement.

However, for Fluss's primary scan use cases the token approach has a non-trivial correctness cost. The log_offset we return on the first response is a commitment: this scan reflects KV state as of this log position. That guarantee is what makes the CDC bootstrap pattern correct: you read all KV rows, then replay the log from log_offset, and you know the two halves are consistent. A purely stateless token approach (fresh snapshot per page) breaks this guarantee, because you cannot claim a log_offset that covers a scan assembled from pages of different snapshots. A hybrid (keep the snapshot alive, drop the iterator) preserves the guarantee but still requires server-side state: you have traded the iterator for a pinned snapshot, so the ScannerManager does not go away, it just manages slightly fewer bytes per session, and a seek cost per batch is added on top.

So Option B is not strictly simpler. For the use cases that matter it is roughly the same complexity as Option A, with an added seek per batch and without the live iterator's natural progress tracking.
Option C (fully stateless, no snapshot isolation) is simpler, and it is a valid approach for bulk export or admin tooling. I don't think it is appropriate as the primary mode for this FIP's target use cases, though.

*On retries*

You are correct that a mid-scan leader failover cannot be recovered transparently under the current design: the session is gone and the client must restart. That is a limitation, but I would argue it is not a correctness problem; it is consistent with how Flink handles source restarts in general (replay from checkpoint). The callSeqId mechanism cleanly handles the more common case, a transient network failure with the same leader alive.

Improving the failover story is a good follow-up item. A resume_hint field on the response (the last key served, as opaque bytes) would let a client that detects session loss open a new session and skip already-processed rows, without changing the core protocol. This is purely additive and does not require redesigning the session model.

*On the FIP needing an alternatives comparison*

An alternatives section would indeed be useful. That said, this FIP has been open for some time, the VOTE closes today, and this is a user-facing feature whose exact value we won't know until users can try it. I would prefer to close the VOTE, get something out there, and iterate.

The points you and Lorenzo raised are real engineering tradeoffs, and I appreciate them being surfaced. But I do not think any of them represents a correctness flaw in the current design, a safety hazard, or a decision that forecloses future evolution. The live-session model is sound for the target use cases, the machinery is bounded and tested, and the points raised are things we can iterate on. Adding the alternatives section to the FIP addresses the most concrete ask, and I would like to proceed with the vote on that basis.
If anyone has a specific concern they believe is blocking (a correctness issue, a protocol commitment we cannot evolve away from, or an operational risk we haven't accounted for), please leave your VOTE as -1 and we can park this for now. Otherwise, let me know your thoughts; as I mentioned, I would like to proceed with the VOTE since it is due today, and we can park it afterwards if you do think these are blocking issues.

Best,
Giannis
