polyzos commented on PR #1702:
URL: https://github.com/apache/fluss/pull/1702#issuecomment-3305439761

   @platinumhamburg Thank you so much for the detailed review and all the 
great feedback you have provided.
   
   Some context here: the goal was to use this feature for realtime dashboard 
use cases (realtime user tracking, fleet management, etc.) where the key space 
is typically small due to visualization needs, i.e. from a few rows up to a 
few thousand, with something like 1-5k rows as a threshold. At first I was 
thinking of making the threshold configurable and throwing an exception so 
that the user cannot use this feature above it (to avoid OOM for large 
tables). Eventually I thought of adding just a note in the docs, but it seems 
like it's better to add the threshold check.
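   
   A minimal sketch of the threshold idea, just to illustrate the intent. 
`SnapshotGuard`, `maxRows`, and `collect` are hypothetical names and are not 
part of the Fluss API or of this PR; the row source is modeled as a plain 
`Iterator`, which is also an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical guard: fails fast instead of buffering an oversized snapshot. */
final class SnapshotGuard {
    private final int maxRows;

    SnapshotGuard(int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        this.maxRows = maxRows;
    }

    /** Collects all rows, throwing as soon as the source exceeds maxRows. */
    <T> List<T> collect(Iterator<T> rows) {
        List<T> out = new ArrayList<>();
        while (rows.hasNext()) {
            if (out.size() >= maxRows) {
                // One more row is available than the threshold allows.
                throw new IllegalStateException(
                        "Table has more than " + maxRows
                                + " rows; full snapshot is not allowed");
            }
            out.add(rows.next());
        }
        return out;
    }
}
```

   The check runs while rows are being read, so memory stays bounded by the 
threshold even when the table size is not known up front.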
   
   Similarly for the API design, I was really unsure about the naming 
and where the API should live. I thought of the scanner at first, but in my 
mind it's more about unbounded / continuous data. The lookuper is about 
lookups on the kv table, and I thought of treating this as a "whole database" 
lookup.
   
   Moreover, because the intention was to return a full snapshot of the table 
as is (i.e. it's an aggregated table with only the required columns), column 
pruning and predicate pushdown were not really taken into account. The same 
goes for partition pruning, because only the latest partition should have the 
latest desired values.
   
   Seems like I made quite a few assumptions in terms of correct usage! Let me 
know your thoughts on the above and I will close this PR and proceed with 
creating a design proposal first. 🙏 Again, thank you so much for all this 
great feedback 🙇‍♂️ 
   
   cc @wuchong 

