Thanks for digging into this.

Regarding this query:

    INSERT INTO the_table
    SELECT window_end, COUNT(*)
    FROM TABLE(TUMBLE(TABLE interactions, DESCRIPTOR(ts), INTERVAL '5' MINUTES))
    GROUP BY window_end
    HAVING now() - window_end <= INTERVAL '14' DAYS;

I am not sure I understand what the conclusion is on the data retention question, i.e. the case where the continuous streaming SQL query itself has retention semantics. I think we would need to answer the following questions (I will call the query that computes the managed table the "view materializer query", VMQ).

(1) I guess the VMQ will send no more updates for a window once its retention period (14 days) is over, as you said. That makes sense.

(2) Will the VMQ send retractions so that the data will be removed from the table (via compactions)?
  - If yes, this seems semantically better for users, but it will be expensive to keep the timers for the retractions.
  - If not, we can still solve this by adding filters to queries against the managed table, as long as these queries run in Flink.
  - Any subscriber to the changelog stream would not see a strictly correct result if we are not doing the retractions.

(3) Do we want time retention semantics handled by the compaction?
  - If we say that we lazily apply the deletes in the queries that read the managed tables, then we could also age out the old data during compaction.
  - That is cheap, but it might be too much of a special case to be very relevant here.

(4) Do we want to declare those types of queries "out of scope" initially?
  - If yes, how many users are we affecting? (I guess probably not many, but it would be good to hear some thoughts from others on this.)
  - Should we simply reject such queries in the optimizer as "not possible to support in managed tables"? I would suggest that. It is always better to tell users exactly what works and what does not, rather than letting them be surprised in the end. Users can still remove the HAVING clause if they want the query to run, and that would be better than the VMQ silently ignoring those semantics.

Thanks,
Stephan
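
P.S. To make the "if not" branch of (2) concrete, here is a rough sketch of what a read-side filter could look like when the VMQ sends no retractions. This is only an illustration, not a proposal for the actual mechanism. The view name the_table_retained and the column name cnt (for the COUNT(*) result) are hypothetical, and I am assuming window_end is a TIMESTAMP(3) so that it can be compared against LOCALTIMESTAMP directly.

```sql
-- Sketch only: lazily hide rows older than the 14-day retention period at read time.
-- `the_table_retained` and `cnt` are hypothetical names, not part of the original query.
CREATE VIEW the_table_retained AS
SELECT window_end, cnt
FROM the_table
-- Assumes window_end is TIMESTAMP(3); mirrors the HAVING bound of the VMQ.
WHERE window_end >= LOCALTIMESTAMP - INTERVAL '14' DAY;
```

A filter like this only helps readers that go through Flink. Anyone consuming the changelog stream directly would still see the expired windows, which is the correctness gap mentioned in (2).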