arunmahadevan edited a comment on issue #23576: [SPARK-26655] [SS] Support multiple aggregates in append mode
URL: https://github.com/apache/spark/pull/23576#issuecomment-461542627

> * I have a plan tree A: EventTimeExec -> B: StatefulOperator -> C: StatefulOperator. Can C use the watermark in A? If so, is it safe to do that when B transforms or projects away the watermarked column - if not, what are the rules for how watermarks must be provided with multiple aggregates?

Typically C cannot, since A is the input watermark of B and, assuming B does some aggregation, B needs to emit a new watermark. There's a new check in the `UnsupportedOperationChecker` that verifies each aggregate's grouping expression has an event time watermark attribute, which kind of enforces this. So one would have to explicitly specify a timestamp output column and a second watermark, like:

```scala
input.withWatermark("ts", ...)
  .groupBy(window($"ts", ...), $"key").count()
  .select($"window.end" as "windowts", $"count")
  .withWatermark("windowts", ...)
  .groupBy(window($"windowts", ...), $"count").count()
```

> * Do all of our optimization and execution rules respect the semantics of operator watermarks?

Need to check whether it would interfere with multiple watermarks or whether we need any new rules.

> * We can currently call `withWatermark` at any point in the query plan. Is this consistent with operator watermarks? Even if we can support the two of them together, do we want to?

I thought `withWatermark` should be called before the `groupBy` so that the grouping attribute will have a watermark; otherwise it fails in the `UnsupportedOperationChecker`. With multiple aggregates, it should be called before each aggregate.
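
To make the rule above concrete, here is a minimal toy model of the kind of check described: every aggregate must group on at least one column that carries a watermark, and a watermark does not pass through an aggregate unless it is declared again. The types and names here (`Plan`, `Source`, `WithWatermark`, `Aggregate`, `violations`) are hypothetical and are not Spark's internal classes or the PR's actual implementation. The assumption that an aggregate's output carries no watermark is mine, made to mirror the "second watermark" requirement.

```scala
object WatermarkCheckSketch {
  // Hypothetical, simplified plan nodes; NOT Spark's logical plan classes.
  sealed trait Plan
  case object Source extends Plan
  // Declares that `column` carries an event-time watermark from here upward.
  final case class WithWatermark(column: String, child: Plan) extends Plan
  // Groups on the named columns. Assumption: its output carries no watermark.
  final case class Aggregate(grouping: Seq[String], child: Plan) extends Plan

  // Columns that carry a watermark in the output of `plan`.
  def watermarked(plan: Plan): Set[String] = plan match {
    case Source                   => Set.empty
    case WithWatermark(col, child) => watermarked(child) + col
    case Aggregate(_, _)           => Set.empty
  }

  // One message per aggregate whose grouping columns include no watermarked column.
  def violations(plan: Plan): List[String] = plan match {
    case Source                  => Nil
    case WithWatermark(_, child) => violations(child)
    case Aggregate(grouping, child) =>
      val available = watermarked(child)
      val here =
        if (grouping.exists(available)) Nil
        else List(s"aggregate on (${grouping.mkString(", ")}) has no watermarked grouping column")
      violations(child) ++ here
  }

  // Mirrors the query shape in the comment: watermark, aggregate, watermark, aggregate.
  val chained: Plan =
    Aggregate(Seq("windowts", "count"),
      WithWatermark("windowts",
        Aggregate(Seq("ts", "key"),
          WithWatermark("ts", Source))))

  // Same query without the second withWatermark: the outer aggregate is rejected.
  val missingSecondWatermark: Plan =
    Aggregate(Seq("windowts", "count"),
      Aggregate(Seq("ts", "key"),
        WithWatermark("ts", Source)))
}
```

In this sketch, `violations(chained)` is empty, while `violations(missingSecondWatermark)` reports the outer aggregate, which matches the behaviour described for calling `withWatermark` before each aggregate.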
