ulysses-you commented on pull request #2225: URL: https://github.com/apache/incubator-kyuubi/pull/2225#issuecomment-1079884217
> Can ForcedMaxOutputRowsRule be replaced by this one It's different in physical side. `dataset.take` just like the incremental mode which run job partition by partition if previous partitions data has not satisfied the limit number. `dataset.limit.collect` will run all partitions to do local limit then do global limit in one partition. So in general `dataset.limit.collect` is more effective. And this PR is useful. Since the limitation is controlled by client side using Incremental mode, but we can control it at engine side using `dataset.take`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
