kennknowles opened a new issue, #19023: URL: https://github.com/apache/beam/issues/19023
Currently, Beam SQL supports LIMIT in two ways:

1. Within a query, the results are subject to LIMIT. This works.
2. The shell knows to cancel a pipeline when the limit is reached, even if there is unfinished unbounded data.

The canceling of a pipeline works via a basic pattern match against the query execution plan, checking a few child nodes of the BeamEnumerableConverter for a BeamSortRel without a collation. If it can figure out what the limit is for the outermost query, then it will cancel the pipeline.

A more robust approach might be to use traits (or some other thorough analysis) to see if there is a known size for the outermost query. This would, for example, be unaffected by any number of layers of non-size-changing transformations.

Imported from Jira [BEAM-4719](https://issues.apache.org/jira/browse/BEAM-4719). Original Jira may contain additional context. Reported by: kenn.
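To illustrate the difference between the two approaches described above, here is a minimal toy sketch. It does not use Beam or Calcite APIs: `Node`, `SIZE_PRESERVING`, `shallow_limit`, and `known_size` are hypothetical names invented for this example, and the plan is modeled as a simple single-child chain. It only shows why a fixed-depth pattern match can miss a limit that a recursive size analysis would still find.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Toy stand-in for a plan node (not a real Beam/Calcite class)."""
    kind: str                       # e.g. "limit", "project", "scan"
    child: Optional["Node"] = None
    limit: Optional[int] = None     # only meaningful when kind == "limit"


# Assumption for this sketch: these operators never change the row count.
SIZE_PRESERVING = {"project"}


def shallow_limit(root: Optional[Node], max_depth: int = 2) -> Optional[int]:
    """Mimic a fixed-depth pattern match: inspect only the first few nodes."""
    node = root
    for _ in range(max_depth):
        if node is None:
            return None
        if node.kind == "limit":
            return node.limit
        node = node.child
    return None


def known_size(node: Optional[Node]) -> Optional[int]:
    """Propagate a known row bound upward through size-preserving nodes."""
    if node is None:
        return None
    if node.kind == "limit":
        inner = known_size(node.child)
        return node.limit if inner is None else min(node.limit, inner)
    if node.kind in SIZE_PRESERVING:
        return known_size(node.child)
    return None  # unknown size, e.g. an unbounded scan


# A limit directly under one projection: both approaches find it.
shallow = Node("project", Node("limit", Node("scan"), 10))

# The same limit under three projections: only the recursive analysis finds it.
deep = Node("project", Node("project", Node("project",
            Node("limit", Node("scan"), 10))))
```

In this toy model, `shallow_limit(deep)` returns `None` because the limit sits below the inspected depth, while `known_size(deep)` still returns the limit. A trait-based analysis in a real planner would be more involved, but the propagation idea is the same.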
