[GitHub] [beam] kennknowles opened a new issue, #19023: Enhanced LIMIT support

GitBox Fri, 03 Jun 2022 15:03:56 -0700


kennknowles opened a new issue, #19023:
URL: https://github.com/apache/beam/issues/19023


   Currently, Beam SQL supports LIMIT in two ways:
   
   1. Within a query, the results are subject to LIMIT. This works.
   2. The shell cancels the pipeline once the limit is reached, even if 
unbounded sources are still producing data.
   
   Cancellation relies on a basic pattern match against the query execution 
plan: it checks a few child nodes of the BeamEnumerableConverter for a 
BeamSortRel without a collation. If that match reveals the limit of the 
outermost query, the shell cancels the pipeline once the limit is reached.
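
To make the fragility of that match concrete, here is a minimal sketch of a shallow plan inspection. It uses a toy `Node` class rather than Calcite's real `RelNode` API; the names `Node`, `shallow_limit`, and the `max_depth` cutoff are all hypothetical and only approximate the behavior described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Toy plan node; a stand-in for a RelNode, not the real Calcite API."""
    kind: str                       # e.g. "Converter", "Sort", "Project", "Scan"
    fetch: Optional[int] = None     # LIMIT count, only meaningful on "Sort"
    has_collation: bool = False     # True when the sort carries ORDER BY keys
    inputs: List["Node"] = field(default_factory=list)


def shallow_limit(root: Node, max_depth: int = 2) -> Optional[int]:
    """Look only a few levels below the root for a limit-only sort node."""
    node, depth = root, 0
    while node.inputs and depth < max_depth:
        node = node.inputs[0]
        depth += 1
        if node.kind == "Sort" and not node.has_collation and node.fetch is not None:
            return node.fetch
    return None  # limit not found within the inspected levels
```

With this kind of check, a limit sitting directly under the converter is found, but the same limit wrapped in a few extra projection layers goes undetected.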
   
   A more robust approach might be to use traits (or some other thorough 
analysis) to see if there is a known size for the outermost query. This would, 
for example, be unaffected by any number of layers of non-size-changing 
transformations.
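
One way to picture such an analysis is a recursive walk that propagates a known limit upward through size-preserving nodes. The sketch below again uses a toy `Node` class, not Calcite's trait or metadata machinery; `SIZE_PRESERVING`, `known_limit`, and the node kinds are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Node kinds assumed to emit exactly as many rows as they receive
# (illustrative only; a real analysis would decide this per operator).
SIZE_PRESERVING = {"Converter", "Project"}


@dataclass
class Node:
    """Toy plan node; a stand-in for a RelNode, not the real Calcite API."""
    kind: str
    fetch: Optional[int] = None     # LIMIT count, only meaningful on "Sort"
    inputs: List["Node"] = field(default_factory=list)


def known_limit(node: Node) -> Optional[int]:
    """Return the row limit known for this node's output, or None if unknown."""
    if node.kind == "Sort" and node.fetch is not None:
        return node.fetch
    if node.kind in SIZE_PRESERVING and len(node.inputs) == 1:
        # A size-preserving node inherits whatever is known about its input.
        return known_limit(node.inputs[0])
    return None  # any other operator may change the row count
```

Because the walk recurses rather than inspecting a fixed number of levels, the limit is still found under any number of `Project` layers, while an operator outside `SIZE_PRESERVING` (a filter, say) conservatively yields `None`.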
   
   Imported from Jira 
[BEAM-4719](https://issues.apache.org/jira/browse/BEAM-4719). Original Jira may 
contain additional context.
   Reported by: kenn.


