[
https://issues.apache.org/jira/browse/BEAM-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-9198:
----------------------------------
This Jira ticket has a pull request attached to it, but is still open. Did the
pull request resolve the issue? If so, could you please mark it resolved? This
will help the project have a clear view of its open issues.
> BeamSQL aggregation analytics functionality
> --------------------------------------------
>
> Key: BEAM-9198
> URL: https://issues.apache.org/jira/browse/BEAM-9198
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Rui Wang
> Priority: P3
> Labels: gsoc, gsoc2020, mentor
> Time Spent: 10h
> Remaining Estimate: 0h
>
> Mentor email: [email protected]. Feel free to send emails for your questions.
> Project Information
> ---------------------
> BeamSQL has a long list of aggregation and analytic-aggregation
> functions still to support.
> To begin with, you will need to support this syntax:
> {code:sql}
> analytic_function_name ( [ argument_list ] )
>   OVER (
>     [ PARTITION BY partition_expression_list ]
>     [ ORDER BY expression [{ ASC | DESC }] [, ...] ]
>     [ window_frame_clause ]
>   )
> {code}
> As there is a long list of analytic functions, a good starting point is to
> support rank() first.
> This will require touching core components of BeamSQL:
> 1. SQL parser to support the syntax above.
> 2. SQL core to implement the physical relational operator.
> 3. Distributed algorithms to implement a list of functions in a distributed
> manner.
> 4. Enable in ZetaSQL dialect.
> To understand what SQL analytic functions are, you could check this
> explanation doc:
> https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts.
> To know about Beam's programming model, check:
> https://beam.apache.org/documentation/programming-guide/#overview
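For readers unfamiliar with the semantics the issue asks for, here is a minimal
single-machine sketch of what rank() OVER (PARTITION BY ... ORDER BY ...) is
expected to compute. It is not BeamSQL code and not a distributed algorithm;
the function name rank_over and its parameters partition_key, order_key, and
descending are hypothetical names chosen for illustration only.

```python
from collections import defaultdict
from typing import Any, Callable, Iterable, List, Tuple


def rank_over(
    rows: Iterable[Any],
    partition_key: Callable[[Any], Any],  # hypothetical: stands in for PARTITION BY
    order_key: Callable[[Any], Any],      # hypothetical: stands in for ORDER BY
    descending: bool = False,             # hypothetical: ASC (False) or DESC (True)
) -> List[Tuple[Any, int]]:
    """Return (row, rank) pairs using standard SQL RANK() semantics.

    Rows that tie on the ordering key share a rank, and the next distinct
    key skips ahead by the number of tied rows (1, 1, 3, ...).
    """
    # Group rows by partition; dicts keep first-seen partition order.
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row)].append(row)

    ranked: List[Tuple[Any, int]] = []
    for part_rows in partitions.values():
        # sorted() is stable, so tied rows keep their input order.
        ordered = sorted(part_rows, key=order_key, reverse=descending)
        rank = 0
        prev_key = None
        for position, row in enumerate(ordered, start=1):
            key = order_key(row)
            # A new ordering key takes the current 1-based position as its rank.
            if position == 1 or key != prev_key:
                rank = position
                prev_key = key
            ranked.append((row, rank))
    return ranked
```

In a distributed setting the grouping step would presumably map onto a
per-partition-key grouping, with the sort and rank assignment done per group;
that mapping is an assumption here, not something the issue specifies.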
--
This message was sent by Atlassian Jira
(v8.20.1#820001)