apilloud commented on PR #16911: URL: https://github.com/apache/beam/pull/16911#issuecomment-1090934645
The question is essentially what the purpose of the ZetaSQL translation library is.

An `AggregateScan` represents a SQL query with a GROUP BY, for example `select value*2, sum(key+1) from KeyValue group by value`. This would give a groupByList of [value*2] and an aggregate_list of [sum(key+1)].

The LogicalProject is the first step in simplifying the data; it basically translates to a simple ParDo. We take the input data and apply any simple per-row transforms, so you get: LogicalProject [value*2, key+1]. The input to the LogicalProject may have other columns, which are dropped. After the LogicalProject you are left with a simpler SQL query: `SELECT col1, sum(col2) from LogicalProject group by col1`.

In `AnalyticScan`, the equivalent appears to be `functionGroupList`.
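To make the two-step split concrete, here is a rough sketch in plain Python. It is not Beam or ZetaSQL code. The sample rows and the function names `logical_project` and `aggregate` are made up for illustration. It only shows that the projection evaluates the per-row expressions and drops unused columns, so that the aggregation only has to reference plain columns.

```python
from collections import defaultdict

# Hypothetical sample rows for the KeyValue table; the data is invented
# for illustration, including an extra column the query never uses.
key_value = [
    {"key": 1, "value": 10, "unused": "a"},
    {"key": 2, "value": 10, "unused": "b"},
    {"key": 3, "value": 20, "unused": "c"},
]


def logical_project(rows):
    """Step 1: per-row expressions only, comparable to a simple ParDo.

    Emits (col1, col2) = (value*2, key+1) and drops every other column.
    """
    return [(row["value"] * 2, row["key"] + 1) for row in rows]


def aggregate(projected):
    """Step 2: SELECT col1, sum(col2) FROM LogicalProject GROUP BY col1.

    Only plain column references remain; no expressions are evaluated here.
    """
    sums = defaultdict(int)
    for col1, col2 in projected:
        sums[col1] += col2
    return dict(sums)


projected = logical_project(key_value)
result = aggregate(projected)
print(projected)
print(result)
```

The point of the split is that all expression evaluation lives in the projection, so the grouping step can stay generic.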
