Even setting CombineFn aside and thinking only about the aggregation function itself, I find it hard to model an aggregation function that aggregates over more than one column (though an aggregation function can take multiple non-aggregate parameters).
The pair idea (in SQL it could be a struct) makes more sense. You can pre-construct a column with struct type that contains the data. Then apply a UDAF on that column.

-Rui

On Fri, Jan 8, 2021 at 8:52 AM Dan Pettersson <dan.pettersso...@gmail.com> wrote:

> Hi everyone,
>
> I'm quite new to Beam but I'm impressed by the work that you have done
> on the analytics functions and MATCH_RECOGNIZE like the last 6 months or
> so.
>
> I've been reading through the website about UDF and UDAF but I can't find
> an answer to my question, which is if one can combine more than two
> parameters into a UD-/AF?
> I'm working in finance where for example VWAP on time series data is an
> important metric
> and for that metric you need both the price and the volume to get an
> accurate result.
>
> I've looked into
> https://github.com/GoogleCloudPlatform/dataflow-sample-applications/tree/master/timeseries-streaming
> which is a nice library for this kind of logic-/metrics, but I still think
> that SQL could be a better fit for this kind of financial calculations.
>
> Therefore I wonder if it is possible to create a UD-/AF that takes for
> example volume and price as two parameters and via a window function create
> an accumulator that gives back the result?
> My question is therefore if you can use a PAIR<volume, price> in an SQL
> statement to overcome the limitation that CombineFn only takes one input
> parameter?
>
> Like this:
>
> SELECT account, vwap(trans.price, trans.volume)
> FROM transactions AS trans
>
> or is one restricted to only use one inputParam in Beam SQL?
>
> Thanks in advance,
>
> /Dan
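
To make the struct idea concrete, here is a minimal sketch of a VWAP aggregation that consumes a single struct-like input holding both price and volume. It is plain Python and does not import Beam. The method names mirror the shape of a CombineFn (create, add, merge, extract), but `Trade`, `VwapFn`, and `vwap` are hypothetical names chosen for illustration, and how such a function would be registered as a Beam SQL UDAF is not shown.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Trade(NamedTuple):
    """Hypothetical struct-like input: one value carrying both columns."""
    price: float
    volume: float


# Accumulator: (sum of price * volume, sum of volume)
Accumulator = Tuple[float, float]


class VwapFn:
    """Sketch of a single-input aggregation shaped like a CombineFn."""

    def create_accumulator(self) -> Accumulator:
        return (0.0, 0.0)

    def add_input(self, acc: Accumulator, trade: Trade) -> Accumulator:
        notional, volume = acc
        return (notional + trade.price * trade.volume, volume + trade.volume)

    def merge_accumulators(self, accs: Iterable[Accumulator]) -> Accumulator:
        notional, volume = 0.0, 0.0
        for n, v in accs:
            notional += n
            volume += v
        return (notional, volume)

    def extract_output(self, acc: Accumulator) -> Optional[float]:
        notional, volume = acc
        # No volume means VWAP is undefined; return None rather than divide by zero.
        return notional / volume if volume else None


def vwap(trades: Iterable[Trade]) -> Optional[float]:
    """Run the aggregation locally over an iterable of trades."""
    fn = VwapFn()
    acc = fn.create_accumulator()
    for trade in trades:
        acc = fn.add_input(acc, trade)
    return fn.extract_output(acc)
```

Because the accumulator only holds two running sums, partial results computed on different workers can be merged in any order, which is the property a combine-style aggregation needs.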