Setting CombineFn aside and looking only at the aggregation function
itself, I find it hard to model an aggregation function that aggregates
over more than one column (although an aggregation function can take
multiple non-aggregate parameters).

The pair idea (in SQL it could be a struct) makes more sense. You can
pre-construct a column of struct type that holds both values, then apply
a UDAF to that column.
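
As a rough illustration of the pair idea, here is a minimal plain-Java
sketch of a VWAP aggregation whose single input is a (price, volume) pair.
It does not use the Beam API; the class and method names (VwapSketch,
PriceVolume, Accum, vwap) are made up for this example. The four static
methods only mirror the shape of CombineFn (createAccumulator, addInput,
mergeAccumulators, extractOutput), so treat it as a sketch rather than a
working UDAF.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical plain-Java sketch: VWAP over a single struct-like input. */
final class VwapSketch {

    /** The "pair" built from the price and volume columns (illustrative type). */
    static final class PriceVolume {
        final double price;
        final double volume;

        PriceVolume(double price, double volume) {
            this.price = price;
            this.volume = volume;
        }
    }

    /** Running totals needed to compute VWAP = sum(price * volume) / sum(volume). */
    static final class Accum {
        double notional; // sum of price * volume
        double volume;   // sum of volume
    }

    static Accum createAccumulator() {
        return new Accum();
    }

    /** Folds one (price, volume) pair into the accumulator. */
    static Accum addInput(Accum acc, PriceVolume input) {
        acc.notional += input.price * input.volume;
        acc.volume += input.volume;
        return acc;
    }

    /** Combines partial accumulators, e.g. ones built on different workers. */
    static Accum mergeAccumulators(Iterable<Accum> accums) {
        Accum merged = createAccumulator();
        for (Accum a : accums) {
            merged.notional += a.notional;
            merged.volume += a.volume;
        }
        return merged;
    }

    /** Returns the VWAP, or null when no volume was seen. */
    static Double extractOutput(Accum acc) {
        return acc.volume == 0.0 ? null : acc.notional / acc.volume;
    }

    /** Convenience helper that runs the whole aggregation over a list of pairs. */
    static Double vwap(List<PriceVolume> rows) {
        Accum acc = createAccumulator();
        for (PriceVolume row : rows) {
            addInput(acc, row);
        }
        return extractOutput(acc);
    }

    public static void main(String[] args) {
        List<PriceVolume> rows = Arrays.asList(
            new PriceVolume(10.0, 100.0),
            new PriceVolume(20.0, 300.0));
        System.out.println(vwap(rows));
    }
}
```

Because the accumulator only keeps two sums, partial results can be merged
in any order, which is what makes this shape suitable for a combiner.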


-Rui

On Fri, Jan 8, 2021 at 8:52 AM Dan Pettersson <dan.pettersso...@gmail.com>
wrote:

> Hi everyone,
>
> I'm quite new to Beam but I'm impressed by the work that you have done
> on the analytics functions and MATCH_RECOGNIZE over the last 6 months or
> so.
>
> I've been reading through the website about UDF and UDAF but I can't find
> an answer to my question, which is whether one can combine two or more
> parameters into a UD-/AF?
> I'm working in finance where for example VWAP on time series data is an
> important metric
> and for that metric you need both the price and the volume to get an
> accurate result.
>
> I've looked into
> https://github.com/GoogleCloudPlatform/dataflow-sample-applications/tree/master/timeseries-streaming
> which is a nice library for this kind of logic/metrics, but I still think
> that SQL could be a better fit for these kinds of financial calculations.
>
> Therefore I wonder if it is possible to create a UD-/AF that takes for
> example volume and price as two parameters and via a window function create
> an accumulator that gives back the result?
> My question is therefore if you can use a PAIR<volume, price> in an SQL
> statement to overcome the limitation that CombineFn only takes one input
> parameter?
>
> Like this:
>
> SELECT account, vwap(trans.price, trans.volume)
>    FROM transactions AS trans
>
> or is one restricted to only use one inputParam in Beam SQL?
>
> Thanks in advance,
>
> /Dan
>
