Parallelizing a tumbling group window

Colin Williams Fri, 08 Dec 2017 18:07:46 -0800

Hello,

I've inherited some flink application code.


We're currently creating a table using a Tumbling SQL query similar to the
first example in

 https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/table/sql.
html#group-windows

Where each generated SQL query looks something like

SELECT measurement, `tag_AppId`(tagset), P99(`field_Adds`(fieldset)),
TUMBLE_START(rowtime, INTERVAL '10' MINUTE) FROM LINES GROUP BY
measurement, `tag_AppId`(tagset), TUMBLE(rowtime, INTERVAL '10' MINUTE)

We are also using a UDFAGG function in some of the queries which I think
might be cleaned up and optimized a bit (using scala types and possibly not
well implemented)

We then turn the result table back into a datastream using toAppendStream,
and eventually add a derivative stream to a sink. We've configured
TimeCharacteristic to event-time processing.

In some streaming scenarios everything is working fine with a parallelism
of 1, but in others it appears that we can't keep up with the event source.

Then we are investigating how to enable parallelism specifically on the SQL
table query or aggregator.

Can anyone suggest a good way to go about this? It wasn't clear from the
documentation.

Best,

Colin Williams

Parallelizing a tumbling group window

Reply via email to