Hi, Marco ~ It seems what you need is a temporal table join on the SQL side: you can define two Flink tables for your PostgreSQL tables and join your Kafka stream against them [1][3].
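A minimal sketch of what that could look like, assuming made-up table names, columns, and connection settings (the option keys follow the 1.11 Kafka and JDBC connectors; some 1.10 equivalents are noted in the comments):

-- Kafka source table; the processing-time attribute is needed
-- for the temporal join below.
CREATE TABLE kafka_topic_one (
  user_id   BIGINT,
  amount    DOUBLE,
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'kafka',   -- 1.10: 'connector.type' = 'kafka'
  'topic' = 'topic-one',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'        -- 1.10: 'format.type' = 'json'
);

-- Dimension table backed by PostgreSQL; define one of these per
-- PostgreSQL table you need to join against.
CREATE TABLE pg_dim (
  id   BIGINT,
  name STRING
) WITH (
  'connector' = 'jdbc',    -- 1.10: 'connector.type' = 'jdbc'
  'url' = 'jdbc:postgresql://localhost:5432/mydb',  -- 1.10: 'connector.url'
  'table-name' = 'dim',    -- 1.10: 'connector.table'
  'username' = '...',
  'password' = '...',
  'lookup.cache.max-rows' = '1000',  -- 1.10: 'connector.lookup.cache.max-rows'
  'lookup.cache.ttl' = '10min'       -- 1.10: 'connector.lookup.cache.ttl'
);

-- Temporal table join: each Kafka record is enriched with the
-- PostgreSQL row as of the record's processing time.
SELECT k.user_id, k.amount, d.name
FROM kafka_topic_one AS k
JOIN pg_dim FOR SYSTEM_TIME AS OF k.proc_time AS d
  ON k.user_id = d.id;

With the lookup cache enabled, PostgreSQL is only queried on cache misses, so the dimension tables are loaded on demand rather than once per record.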
Flink 1.10 also supports this; there are some differences in the DDL compared to 1.11 [2].

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/jdbc.html#how-to-create-a-jdbc-table
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/table/connect.html#jdbc-connector
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/streaming/temporal_tables.html#temporal-table

Best,
Danny Chan

On Aug 5, 2020, 4:34 AM +0800, Marco Villalobos <mvillalo...@kineteque.com> wrote:
> Let's say that I have:
>
> SQL Query One from data in PostgreSQL (200K records),
> SQL Query Two from data in PostgreSQL (1000 records),
> and Kafka Topic One.
>
> Let's also say that the main data for this Flink job arrives in Kafka Topic One.
>
> If I need SQL Query One and SQL Query Two to run just once, when the job
> starts up, and afterwards maybe store their results in keyed state or
> broadcast state, but they are not really part of the stream, then what is
> the best practice for supporting that in Flink?
>
> The Flink job needs to stream data from Kafka Topic One, aggregate it, and
> perform computations that require all of the data in SQL Query One and SQL
> Query Two for its business logic.
>
> I am using Flink 1.10.
>
> Am I supposed to query the database before the job is submitted, and then
> pass the results on as parameters to a function?
> Or am I supposed to use JDBCInputFormat for both queries and create two
> streams, and somehow connect or broadcast both of them to the main stream
> that uses Kafka Topic One?
>
> I would appreciate guidance. Please. Thank you.
>
> Sincerely,
>
> Marco A. Villalobos