Hi again,

Maybe you can use *table.exec.sink.keyed-shuffle*
(https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-sink-keyed-shuffle)
and set it to *FORCE*, which will use the primary key column(s) to
partition and distribute the data.
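For example, a minimal sketch with the Java Table API (the "src"/"dst" tables
and the JDBC URL are just placeholders, not from your setup):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.TableEnvironment;

    public class KeyedShuffleExample {
      public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

        // Ask the planner to add a hash shuffle on the sink's primary key
        // columns, so rows with the same PK go to the same sink subtask.
        tEnv.getConfig().getConfiguration()
            .setString("table.exec.sink.keyed-shuffle", "FORCE");

        // Placeholder tables: the JDBC sink declares a PRIMARY KEY,
        // which is what the keyed shuffle is based on.
        tEnv.executeSql(
            "CREATE TABLE src (id BIGINT, payload STRING) "
                + "WITH ('connector' = 'datagen')");
        tEnv.executeSql(
            "CREATE TABLE dst (id BIGINT, payload STRING, "
                + "PRIMARY KEY (id) NOT ENFORCED) WITH ("
                + " 'connector' = 'jdbc',"
                + " 'url' = 'jdbc:postgresql://localhost:5432/mydb',"
                + " 'table-name' = 'dst')");

        tEnv.executeSql("INSERT INTO dst SELECT id, payload FROM src");
      }
    }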
On Fri, Apr 1, 2022 at 6:52 PM Marios Trivyzas <mat...@gmail.com> wrote:

> Hi!
>
> I don't think there is a way to achieve that without resorting to
> DataStream API.
> I don't know if using the PARTITIONED BY clause in the create statement of
> the table can help to "balance" the data, see
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#partitioned-by
> .
>
> On Thu, Mar 31, 2022 at 7:18 AM Yaroslav Tkachenko <yaros...@goldsky.io>
> wrote:
>
>> Hey everyone,
>>
>> I'm trying to use Flink SQL to construct a set of transformations for my
>> application. Let's say the topology just has three steps:
>>
>> - SQL Source
>> - SQL SELECT statement
>> - SQL Sink (via INSERT)
>>
>> The sink I'm using (JDBC) would really benefit from data partitioning (by
>> PK ID) to avoid conflicting transactions and deadlocks. I can force Flink
>> to partition the data by the PK ID before the INSERT by resorting to
>> DataStream API and leveraging the keyBy method, then transforming
>> DataStream back to the Table again...
>>
>> Is there a simpler way to do this? I understand that, for example, a
>> GROUP BY statement will probably perform similar data shuffling, but what
>> if I have a simple SELECT followed by INSERT?
>>
>> Thank you!
>
> --
> Marios

Best,
Marios
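P.S. For reference, a rough sketch of the DataStream keyBy round-trip
described in the quoted question, in case anyone needs the explicit
variant (assumes the Java bridge API; "src", "dst", and the "id" PK
column are placeholders):

    import org.apache.flink.api.common.typeinfo.Types;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class KeyByRoundTrip {
      public static void main(String[] args) {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // "src" and "dst" are assumed to be registered already,
        // with "dst" declaring PRIMARY KEY (id).
        Table selected = tEnv.sqlQuery("SELECT id, payload FROM src");

        // Round-trip through the DataStream API only to add a keyBy on the PK.
        DataStream<Row> keyed = tEnv.toDataStream(selected)
            .keyBy(row -> row.<Long>getFieldAs("id"), Types.LONG);

        tEnv.createTemporaryView("keyed_src", keyed);
        tEnv.executeSql("INSERT INTO dst SELECT * FROM keyed_src");
      }
    }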