Hi Yegor,

If you want to use Flink's keyed windowing logic, then you need to insert a
keyBy/shuffle operation because Flink currently cannot simply use the
partitioning of the Kinesis shards. The reason is that Flink needs to group
the keys into the correct key groups in order to support rescaling of the
state.

What you can do, though, is to create a custom operator or use a flatMap to
build your own windowing operator. This operator could then use the
partitioning of the Kinesis shards by simply collecting the events until
either 30 seconds or 1000 events are observed.

Cheers,
Till

On Wed, Apr 28, 2021 at 11:12 AM Yegor Roganov <yegor....@gmail.com> wrote:

> Hello
>
> To learn Flink I'm trying to build a simple application where I want to
> save events coming from Kinesis to S3.
> I want to subscribe to each shard, and within each shard I want to batch
> for 30 seconds, or until 1000 events are observed. These batches should
> then be uploaded to S3.
> What I don't understand is how to key my source on shard id, and do it in
> a way that doesn't induce unnecessary shuffling.
> Is this possible with Flink?
>

Reply via email to