Hi everyone, I'm using a Kafka source with a lot of watermark skew (new partitions were added to the topic over time). The sink is a FileIO.Write().withNumShards(1), to get roughly one file per day, plus an early trigger so that each file holds at most 40,000 records. Unfortunately, it looks like a single shard is writing the files for all of the various days sequentially, rather than writing multiple days' files in parallel (I'm checking this via the Flink UI). Is there anything I can do to parallelize the writes? All of this is on the Flink runner.
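For reference, here's a sketch of what I believe the setup described above looks like (topic names, paths, and deserializers are placeholders, not the actual code; the element type is assumed to be String for illustration):

```java
// Sketch of the described pipeline. All names/paths are placeholders.
// withNumShards(1) funnels every pane's writes through a single shard,
// which can serialize file writes across windows/days.
pipeline
    .apply(KafkaIO.<String, String>read()
        .withBootstrapServers("kafka:9092")                 // placeholder
        .withTopic("events")                                // placeholder
        .withKeyDeserializer(StringDeserializer.class)
        .withValueDeserializer(StringDeserializer.class)
        .withoutMetadata())
    .apply(Values.create())
    // Daily windows, with an early firing every 40,000 elements so each
    // pane (and hence each file) holds at most ~40,000 records.
    .apply(Window.<String>into(FixedWindows.of(Duration.standardDays(1)))
        .triggering(AfterWatermark.pastEndOfWindow()
            .withEarlyFirings(AfterPane.elementCountAtLeast(40_000)))
        .discardingFiredPanes()
        .withAllowedLateness(Duration.ZERO))
    .apply(FileIO.<String>write()
        .via(TextIO.sink())
        .to("gs://my-bucket/output")                        // placeholder
        .withNumShards(1));                                 // ~1 file per pane
```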
Mike
