[email protected]

The GroupByKey/CoGroupByKey is where all windowed data is buffered until a trigger fires. The windowing strategy is:

Window<KV<String, String>> window =
    Window.<KV<String, String>>into(FixedWindows.of(Duration.standardSeconds(10)))
        .withAllowedLateness(Duration.standardSeconds(5))
        .accumulatingFiredPanes()
        .triggering(Never.ever());
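(Editor's note: to make the accumulation-mode semantics concrete, here is a plain-Java sketch with no Beam dependency. The class, method names, and the fire-every-N-elements "trigger" are all hypothetical simplifications, not the Beam API; the point is only that an accumulating pane replays everything seen so far, so the runner must retain it all in state, while a discarding pane lets state be released after each firing.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-model of pane accumulation modes (not the Beam API).
// ACCUMULATING: each fired pane contains every element seen so far, so the
// buffer is never released until the window itself expires.
// DISCARDING: the buffer is cleared after each firing.
public class PaneModes {
    enum Mode { ACCUMULATING, DISCARDING }

    static List<List<Integer>> firePanes(List<Integer> elements, int fireEvery, Mode mode) {
        List<List<Integer>> panes = new ArrayList<>();
        List<Integer> buffered = new ArrayList<>();
        for (int i = 0; i < elements.size(); i++) {
            buffered.add(elements.get(i));
            if ((i + 1) % fireEvery == 0) {           // stand-in for a trigger firing
                panes.add(new ArrayList<>(buffered)); // emit current pane
                if (mode == Mode.DISCARDING) {
                    buffered.clear();                 // state released after firing
                }
                // in ACCUMULATING mode the buffer keeps growing
            }
        }
        return panes;
    }

    public static void main(String[] args) {
        List<Integer> in = List.of(1, 2, 3, 4);
        System.out.println(firePanes(in, 2, Mode.ACCUMULATING)); // [[1, 2], [1, 2, 3, 4]]
        System.out.println(firePanes(in, 2, Mode.DISCARDING));   // [[1, 2], [3, 4]]
    }
}
```

With Never.ever() as the trigger and accumulating panes, the buffered state can only be released when the window passes its allowed lateness, which is the first thing to check when savepoint state keeps growing.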
Is there an issue preventing proper GC?

On Tue, Dec 19, 2017 at 8:53 AM, Seth Albanese <[email protected]> wrote:

> I'm running Beam 2.2.0 on Flink 1.3 using KafkaIO. Reading from two
> topics, applying a fixed window, joining via a CoGroupByKey, and
> outputting to another topic.
>
> Example code that reproduces the issue can be seen here:
> https://gist.github.com/salbanese/c46df2718c09a897e04d498c3f59d9d7
>
> When I cancel the job with a savepoint via flink cancel -s
> /path/to/savepoint job-id, then restart the job from that savepoint via
> flink run -s /path/to/savepoint -c ..., each subsequent savepoint grows
> in size by about 30 percent, and eventually Flink starts timing out and
> fails to cancel the job. Eliminating the CoGroupByKey seems to stop this
> behavior, and savepoints stay a consistent size from one run to the next.
>
> I feel like I must be missing something. Any advice would be appreciated.
>
> Thanks
> -seth
