[ https://issues.apache.org/jira/browse/BEAM-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17122964#comment-17122964 ]
Beam JIRA Bot commented on BEAM-6399:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.

> FileIO errors on unbounded input with nondefault trigger
> --------------------------------------------------------
>
>                 Key: BEAM-6399
>                 URL: https://issues.apache.org/jira/browse/BEAM-6399
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-files
>            Reporter: Jeff Klukas
>            Priority: P2
>              Labels: stale-P2
>
> {{In a pipeline with unbounded input, if a user defines a custom trigger and
> does not specify a specific non-zero withNumShards, they may see an
> IllegalArgumentException at runtime due to incompatible windows.}}
> For example, consider this compound trigger:
> {{Window.into(new GlobalWindows())}}
> {{ .triggering(Repeatedly.forever(AfterFirst.of(}}
> {{ AfterPane.elementCountAtLeast(10000),}}
> {{ AfterProcessingTime.pastFirstElementInPane()}}
> {{ .plusDelayOf(Duration.standardMinutes(10)))))}}
> {{ .discardingFiredPanes()}}
>
> Using that windowing without specifying sharding yields:
>
> {{Inputs to Flatten had incompatible
> triggers:}}{{Repeatedly.forever(AfterFirst.of(AfterPane.elementCountAtLeast(10000),
> AfterProcessingTime.pastFirstElementInPane().plusDelayOf(1
> minute))),}}{{Repeatedly.forever(AfterFirst.of(AfterPane.elementCountAtLeast(1),
> AfterSynchronizedProcessingTime.pastFirstElementInPane()))}}
>
> Without explicit sharding, WriteFiles creates both a sharded and unsharded
> collection; the first goes through one GroupByKey while the other goes
> through 2. These two collections are then flattened together and they have
> incompatible triggers due to the double-grouped collection using a
> continuation trigger.
>
> If the user instead specifies numShards, then a different code path is
> followed that avoids this incompatibility.
>
> It looks like WriteFiles may need to be implemented differently to avoid
> combining collections with potentially incompatible triggers.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)