[jira] [Commented] (BEAM-1316) DoFn#startBundle and #finishBundle should not be able to output
[ https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840324#comment-15840324 ]

Kenneth Knowles commented on BEAM-1316:
---

That is a good example that breaks my false dichotomy. I think between BEAM-1283 and BEAM-1287 there can be a coherent solution. But even with all the planned changes if you want decently windowed output you'll end up tracking it yourself, unless we introduce per-window finishBundle/flush (which was frowned upon some time ago, but maybe makes sense here).

> DoFn#startBundle and #finishBundle should not be able to output
> ---
>
> Key: BEAM-1316
> URL: https://issues.apache.org/jira/browse/BEAM-1316
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Thomas Groh
>
> While within startBundle and finishBundle, the window in which elements are
> output is not generally defined. Elements must always be output from within a
> windowed context, or the {{WindowFn}} used by the {{PCollection}} may not
> operate appropriately.
> startBundle and finishBundle are suitable for operational duties, similarly
> to {{setup}} and {{teardown}}, but within the scope of some collection of
> input elements. This includes actions such as clearing field state within a
> DoFn and ensuring all live RPCs complete successfully before committing
> inputs.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
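To make "tracking it yourself" concrete, here is a minimal plain-Java sketch of per-window bookkeeping inside a bundle. It is a simulation only and does not use the Beam API: the class names, the method signatures, and the use of a plain String as a window identifier are all assumptions made for illustration. The idea is that each buffered value remembers the window it arrived in, so that the flush at the end of the bundle can emit one result per window instead of emitting into an undefined window.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java simulation of per-window tracking; NOT the Beam DoFn API. */
class PerWindowBufferSketch {

  /** Hypothetical output record: an element count tagged with its window. */
  static class WindowedCount {
    final String window; // assumption: a window is identified by a String
    final int count;

    WindowedCount(String window, int count) {
      this.window = window;
      this.count = count;
    }

    @Override
    public String toString() {
      return window + "=" + count;
    }
  }

  /** Hypothetical stand-in for a DoFn that counts elements per window. */
  static class PerWindowCountingFn {
    // Insertion-ordered so the flush order is deterministic.
    private final Map<String, Integer> counts = new LinkedHashMap<>();
    // Stands in for the output collection.
    final List<WindowedCount> emitted = new ArrayList<>();

    void startBundle() {
      counts.clear();
    }

    // The window is only known here, so it has to be recorded here.
    void processElement(String element, String window) {
      counts.merge(window, 1, Integer::sum);
    }

    // Emit one result per window seen in this bundle, each tagged explicitly.
    void finishBundle() {
      for (Map.Entry<String, Integer> e : counts.entrySet()) {
        emitted.add(new WindowedCount(e.getKey(), e.getValue()));
      }
      counts.clear();
    }
  }

  /** Runs one bundle of (element, window) pairs through the lifecycle. */
  static List<WindowedCount> runBundle(String[][] elementsWithWindows) {
    PerWindowCountingFn fn = new PerWindowCountingFn();
    fn.startBundle();
    for (String[] pair : elementsWithWindows) {
      fn.processElement(pair[0], pair[1]);
    }
    fn.finishBundle();
    return fn.emitted;
  }

  public static void main(String[] args) {
    String[][] bundle = {{"a", "w1"}, {"b", "w2"}, {"c", "w1"}};
    System.out.println(runBundle(bundle));
  }
}
```

A per-window finishBundle/flush, as floated in the comment, would move this bookkeeping out of user code.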
[jira] [Commented] (BEAM-1316) DoFn#startBundle and #finishBundle should not be able to output
[ https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840014#comment-15840014 ]

Daniel Halperin commented on BEAM-1316:
---

What if my output includes a list of filenames paired with file sizes, or element counts -- aka, information that may only be known after I flush to external systems?
[jira] [Commented] (BEAM-1316) DoFn#startBundle and #finishBundle should not be able to output
[ https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840002#comment-15840002 ]

Kenneth Knowles commented on BEAM-1316:
---

I suggest very explicitly distinguishing flushing to external systems from flushing output to a PCollection. FinishBundle works for the former but for the latter requires significant contortions to be correct and even then won't do what you want when there are many small bundles.
[jira] [Commented] (BEAM-1316) DoFn#startBundle and #finishBundle should not be able to output
[ https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838870#comment-15838870 ]

Daniel Halperin commented on BEAM-1316:
---

I think one may need to output in finishBundle when using the current "buffer, and flush half-full if this is the end of the bundle" pattern.
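For readers unfamiliar with the "buffer, and flush half-full if this is the end of the bundle" pattern, the following is a minimal plain-Java sketch of it. It simulates the bundle lifecycle by hand and does not use the Beam API: the class `BatchingFn`, its method signatures, and the `emitted` list standing in for an output collection are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java simulation of the bundle lifecycle; NOT the Beam DoFn API. */
class BufferAndFlushSketch {

  /** Hypothetical stand-in for a DoFn that groups elements into batches. */
  static class BatchingFn {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    // Each emitted batch is recorded here in place of a real output collection.
    final List<List<String>> emitted = new ArrayList<>();

    BatchingFn(int batchSize) {
      this.batchSize = batchSize;
    }

    // Clear any field state left over from a previous bundle.
    void startBundle() {
      buffer.clear();
    }

    // Buffer the element and flush as soon as a full batch is available.
    void processElement(String element) {
      buffer.add(element);
      if (buffer.size() >= batchSize) {
        flush();
      }
    }

    // End of bundle: flush whatever is left, even if the batch is only partly full.
    void finishBundle() {
      if (!buffer.isEmpty()) {
        flush();
      }
    }

    private void flush() {
      emitted.add(new ArrayList<>(buffer));
      buffer.clear();
    }
  }

  /** Runs one bundle through the lifecycle and returns the batches emitted. */
  static List<List<String>> runBundle(int batchSize, List<String> bundle) {
    BatchingFn fn = new BatchingFn(batchSize);
    fn.startBundle();
    for (String element : bundle) {
      fn.processElement(element);
    }
    fn.finishBundle();
    return fn.emitted;
  }

  public static void main(String[] args) {
    // prints [[a, b], [c, d], [e]]
    System.out.println(runBundle(2, Arrays.asList("a", "b", "c", "d", "e")));
  }
}
```

The trailing partial batch `[e]` is the output that only finishBundle can produce. In this simulation it carries no window at all, which is the gap the issue description points at.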
[jira] [Commented] (BEAM-1316) DoFn#startBundle and #finishBundle should not be able to output
[ https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838761#comment-15838761 ]

Thomas Groh commented on BEAM-1316:
---

Forbidding output from startBundle and finishBundle brings the contexts received by them in line with setup and teardown.