lukecwik edited a comment on pull request #12603: URL: https://github.com/apache/beam/pull/12603#issuecomment-695156605
@iemejia I figured out the issue: watermark holds aren't implemented for Spark. When the first batch completes, new watermarks are computed, and the watermark hold set by the splittable DoFn implementation is ignored. This leads to timers being dropped, and hence only some of the results being produced. This is also likely why the PAssert is dropping the elements that were produced, but I haven't validated this yet.

Can you explain how the GlobalWatermarkHolder works? Can I register anything as a `sourceId`? Since watermark holds don't seem to be implemented, does the GroupAlsoViaWindowSet hold back the watermark for elements that it currently has buffered?
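
To make the suspected failure mode concrete, here is a toy model of the watermark-hold behavior described above. It is only a sketch and not code from Beam or the Spark runner: the class `WatermarkHoldSketch`, its methods, and the rule that a timer counts as late once the watermark has passed its timestamp are all hypothetical simplifications chosen for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical toy model; none of these names exist in Beam or the Spark runner.
public class WatermarkHoldSketch {

    // Watermark when holds are honored: it cannot advance past the earliest hold.
    public static long watermarkWithHolds(long inputWatermark, Collection<Long> holds) {
        long result = inputWatermark;
        for (long hold : holds) {
            result = Math.min(result, hold);
        }
        return result;
    }

    // Watermark when holds are ignored: it simply follows the input watermark.
    public static long watermarkIgnoringHolds(long inputWatermark, Collection<Long> holds) {
        return inputWatermark;
    }

    // Simplifying assumption: a timer is treated as late (and dropped)
    // once the watermark has moved past its timestamp.
    public static boolean timerIsLate(long timerTimestamp, long watermark) {
        return timerTimestamp < watermark;
    }

    public static void main(String[] args) {
        long inputWatermark = 100L;             // watermark computed after the first batch
        List<Long> holds = Arrays.asList(10L);  // hold set by the splittable DoFn
        long timer = 20L;                       // timer that should resume processing

        long held = watermarkWithHolds(inputWatermark, holds);
        long unheld = watermarkIgnoringHolds(inputWatermark, holds);

        System.out.println("with holds:     watermark=" + held
                + " timerLate=" + timerIsLate(timer, held));
        System.out.println("ignoring holds: watermark=" + unheld
                + " timerLate=" + timerIsLate(timer, unheld));
    }
}
```

In this model the timer survives only when the hold keeps the watermark behind it, which matches the symptom of only some of the results being produced.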
