Re: [SQL] Reject unsupported inputs to Joins

2018-02-09 Thread Robert Bradshaw
Huge +1 to only allowing single-firing triggers through for sane
semantics in this case. Commented on the doc.

On Fri, Feb 9, 2018 at 5:08 PM, Anton Kedin wrote:
> Hi,
>
> If you're not using Beam SQL JOINs, you're not affected.
>
> In Short
>
> Beam SQL JOIN operation does not work well with multiple trigger firings.
>
> More Detail
>
> Beam SQL JOIN is a CoGBK under the hood. It joins available elements
> per-pane. This means that:
>
>    - in discarding mode we're joining only new elements which arrived since
>    last trigger firing, forgetting about any past elements;
>    - in accumulating mode in addition to joining new elements we're also
>    emitting join results we have already emitted when trigger fired last time;
>
> This behavior cannot be configured or handled in pure SQL.
>
> Even More Detail
>
> Here
>
> Proposal
>
> Short term, allow only trigger configuration which we can reason about,
> rejecting anything else.
>
> For example we know that non-global windows with default trigger with zero
> allowed lateness will fire once per window. In this case we will join all
> elements in the window once after watermark is past end of window.
>
> Long term, retractions will allow correct handling of this situation.
>
> Jira, Pull Request
>
> Regards,
> Anton
>


[SQL] Reject unsupported inputs to Joins

2018-02-09 Thread Anton Kedin
Hi,

If you're not using Beam SQL JOINs, you're not affected.

In Short

The Beam SQL JOIN operation does not work well with multiple trigger firings.

More Detail

Beam SQL JOIN is a CoGBK under the hood. It joins available elements
per-pane. This means that:

   - in discarding mode, we join only the new elements that arrived since
   the last trigger firing, forgetting about any past elements;
   - in accumulating mode, in addition to joining new elements, we also
   re-emit the join results we already emitted when the trigger last fired;

This behavior cannot be configured or handled in pure SQL.
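
To make the two cases above concrete, here is a rough sketch in plain Python. It does not use Beam at all: `join_pane` and `simulate` are hypothetical helpers that only mimic a per-pane inner join, and the element values are made up for illustration.

```python
from itertools import product


def join_pane(left, right):
    """Inner-join two lists of (key, value) pairs on key, one pane at a time."""
    return sorted(
        (k1, v1, v2)
        for (k1, v1), (k2, v2) in product(left, right)
        if k1 == k2
    )


def simulate(firings, accumulating):
    """Return the join output emitted at each trigger firing.

    firings: list of (new_left, new_right) batches, one per firing.
    accumulating: if True, each pane contains everything seen so far;
    if False (discarding), each pane contains only the new elements.
    """
    seen_left, seen_right = [], []
    outputs = []
    for new_left, new_right in firings:
        if accumulating:
            seen_left = seen_left + new_left
            seen_right = seen_right + new_right
            pane_left, pane_right = seen_left, seen_right
        else:
            pane_left, pane_right = new_left, new_right
        outputs.append(join_pane(pane_left, pane_right))
    return outputs


# Three firings for the same key: a1 arrives, then b1, then a2.
firings = [
    ([("k", "a1")], []),
    ([], [("k", "b1")]),
    ([("k", "a2")], []),
]

discarding = simulate(firings, accumulating=False)
accumulating = simulate(firings, accumulating=True)

print("discarding:  ", discarding)
print("accumulating:", accumulating)
```

In this toy run the discarding case never emits the (a1, b1) match, because the two elements arrive in different panes, while the accumulating case emits (a1, b1) at the second firing and then again at the third.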

Even More Detail

Here


Proposal

Short term, allow only trigger configurations that we can reason about,
rejecting anything else.

For example, we know that non-global windows with the default trigger and
zero allowed lateness will fire once per window. In this case we will join
all elements in the window once, after the watermark passes the end of the
window.
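
As a sketch of what such a check could look like, again in plain Python rather than the actual Beam code: `WindowingConfig`, `fires_once_per_window` and `validate_join_input` are hypothetical names, and the only configuration accepted is the one example given above. The real implementation may well accept more cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowingConfig:
    """Hypothetical summary of a join input's windowing strategy."""
    is_global_window: bool
    trigger: str                   # e.g. "default", "repeatedly", ...
    allowed_lateness_seconds: int


def fires_once_per_window(cfg):
    """True only for the one configuration known to fire exactly once:
    non-global windows, default trigger, zero allowed lateness."""
    return (
        not cfg.is_global_window
        and cfg.trigger == "default"
        and cfg.allowed_lateness_seconds == 0
    )


def validate_join_input(cfg):
    """Reject any input we cannot reason about instead of joining it."""
    if not fires_once_per_window(cfg):
        raise ValueError(
            "JOIN input rejected: need non-global windows with the default "
            f"trigger and zero allowed lateness, got {cfg}"
        )


supported = WindowingConfig(False, "default", 0)
late_data = WindowingConfig(False, "default", 60)
global_window = WindowingConfig(True, "default", 0)

validate_join_input(supported)  # passes silently

rejected = []
for cfg in (late_data, global_window):
    try:
        validate_join_input(cfg)
    except ValueError:
        rejected.append(cfg)
```

Failing fast here means a pipeline with an unsupported trigger is refused up front rather than silently producing missing or duplicated join results.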

Long term, retractions will allow correct handling of this situation.

Jira, Pull Request


Regards,
Anton