[ 
https://issues.apache.org/jira/browse/FLINK-22075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319078#comment-17319078
 ] 

lincoln lee commented on FLINK-22075:
-------------------------------------

The watermark is applied to both sides of the join here because it's a 
self-join (the two inputs have the same watermark). If you're interested in the 
details of the interval join, the source code of 
`org.apache.flink.table.runtime.operators.join.interval.TimeIntervalJoin` may be 
clearer than the existing docs (I can't find more besides the two links above).
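
To make the role of the shared watermark concrete, here is a rough toy model in 
Python. It is not Flink's `TimeIntervalJoin` and makes no claim about its exact 
behavior. The function name `interval_left_join`, the event tuples, and the 
`delay` parameter are all made up for illustration. It only assumes an 
append-only interval left join where one watermark drives both inputs and an 
unmatched left row is padded with null once the watermark passes its upper 
time bound.

```python
def interval_left_join(events, lower, upper, delay):
    """Toy append-only interval left join (hypothetical, not Flink code).

    events: list of (side, time, key) in arrival order, side is "L" or "R".
    A left row (t, key) matches a right row (s, key) if t - lower <= s <= t + upper.
    delay: assumed bounded out-of-orderness used to derive the watermark.
    """
    watermark = float("-inf")  # one watermark shared by both inputs (self-join)
    pending = []               # buffered left rows as [time, key, matched]
    rights = []                # buffered right rows (never cleaned up in this toy)
    out = []
    for side, t, key in events:
        if side == "L":
            matched = False
            for s, k in rights:
                if k == key and t - lower <= s <= t + upper:
                    out.append((t, key, s))
                    matched = True
            pending.append([t, key, matched])
        else:
            rights.append((t, key))
            for row in pending:
                if row[1] == key and row[0] - lower <= t <= row[0] + upper:
                    out.append((row[0], key, t))
                    row[2] = True
        # Advance the shared watermark, then flush left rows it has passed.
        watermark = max(watermark, t - delay)
        still_pending = []
        for row in pending:
            if row[0] + upper < watermark:
                if not row[2]:
                    # Append-only output: this null row is never retracted.
                    out.append((row[0], row[1], None))
            else:
                still_pending.append(row)
        pending = still_pending
    return out


in_order = [("L", 1, "a"), ("R", 1, "a"), ("L", 5, "b")]
late_right = [("L", 1, "a"), ("L", 5, "b"), ("R", 1, "a")]
print(interval_left_join(in_order, 0, 0, 2))    # [(1, 'a', 1)]
print(interval_left_join(late_right, 0, 0, 2))  # [(1, 'a', None)]
```

In the second call the right row arrives after the watermark has already 
passed the left row, so in this model the null-padded row stays in the output.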

btw, it would be better and easier to use if more detailed documentation about 
the interval join were added to the website.

> Incorrect null outputs in left join
> -----------------------------------
>
>                 Key: FLINK-22075
>                 URL: https://issues.apache.org/jira/browse/FLINK-22075
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.12.2
>         Environment: 
> https://github.com/jamii/streaming-consistency/blob/4e5d144dacf85e512bdc7afd77d031b5974d733e/pkgs.nix#L25-L46
> ```
> [nix-shell:~/streaming-consistency/flink]$ java -version
> openjdk version "1.8.0_265"
> OpenJDK Runtime Environment (build 1.8.0_265-ga)
> OpenJDK 64-Bit Server VM (build 25.265-bga, mixed mode)
> [nix-shell:~/streaming-consistency/flink]$ flink --version
> Version: 1.12.2, Commit ID: 4dedee0
> [nix-shell:~/streaming-consistency/flink]$ nix-info
> system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.3.10, 
> channels(jamie): "", channels(root): "nixos-20.09.3554.f8929dce13e", nixpkgs: 
> /nix/var/nix/profiles/per-user/root/channels/nixos
> ```
>            Reporter: Jamie Brandon
>            Assignee: lincoln lee
>            Priority: Critical
>             Fix For: 1.13.0
>
>
> I'm left joining a table with itself 
> [here](https://github.com/jamii/streaming-consistency/blob/4e5d144dacf85e512bdc7afd77d031b5974d733e/flink/src/main/java/Demo.java#L55-L66).
>  The output should have no nulls, or at least emit nulls and then retract 
> them. Instead I see:
> ```
> jamie@machine:~/streaming-consistency/flink$ wc -l tmp/outer_join_with_time
> 100000 tmp/outer_join_with_time
> jamie@machine:~/streaming-consistency/flink$ grep -c insert tmp/outer_join_with_time
> 100000
> jamie@machine:~/streaming-consistency/flink$ grep -c 'null' tmp/outer_join_with_time
> 16943
> ```
> ~17% of the outputs are incorrect and never retracted.
> [Full output](https://gist.githubusercontent.com/jamii/983fee41609b1425fe7fa59d3249b249/raw/069b9dcd4faf9f6113114381bc7028c6642ca787/gistfile1.txt)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
