[ 
https://issues.apache.org/jira/browse/FLINK-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319020#comment-17319020
 ] 

Kurt Young commented on FLINK-22201:
------------------------------------

[~jamii] You didn't actually enable batch execution mode, because the table 
environment is hard-coded to streaming mode in 

[https://github.com/jamii/streaming-consistency/blob/d3288e27649174c7463829c726be514610bbd056/flink/src/main/java/Demo.java#L22]
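
For reference, a minimal sketch of how batch mode is usually selected through the Table API in Flink 1.12. This is not the code from the linked Demo.java; the class name BatchModeSketch is hypothetical, and the flink-table dependencies are assumed to be on the classpath.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class BatchModeSketch {
    public static void main(String[] args) {
        // Select batch execution mode explicitly instead of streaming mode.
        EnvironmentSettings settings = EnvironmentSettings.newInstance()
                .inBatchMode()
                .build();

        // The unified TableEnvironment accepts batch-mode settings.
        TableEnvironment tEnv = TableEnvironment.create(settings);

        // The transactions table and the views from the issue would be
        // registered on tEnv here (omitted in this sketch).
    }
}
```

The mode is fixed when the environment is created, so a hard-coded inStreamingMode() call at that point keeps the job in streaming mode regardless of other configuration.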

> Incorrect output for simple sql query
> -------------------------------------
>
>                 Key: FLINK-22201
>                 URL: https://issues.apache.org/jira/browse/FLINK-22201
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.12.2
>         Environment: {code:bash}
> [nix-shell:~/streaming-consistency/flink]$ java -version
> openjdk version "1.8.0_265"
> OpenJDK Runtime Environment (build 1.8.0_265-ga)
> OpenJDK 64-Bit Server VM (build 25.265-bga, mixed mode)
> [nix-shell:~/streaming-consistency/flink]$ flink --version
> Version: 1.12.2, Commit ID: 4dedee0
> [nix-shell:~/streaming-consistency/flink]$ nix-info
> system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.3.10, 
> channels(jamie): "", channels(root): "nixos-20.09.3554.f8929dce13e", nixpkgs: 
> /nix/var/nix/profiles/per-user/root/channels/nixos
> {code}
>            Reporter: Jamie Brandon
>            Priority: Major
>
> I'm running this simple query:
> {code:sql}
> CREATE VIEW credits AS
> SELECT
>     to_account AS account, 
>     sum(amount) AS credits
> FROM
>     transactions
> GROUP BY
>     to_account;
> CREATE VIEW debits AS
> SELECT
>     from_account AS account, 
>     sum(amount) AS debits
> FROM
>     transactions
> GROUP BY
>     from_account;
> CREATE VIEW balance AS
> SELECT
>     credits.account AS account, 
>     credits - debits AS balance
> FROM
>     credits,
>     debits
> WHERE
>     credits.account = debits.account;
> CREATE VIEW total AS
> SELECT
>     sum(balance)
> FROM
>     balance;
> {code}
> The `total` view is a sanity check - its value should always be 0 because 
> money is only moved from one account to another, never created or destroyed.
> In streaming mode (code 
> [here|https://github.com/jamii/streaming-consistency/tree/a0f3b9d7ba178a7e184e6cb60e597a302dc3dd86/flink-table])
>  only ~0.04% of the output values are 0. The absolute error in the 
> outputs increases roughly linearly with respect to the number of input transactions. 
> But after the inputs are finished it does return to 0.
> In batch mode (code 
> [here|https://github.com/jamii/streaming-consistency/tree/d3288e27649174c7463829c726be514610bbd056/flink])
>  it produces 0 for a while but then has large jumps to incorrect outputs and 
> never returns to 0. In this run, the first ~44% of the outputs are correct 
> but the final answer is -48811 which amounts to miscounting ~5% of the inputs.
> I also ran a variant of that query which joins on event time. In streaming 
> mode it produces similar results to the original. In batch mode only 2 out of 
> 1718375 outputs were correct and the final error was similar to the original 
> query.
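
The sanity check in the quoted report can be illustrated outside Flink with a small in-memory sketch. It mirrors the credits, debits, balance and total views over a complete input. The class name BalanceCheck and the row layout {from_account, to_account, amount} are assumptions made for illustration, not taken from the linked repository. The result is 0 under the assumption that every account appears as both sender and receiver, since the inner join drops accounts that are missing from either view.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory sketch of the credits/debits/balance/total views (not Flink code). */
final class BalanceCheck {

    /** Each row is {from_account, to_account, amount} (hypothetical layout). */
    static long total(long[][] transactions) {
        Map<Long, Long> credits = new HashMap<>(); // sum(amount) GROUP BY to_account
        Map<Long, Long> debits = new HashMap<>();  // sum(amount) GROUP BY from_account
        for (long[] t : transactions) {
            debits.merge(t[0], t[2], Long::sum);
            credits.merge(t[1], t[2], Long::sum);
        }

        long total = 0;
        // Inner join on account: only accounts present in both views contribute.
        for (Map.Entry<Long, Long> c : credits.entrySet()) {
            Long d = debits.get(c.getKey());
            if (d != null) {
                total += c.getValue() - d; // credits - debits AS balance
            }
        }
        return total; // sum(balance)
    }

    public static void main(String[] args) {
        long[][] transactions = {
            {1, 2, 10},
            {2, 3, 4},
            {3, 1, 7},
        };
        System.out.println(total(transactions)); // prints 0
    }
}
```

Here accounts 1, 2 and 3 have balances of -3, 6 and -3, which sum to 0.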



--
This message was sent by Atlassian Jira
(v8.3.4#803005)