[ 
https://issues.apache.org/jira/browse/FLINK-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17319034#comment-17319034
 ] 

Jark Wu commented on FLINK-22201:
---------------------------------

I think this behavior is expected, because the total transaction amount keeps 
increasing. The credit and debit updates are independent records, so for each 
transaction the streaming output first either adds or subtracts the amount 
before the matching update arrives. Once the input stream stops, the final 
result should be 0. 
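
To make the reasoning concrete, here is a minimal sketch in Python. It is a
toy model, not Flink's actual runtime: the function name `streaming_totals`,
the `credit_first` flag and the sample transactions are all assumptions made
for illustration. Each transaction is split into two independent updates (one
to `credits`, one to `debits`), and `total` is re-evaluated after every
single update, so it is only guaranteed to be 0 once both halves of every
transaction have been applied.

```python
def streaming_totals(transactions, credit_first=True):
    """Toy model: value of `total` after each independent update.

    transactions: iterable of (from_account, to_account, amount).
    credit_first: hypothetical knob choosing which half arrives first.
    """
    credits = {}  # account -> sum of incoming amounts (view `credits`)
    debits = {}   # account -> sum of outgoing amounts (view `debits`)
    outputs = []

    def total():
        # Inner join on account: only accounts present in both views count.
        joined = credits.keys() & debits.keys()
        return sum(credits[a] - debits[a] for a in joined)

    for from_account, to_account, amount in transactions:
        updates = [(credits, to_account), (debits, from_account)]
        if not credit_first:
            updates.reverse()
        for view, account in updates:
            view[account] = view.get(account, 0) + amount
            outputs.append(total())  # emitted before the other half arrives
    return outputs


# Hypothetical input: money moves in a cycle, so every account ends up
# with both credits and debits and survives the inner join.
sample = [(0, 1, 10), (1, 2, 5), (2, 0, 7)]
outputs = streaming_totals(sample)
print(outputs)
```

For this sample the model emits 0, 0, 0, 5, 2, 0: the intermediate values are
non-zero and only the last one, after all updates are applied, is back at 0.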

> Incorrect output for simple sql query
> -------------------------------------
>
>                 Key: FLINK-22201
>                 URL: https://issues.apache.org/jira/browse/FLINK-22201
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.12.2
>         Environment: {code:bash}
> [nix-shell:~/streaming-consistency/flink]$ java -version
> openjdk version "1.8.0_265"
> OpenJDK Runtime Environment (build 1.8.0_265-ga)
> OpenJDK 64-Bit Server VM (build 25.265-bga, mixed mode)
> [nix-shell:~/streaming-consistency/flink]$ flink --version
> Version: 1.12.2, Commit ID: 4dedee0
> [nix-shell:~/streaming-consistency/flink]$ nix-info
> system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.3.10, 
> channels(jamie): "", channels(root): "nixos-20.09.3554.f8929dce13e", nixpkgs: 
> /nix/var/nix/profiles/per-user/root/channels/nixos
> {code}
>            Reporter: Jamie Brandon
>            Priority: Major
>         Attachments: config.toml, flink-total-timeseries.png
>
>
> I'm running this simple query:
> {code:sql}
> CREATE VIEW credits AS
> SELECT
>     to_account AS account, 
>     sum(amount) AS credits
> FROM
>     transactions
> GROUP BY
>     to_account;
> CREATE VIEW debits AS
> SELECT
>     from_account AS account, 
>     sum(amount) AS debits
> FROM
>     transactions
> GROUP BY
>     from_account;
> CREATE VIEW balance AS
> SELECT
>     credits.account AS account, 
>     credits - debits AS balance
> FROM
>     credits,
>     debits
> WHERE
>     credits.account = debits.account;
> CREATE VIEW total AS
> SELECT
>     sum(balance)
> FROM
>     balance;
> {code}
> The `total` view is a sanity check - its value should always be 0 because 
> money is only moved from one account to another, never created or destroyed.
> In streaming mode (code 
> [here|https://github.com/jamii/streaming-consistency/tree/a0f3b9d7ba178a7e184e6cb60e597a302dc3dd86/flink-table])
>  only ~0.04% of the output values are 0. The absolute error in the 
> outputs increases roughly linearly with the number of input transactions. 
> But after the inputs are finished it does return to 0.
> In batch mode (code 
> [here|https://github.com/jamii/streaming-consistency/tree/d3288e27649174c7463829c726be514610bbd056/flink])
>  it produces 0 for a while but then has large jumps to incorrect outputs and 
> never returns to 0. In this run, the first ~44% of the outputs are correct 
> but the final answer is -48811 which amounts to miscounting ~5% of the inputs.
> I also ran a variant of that query which joins on event time. In streaming 
> mode it produces similar results to the original. In batch mode only 2 out of 
> 1718375 outputs were correct and the final error was similar to the original 
> query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
