[ https://issues.apache.org/jira/browse/FLINK-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318974#comment-17318974 ]
Kurt Young commented on FLINK-22201:
------------------------------------
[~jamii] Thanks for reporting this. Could you provide some example data to
help us find the bug? The query is so simple that I can't think of any
potential bug around it.
> Incorrect output for simple sql query
> -------------------------------------
>
> Key: FLINK-22201
> URL: https://issues.apache.org/jira/browse/FLINK-22201
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Affects Versions: 1.12.2
> Environment: {code:bash}
> [nix-shell:~/streaming-consistency/flink]$ java -version
> openjdk version "1.8.0_265"
> OpenJDK Runtime Environment (build 1.8.0_265-ga)
> OpenJDK 64-Bit Server VM (build 25.265-bga, mixed mode)
> [nix-shell:~/streaming-consistency/flink]$ flink --version
> Version: 1.12.2, Commit ID: 4dedee0
> [nix-shell:~/streaming-consistency/flink]$ nix-info
> system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.3.10,
> channels(jamie): "", channels(root): "nixos-20.09.3554.f8929dce13e", nixpkgs:
> /nix/var/nix/profiles/per-user/root/channels/nixos
> {code}
> Reporter: Jamie Brandon
> Priority: Major
>
> I'm running this simple query:
> {code:sql}
> CREATE VIEW credits AS
> SELECT
>   to_account AS account,
>   sum(amount) AS credits
> FROM
>   transactions
> GROUP BY
>   to_account;
>
> CREATE VIEW debits AS
> SELECT
>   from_account AS account,
>   sum(amount) AS debits
> FROM
>   transactions
> GROUP BY
>   from_account;
>
> CREATE VIEW balance AS
> SELECT
>   credits.account AS account,
>   credits - debits AS balance
> FROM
>   credits,
>   debits
> WHERE
>   credits.account = debits.account;
>
> CREATE VIEW total AS
> SELECT
>   sum(balance)
> FROM
>   balance;
> {code}
> The `total` view is a sanity check - its value should always be 0 because
> money is only moved from one account to another, never created or destroyed.
> In streaming mode (code
> [here|https://github.com/jamii/streaming-consistency/tree/a0f3b9d7ba178a7e184e6cb60e597a302dc3dd86/flink-table])
> only about 0.04% of the output values are 0. The absolute error in the
> outputs increases roughly linearly with the number of input transactions.
> After the inputs are finished, however, it does return to 0.
> In batch mode (code
> [here|https://github.com/jamii/streaming-consistency/tree/d3288e27649174c7463829c726be514610bbd056/flink])
> it produces 0 for a while but then has large jumps to incorrect outputs and
> never returns to 0. In this run, the first ~44% of the outputs are correct
> but the final answer is -48811 which amounts to miscounting ~5% of the inputs.
> I also ran a variant of that query which joins on event time. In streaming
> mode it produces similar results to the original. In batch mode only 2 out of
> 1718375 outputs were correct and the final error was similar to the original
> query.
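
For reference, the invariant behind the `total` view can be checked outside Flink with a few lines of plain Python. This is only a sketch of the expected arithmetic, not the reporter's code: the function name `total_balance`, the tuple layout `(from_account, to_account, amount)` and the sample rows are assumptions made for illustration. The sample data is chosen so that every account both sends and receives money, since the `balance` view is an inner join and drops accounts that appear on only one side.

```python
from collections import defaultdict


def total_balance(transactions):
    """Compute what the `total` view should return for the given rows.

    `transactions` is an iterable of (from_account, to_account, amount)
    tuples. This layout is an assumption for the sketch.
    """
    credits = defaultdict(int)  # mirrors the `credits` view
    debits = defaultdict(int)   # mirrors the `debits` view
    for from_account, to_account, amount in transactions:
        credits[to_account] += amount
        debits[from_account] += amount

    # Mirrors the `balance` view: inner join on account, so only accounts
    # present in both `credits` and `debits` are kept.
    balance = {
        account: credits[account] - debits[account]
        for account in credits.keys() & debits.keys()
    }

    # Mirrors the `total` view.
    return sum(balance.values())


# Hypothetical sample data: every account appears as sender and receiver.
sample_transactions = [
    (1, 2, 10),
    (2, 3, 5),
    (3, 1, 7),
    (1, 3, 2),
]

print(total_balance(sample_transactions))
```

Running the same input through the Flink query and comparing the final value against this reference would be one way to produce the example data requested above.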