[
https://issues.apache.org/jira/browse/FLINK-38817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18046113#comment-18046113
]
Zhu Zhu commented on FLINK-38817:
---------------------------------
Hi [~bvarghese], I'd like to understand how to reproduce the described problem.
A ForwardForUnspecifiedPartitioner will be converted into a rescale partitioner
only when its upstream and downstream operators are not chained.
However, for queries containing `ORDER BY ... LIMIT N`, the global operator
SortLimit and its downstream operators (e.g. Calc and Sink) should have the
same parallelism and be chained. This means they will form one vertex
`SortLimit->Calc->Sink` and the conversion to rescale will not happen.
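To make the rule above concrete, here is a toy model of it. This is only a sketch, not Flink's actual code: the `Partitioner` enum, the `Edge` class, the `chained` flag and `resolve_partitioner` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Partitioner(Enum):
    # Hypothetical labels standing in for Flink's partitioner types.
    FORWARD = "FORWARD"
    FORWARD_FOR_UNSPECIFIED = "FORWARD_FOR_UNSPECIFIED"
    RESCALE = "RESCALE"


@dataclass(frozen=True)
class Edge:
    upstream: str
    downstream: str
    partitioner: Partitioner
    chained: bool  # assumption: True if both operators end up in one job vertex


def resolve_partitioner(edge: Edge) -> Partitioner:
    """Only an unspecified-forward edge that is NOT chained becomes rescale."""
    if edge.partitioner is Partitioner.FORWARD_FOR_UNSPECIFIED and not edge.chained:
        return Partitioner.RESCALE
    # Chained edges, and edges with any other partitioner, are left unchanged.
    return edge.partitioner


# Edges inside the single vertex SortLimit->Calc->Sink are all chained.
edges = [
    Edge("SortLimit", "Calc", Partitioner.FORWARD_FOR_UNSPECIFIED, chained=True),
    Edge("Calc", "Sink", Partitioner.FORWARD_FOR_UNSPECIFIED, chained=True),
]
for e in edges:
    print(f"{e.upstream}->{e.downstream}: {resolve_partitioner(e).value}")
```

In this model neither edge is turned into a rescale edge, because both are chained. Setting `chained=False` on an edge is the only way the sketch produces `RESCALE`, which is the case I have not been able to reproduce.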
For example, I tried the query below:
```
SELECT t.id, t.name, t.age
FROM (VALUES (1, 'Tom', 20), (2, 'Dick', 25), (3, 'Harry', 40),
             (4, 'Ermintrude', 30)) t(id, name, age)
ORDER BY age LIMIT 3;
```
The result is correct and the job topology is as below:
!screenshot-1.png!
> Out of order data seen while running tpc-ds queries
> ---------------------------------------------------
>
> Key: FLINK-38817
> URL: https://issues.apache.org/jira/browse/FLINK-38817
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 2.2.0
> Reporter: Bonnie Varghese
> Priority: Major
> Attachments: screenshot-1.png
>
>
> All unspecified edges are converted to Rescale edges by default for dynamic
> graphs. Related Jira - https://issues.apache.org/jira/browse/FLINK-25046
> While testing TPC-DS queries I observed that the ordering produced by a
> global operation is not preserved downstream because of Rescale edges.
> For SQL batch to work correctly, we should keep Forward edges after a global
> operation such as `SortLimit` or `Sort` to ensure data correctness and
> avoid out-of-order data.
> I have put my observations and experiments in this doc:
> [https://docs.google.com/document/d/1TTj2ddlQTfDgtGb0ISmiKWt6R9U4RxJ59o6bULC1YtI/edit?usp=sharing]
>