[
https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210118#comment-15210118
]
Fabian Hueske commented on FLINK-2946:
--------------------------------------
Hi [~dawidwys], thanks a lot for working on this issue!
I had a look at your branch. You're definitely on the right track. Here are a
few comments:
- The Table API syntax looks good
- In {{Table.orderBy()}} you should not extract aggregations, etc. Instead
check that the expressions match the following patterns ({{Table.as()}} does
similar checks):
-- {{UnresolvedFieldReference}}
-- {{Asc(UnresolvedFieldReference)}}
-- {{Desc(UnresolvedFieldReference)}}
-- We can add support for more complex expressions and order by position later.
- Add {{asc()}} to {{RexNodeTranslator}}
- I just realized that Flink's range partitioning lacks support for defining sort
orders for partition keys. We need to add this to make global sorting work
correctly. I opened FLINK-3665 to address this issue.
- We do not need to range partition if the parallelism of the input is 1 (check
{{inputDs.getParallelism() == 1}})
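To illustrate the pattern check and the parallelism check from the list above, here is a rough sketch. It is plain Java with hypothetical stand-in classes, not the real Table API types, and {{OrderBySketch}}, {{isSupported}}, {{validate}}, {{needsRangePartition}}, and the {{Sum}} placeholder are names I made up for this example:

```java
import java.util.List;

// Hypothetical stand-ins for the Table API expression classes; they only
// model the shape of the expression tree, not the real implementation.
interface Expression {}

final class UnresolvedFieldReference implements Expression {
    final String name;
    UnresolvedFieldReference(String name) { this.name = name; }
}

final class Asc implements Expression {
    final Expression child;
    Asc(Expression child) { this.child = child; }
}

final class Desc implements Expression {
    final Expression child;
    Desc(Expression child) { this.child = child; }
}

// Placeholder for any expression orderBy() should reject, e.g. an aggregation.
final class Sum implements Expression {
    final Expression child;
    Sum(Expression child) { this.child = child; }
}

final class OrderBySketch {
    private OrderBySketch() {}

    // Accepts exactly: field, Asc(field), Desc(field).
    static boolean isSupported(Expression e) {
        if (e instanceof UnresolvedFieldReference) {
            return true;
        }
        if (e instanceof Asc) {
            return ((Asc) e).child instanceof UnresolvedFieldReference;
        }
        if (e instanceof Desc) {
            return ((Desc) e).child instanceof UnresolvedFieldReference;
        }
        return false;
    }

    // Rejects the whole orderBy() call on the first unsupported expression.
    static void validate(List<Expression> exprs) {
        for (Expression e : exprs) {
            if (!isSupported(e)) {
                throw new IllegalArgumentException(
                    "orderBy() only supports field references, optionally wrapped in asc/desc");
            }
        }
    }

    // With a single input partition, a local sort is already a global sort.
    static boolean needsRangePartition(int inputParallelism) {
        return inputParallelism != 1;
    }
}
```

With these stand-ins, {{validate}} accepts {{Desc(UnresolvedFieldReference("a"))}} and throws for {{Asc(Sum(...))}}, which is the behavior I have in mind until we support more complex expressions.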
I'll be on vacation for about two weeks, so I'm not sure I can follow up on
this before I'm back.
> Add orderBy() to Table API
> --------------------------
>
> Key: FLINK-2946
> URL: https://issues.apache.org/jira/browse/FLINK-2946
> Project: Flink
> Issue Type: New Feature
> Components: Table API
> Reporter: Timo Walther
> Assignee: Dawid Wysakowicz
>
> In order to implement a FLINK-2099 prototype that uses the Table API's code
> generation facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a
> sorting feature are very welcome. Is there a more efficient way than
> {{.sortPartition(...).setParallelism(1)}}? Is it better to sort locally on the
> nodes first and then do a final sort on one node?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)