[
https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15221567#comment-15221567
]
Dawid Wysakowicz commented on FLINK-2946:
-----------------------------------------
I still have some problems with range partitioning and parallelism.
* First of all, the {{org.apache.flink.api.java.DataSet}} that I get from
{{translateToPlan}} does not have a {{getParallelism}} method. But that's a
minor issue.
* I am not sure how to determine the final parallelism of the input, or
whether I need to do this at all. Let's take this as an example:
{code}
val env = ExecutionEnvironment.getExecutionEnvironment
env.setParallelism(1)
val t = env.fromElements(
    (1, 3, "Third"), (1, 2, "Fourth"), (1, 4, "Second"),
    (2, 1, "Sixth"), (1, 5, "First"), (1, 1, "Fifth"))
  .setParallelism(4)
  .toTable
  .orderBy('_1.asc, '_2.desc)
{code}
The resulting plan then looks like this (the number in brackets is the
parallelism of each operator): DataSource(4) -> MapOperator(-1) -> here I must
apply either a SortOperator alone, or a PartitionOperator followed by a
SortOperator.
Which operator's parallelism should I use to decide whether the
PartitionOperator should be applied?
What should the parallelism of the PartitionOperator be? (By default it is the
one from the ExecutionEnvironment.)
I hope I have stated my problems clearly.
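To make the question concrete, here is a minimal sketch in plain Scala of one way the decision could be framed. It is not Flink code: {{ParallelismSketch}}, {{effectiveParallelism}} and {{needsRangePartition}} are hypothetical names, and the rule that an operator with parallelism -1 falls back to the ExecutionEnvironment default is an assumption that would need to be checked against the optimizer.

```scala
// Hypothetical helper, NOT part of Flink. It only models the question above.
object ParallelismSketch {
  // Assumption: -1 means "not set explicitly, use the environment default".
  val Unset: Int = -1

  // Resolve the parallelism an operator would run with, assuming an unset
  // value falls back to the ExecutionEnvironment's parallelism.
  def effectiveParallelism(operatorParallelism: Int, envParallelism: Int): Int =
    if (operatorParallelism > 0) operatorParallelism else envParallelism

  // A range partition before the sort is only useful when the sort runs on
  // more than one parallel instance; with a single instance a plain sort
  // already yields a total order.
  def needsRangePartition(sortParallelism: Int): Boolean =
    sortParallelism > 1
}

// The plan from the example: DataSource(4) -> MapOperator(-1), env parallelism 1.
val envParallelism = 1
val mapParallelism =
  ParallelismSketch.effectiveParallelism(ParallelismSketch.Unset, envParallelism)
val partitionFirst = ParallelismSketch.needsRangePartition(mapParallelism)
```

Under these assumptions the MapOperator resolves to parallelism 1, so the sketch would skip the PartitionOperator for this example even though the DataSource runs with parallelism 4. Whether the decision should instead follow the source's parallelism is exactly the open question.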
> Add orderBy() to Table API
> --------------------------
>
> Key: FLINK-2946
> URL: https://issues.apache.org/jira/browse/FLINK-2946
> Project: Flink
> Issue Type: New Feature
> Components: Table API
> Reporter: Timo Walther
> Assignee: Dawid Wysakowicz
>
> In order to implement a FLINK-2099 prototype that uses the Table API's code
> generation facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a
> sorting feature are very welcome. Is there a more efficient way than
> {{.sortPartition(...).setParallelism(1)}}? Is it better to sort locally on the
> nodes first and then do a final sort on one node afterwards?