[
https://issues.apache.org/jira/browse/CALCITE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734670#comment-16734670
]
Julian Hyde commented on CALCITE-2648:
--------------------------------------
Don't set the distribution trait. It only relates to distributed execution
frameworks (e.g. Hadoop and Spark) where there are multiple instances of each
operator, each processing one slice of the input.
I like the idea of exploiting order. Either use the fact that the input is already
sorted, or add a Sort. (In Volcano it amounts to the same thing: you ask for a
RelSubset with the desired sort order, and it may or may not have higher cost
than the current best.)
How about making a sub-class of EnumerableWindow that exploits the order of the
input? The code generated would be significantly different, because there is no
need to buffer rows. And the cost function would be significantly different.
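To illustrate the idea, here is a minimal sketch of a window operator that relies on its input already being sorted on the partition key. It is not Calcite code: the class SortedWindowSketch and the method countOverPartition are hypothetical names, and it handles only COUNT(*) OVER (PARTITION BY x) on a single integer column. For this unbounded frame it still holds the rows of the current partition, but never the whole input, and it emits rows in input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; not Calcite's EnumerableWindow. */
public class SortedWindowSketch {

  /**
   * Computes COUNT(*) OVER (PARTITION BY x) for input that is
   * already sorted on x. Each output row is {x, count}.
   */
  public static List<int[]> countOverPartition(Iterator<Integer> sortedInput) {
    List<int[]> output = new ArrayList<>();
    // Holds the rows of the current partition only, not the whole input.
    List<Integer> partition = new ArrayList<>();
    while (sortedInput.hasNext()) {
      int x = sortedInput.next();
      // Because the input is sorted, a key change ends the partition.
      if (!partition.isEmpty() && partition.get(0) != x) {
        flush(partition, output);
      }
      partition.add(x);
    }
    flush(partition, output);
    return output;
  }

  /** Emits the buffered partition in input order, then clears it. */
  private static void flush(List<Integer> partition, List<int[]> output) {
    int count = partition.size();
    for (int x : partition) {
      output.add(new int[] {x, count});
    }
    partition.clear();
  }

  public static void main(String[] args) {
    Iterator<Integer> input = Arrays.asList(20, 20, 35).iterator();
    for (int[] row : countOverPartition(input)) {
      System.out.println(row[0] + " | " + row[1]);
    }
  }
}
```

Since the output follows the input order, an operator of this shape could legitimately claim to preserve the input collation, and its cost would grow with the largest partition rather than with the full input.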
If there are multiple windows, one which requires ORDER BY x and another that
requires ORDER BY y, then the typical plan would be scan → sort →
window → sort → window. (In CALCITE-2764, [~vlsi] and I discussed
relational expressions that are sorted by X and Y at the same time, but I
maintain that this only occurs in trivial relations, e.g. VALUES with 1 record,
and therefore is not useful in practice.)
> Output collation of EnumerableWindow is not consistent with its implementation
> ------------------------------------------------------------------------------
>
> Key: CALCITE-2648
> URL: https://issues.apache.org/jira/browse/CALCITE-2648
> Project: Calcite
> Issue Type: Bug
> Affects Versions: 1.17.0
> Reporter: Hongze Zhang
> Assignee: Julian Hyde
> Priority: Major
>
> Here is a case:
> {code:sql}
> select x, COUNT(*) OVER (PARTITION BY x) from (values (20), (35)) as t(x)
> ORDER BY x
> {code}
> Final plan:
> {code:java}
> EnumerableWindow(window#0=[window(partition {0} order by [] range between
> UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT()])])
> EnumerableValues(tuples=[[{ 20 }, { 35 }]])
> {code}
> Output rows:
> {code:java}
> X |EXPR$1 |
> ---|-------|
> 35 |1 |
> 20 |1 |
> {code}
> EnumerableWindow is supposed to preserve input collations, as a result
> EnumerableSort is ignored. However the implementation of EnumerableWindow
> generates non-ordered output (when PARTITION BY clause is used).