Right. When ORDER BY is used with LIMIT in the subquery, we cannot drop the ORDER BY.
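
A minimal sketch of why the ORDER BY matters in that case. It uses Python's built-in sqlite3 module as a stand-in engine, not Calcite or Drill; the table t1 and column a1 follow Aman's example, and the helper name top_n_via_subquery is hypothetical.

```python
import sqlite3


def top_n_via_subquery(values, n):
    """Return the n smallest values using ORDER BY + LIMIT inside a subquery.

    Assumption: SQLite stands in for the real planner here; this only
    illustrates the SQL semantics, not how Calcite or Drill plan the query.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t1 (a1 INTEGER)")
        conn.executemany("INSERT INTO t1 (a1) VALUES (?)", [(v,) for v in values])
        # The ORDER BY decides WHICH rows survive the LIMIT. Without it,
        # the choice of rows kept by LIMIT is unspecified, so removing the
        # sort would change the meaning of the query.
        rows = conn.execute(
            "SELECT a1 FROM (SELECT a1 FROM t1 ORDER BY a1 LIMIT ?) AS s",
            (n,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

The outer query has no ORDER BY of its own, so only the set of returned rows is determined by the subquery's sort, not their final order.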

On Fri, Jul 31, 2015 at 9:22 AM, Aman Sinha wrote:

> Yes, in general collation is a better fit as a physical property rather
> than a logical property of a plan node. With regard to places where it
> makes sense to treat it as a logical property, I agree with the ORDER-BY
> comments, and these should be extended to window functions too:
>   SELECT b, RANK() OVER (ORDER BY b) FROM table;
> I would think the LogicalWindow should have collation on b.
>
> Jinfeng, the subquery's ORDER-BY can be dropped in some cases but not all.
> For instance, in the following query:
>   SELECT a1 FROM (SELECT a1 FROM t1 WHERE .... ORDER BY a1) LIMIT 10;
> the OB should not be dropped. There are other cases; this is one example.
>
> Aman
>
> On Fri, Jul 31, 2015 at 9:09 AM, Jinfeng Ni wrote:
>
> > I think it makes sense that LogicalAggregate does not have collation,
> > since a LogicalAggregate could be implemented with different physical
> > operators, either hash-based aggregation or sort-based aggregation. Only
> > when LogicalAggregate is converted into a physical aggregator does it
> > make sense to have collation, depending on which physical operator is
> > used.
> >
> > The same could be applied to LogicalJoin, which could be implemented
> > either as a hash join or as a sort-based join.
> >
> > At the logical level, the only collation will come from the top-level
> > ORDER BY clause. In that sense, I feel that the ORDER BY clause in a
> > SUBQUERY or VIEW probably should be removed in logical planning, since
> > semantically it does not impact the query result.
> >
> > SELECT S.C1, T2.C4
> > FROM (SELECT C1, C2, C3
> >       FROM T1 ORDER BY C1) AS S JOIN
> >      T2
> > ON S ...
> > ORDER BY T2.C4;
> >
> > In Drill, we separate logical planning from physical planning, where the
> > collation (together with the distribution trait) will matter in physical
> > planning.
> >
> > On Fri, Jul 31, 2015 at 7:27 AM, Milinda Pathirage wrote:
> >
> > > Thanks Julian for looking into this. Thanks Maryann for detecting the
> > > issue in the CALCITE-783 patch.
> > >
> > > As I understand it, we only need the input's (input to aggregate)
> > > order-related metadata at the level of the aggregate. I think I was
> > > wrong in saying in CALCITE-784 that LogicalAggregate discards collation
> > > metadata from its input, given that the input is accessible from
> > > LogicalAggregate. We will only need to do some calculations on the
> > > input's collation metadata (or something similar) if we need to infer
> > > something about LogicalAggregate to be used by operators which take the
> > > aggregate as an input.
> > >
> > > Thanks
> > > Milinda
> > >
> > > On Thu, Jul 30, 2015 at 11:32 PM, Maryann Xue wrote:
> > >
> > > > Thanks Julian for taking the time to sort out all these requirements
> > > > and rethink the model!
> > > > Thank you Milinda! I really appreciate your quick response to the
> > > > issue.
> > > >
> > > > On Thu, Jul 30, 2015 at 4:57 PM, Julian Hyde wrote:
> > > >
> > > >> There are a few issues in play regarding collations (783, 784, 793;
> > > >> see links below) and they seem to be overlapping. Maryann and Milinda
> > > >> have been at odds with each other (in the politest possible way!)
> > > >>
> > > >> The cause is that they are both doing very interesting new work using
> > > >> collation:
> > > >> * Maryann is optimizing Phoenix plans to use secondary indexes. These
> > > >>   are tables that are project-sort materializations of a base table,
> > > >>   itself sorted.
> > > >> * Milinda is planning Samza streaming-aggregation queries. A plan can
> > > >>   only be found if you know that the stream is sorted on one of the
> > > >>   aggregation keys, usually a time column.
> > > >>
> > > >> I spoke with Maryann about this today. I think that logical plans
> > > >> should not have a sort order:
> > > >> * In 783 and 784, I think I was wrong to allow logical RelNodes
> > > >>   (LogicalProject and LogicalAggregate) to have collations. Because
> > > >>   they are logical, they are inherently un-sorted. (But they may be
> > > >>   based on a table, say an ArrayTable, that does have a sort order.)
> > > >> * In 793, Maryann was right to say that we should not bake in the
> > > >>   collation that a plan *happens to have* when the SQL is first
> > > >>   translated, because trying to find a physical plan with the same
> > > >>   collation restricts our options.
> > > >>
> > > >> But SQL ASTs should have a sort order (if the top node is an ORDER BY
> > > >> clause, or if a table referenced in the FROM clause is a stream) and
> > > >> physical RelNodes should also have a sort order.
> > > >>
> > > >> And Milinda’s logical plans need a concept similar to sorting. Maybe a
> > > >> piece of metadata that this RelNode *could be sorted by X, Y if
> > > >> desired*. Any table can, of course, be re-sorted into any order you
> > > >> like, but a stream, which is infinite, can only be re-sorted to an
> > > >> order that does not conflict with the order of the incoming data.
> > > >>
> > > >> I still need to roll up my sleeves and help these patient developers
> > > >> (especially Milinda) get something working, but I hope it helps to
> > > >> have a general direction.
> > > >>
> > > >> Julian
> > > >>
> > > >> * https://issues.apache.org/jira/browse/CALCITE-783 Infer collation
> > > >>   of Project using monotonicity
> > > >> * https://issues.apache.org/jira/browse/CALCITE-784
> > > >>   LogicalAggregate's create method discards any collation traits from
> > > >>   input
> > > >> * https://issues.apache.org/jira/browse/CALCITE-793 The compiler asks
> > > >>   for unnecessary collation trait on plan with materialized view
> > > >> * https://issues.apache.org/jira/browse/CALCITE-825 Allow user to
> > > >>   specify sort order of an ArrayTable
> > >
> > > --
> > > Milinda Pathirage
> > >
> > > PhD Student | Research Assistant
> > > School of Informatics and Computing | Data to Insight Center
> > > Indiana University
> > >
> > > twitter: milindalakmal
> > > skype: milinda.pathirage
> > > blog: http://milinda.pathirage.org
