I believe keeping ordering and partitioning as separate traits gives more
flexibility. Combining them might preclude certain types of plans. For
instance, many systems assume that any type of distribution destroys the
sortedness of the data, so a re-sort is needed after distribution (i.e.,
just doing a merge is not enough). Drill does actually preserve sortedness,
so it only needs a merge. Without knowing what the combined trait would
look like, I have a feeling that it would be constraining for certain
plans.
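
To make the point concrete, here is a minimal Python sketch. It is not Drill
or Calcite code: `Traits`, `exchange`, and the `merging` flag are hypothetical
names invented for illustration. It assumes distribution and collation are two
independent fields, so an exchange can change one and either keep or drop the
other.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Traits:
    """Hypothetical physical properties of a stream (not a real Drill/Calcite class)."""
    distribution: str                     # e.g. "hash(x)", "roundrobin", "singleton"
    collation: Optional[Tuple[str, ...]]  # sort keys, or None if unsorted


def exchange(inp: Traits, new_distribution: str, merging: bool) -> Traits:
    """Redistribute a stream and report the traits of the result.

    merging=True models a merging receiver that keeps the input sort order.
    merging=False models an exchange that interleaves rows arbitrarily,
    so the sort order is lost and a re-sort would be needed.
    """
    collation = inp.collation if merging else None
    return replace(inp, distribution=new_distribution, collation=collation)


# Each hash partition is sorted by y.
sorted_by_y = Traits(distribution="hash(x)", collation=("y",))

merged = exchange(sorted_by_y, "singleton", merging=True)   # collation kept
plain = exchange(sorted_by_y, "singleton", merging=False)   # collation dropped
```

With separate fields, both outcomes are expressible without defining how a
single combined trait should behave under redistribution.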

Separately, I think the optimizer should allow for adding new traits, for
instance compression. Input streams may be hash/round-robin distributed
and/or ordered and/or compressed.
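
A rough sketch of what an open-ended trait set could look like, again in
plain Python rather than the actual planner API. The `TraitSet` alias, the
`satisfies` helper, and the trait names and values (including "snappy") are
all assumptions made up for this example. The idea is that each trait is
checked independently, so a new one such as compression can be added without
touching the others.

```python
from typing import Dict, Optional

# Hypothetical representation: an open mapping from trait name to value.
# New traits are just new keys; nothing here is a real Drill/Calcite type.
TraitSet = Dict[str, Optional[str]]


def satisfies(actual: TraitSet, required: TraitSet) -> bool:
    """Return True if every required trait is present with the required value.

    Traits that do not appear in `required` are left unconstrained.
    """
    return all(actual.get(name) == value for name, value in required.items())


# A stream that is hash distributed, ordered, and compressed.
stream: TraitSet = {
    "distribution": "hash(x)",
    "collation": "y",
    "compression": "snappy",  # placeholder value for illustration
}

needs_order_only = satisfies(stream, {"collation": "y"})
needs_uncompressed = satisfies(stream, {"compression": "none"})
```

Here `needs_order_only` is true because the request ignores distribution and
compression, while `needs_uncompressed` is false because only the compression
trait fails to match.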

Aman

On Mon, Jan 19, 2015 at 12:55 PM, Julian Hyde <[email protected]> wrote:

> We have discussed before whether ordering and partitioning should be
> distinct traits or the same trait. I was (still am) ambivalent about it.
> I’ve been having some discussions with the Hive team, and it looks as if
> they will make ordering & partitioning the same trait.
>
> Julian
>
> On Jan 19, 2015, at 9:51 AM, Jinfeng Ni <[email protected]> wrote:
>
> > For the case of "partition by x sort by y", I think the planner currently
> > keeps the partition and sort in separate traits: "partition by x" as a
> > distribution trait, "sort by y" as a collation. The distribution trait has
> > higher priority than the sort collation. Drill's physical operators will
> > have both of those traits when doing planning work.
> >
> >
> > On Sun, Jan 18, 2015 at 9:33 PM, Jacques Nadeau <[email protected]> wrote:
> >
> >> In planning we currently state collation as total ordering. In some cases
> >> it would be useful to create a concept of local ordering. For example,
> >> partition by x then sort by y. Does anyone have any thoughts on how we
> >> should define this in terms of traits/physical properties? The syntax
> >> would realistically only apply to CTAS or as a description of existing
> >> files, so I think we shouldn't need to enhance the language beyond those
> >> locations.
> >>
> >> J
> >>
>
>
