[ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227074#comment-16227074
 ] 

Paul Rogers commented on DRILL-5822:
------------------------------------

The general rule for the SQL project clause is the following:

* If the list is explicit, such as `SELECT b, c, a`, then columns are returned in that 
order, even if the table defines them in the order (a, b, c).
* If the list is implicit, using a wildcard as in `SELECT *`, then the column order is 
that defined by the table. In our example above, the order would be `a, b, c`.
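
As a rough illustration of the two cases, here is a small Python sketch. It is 
not Drill code: the `project` helper and the dict-based rows are hypothetical 
and only model the ordering rule.

```python
def project(table_columns, rows, select_list=None):
    """Return (column_order, projected_rows).

    select_list=None models `SELECT *`: the table's own column order is used.
    Otherwise the explicit select list dictates the output order.
    """
    order = list(table_columns) if select_list is None else list(select_list)
    return order, [tuple(row[name] for name in order) for row in rows]


# Hypothetical table that defines its columns in the order (a, b, c).
table_columns = ["a", "b", "c"]
rows = [{"a": 10, "b": 20, "c": 30}]

explicit = project(table_columns, rows, ["b", "c", "a"])  # SELECT b, c, a
wildcard = project(table_columns, rows)                   # SELECT *
```

Here `explicit` follows the select list, while `wildcard` follows the table 
definition.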

Since Drill is distributed and schema-on-read, we run into the issue that two 
tables might have the same columns, but defined in different orders. For 
example, `{"a": 10, "b": 20, "c": 30}` and `{"c": 40, "b": 50, "a": 60}`. In 
this case, there is no "correct" order. Instead, Drill must:

1. Recognize that the above scenario can occur.
2. Define each merging operator to follow some reconciliation rule.

Here a "merging" operator is anything that can see batches from two distinct 
scans. That covers almost all operators, and at minimum the receivers.

A good reconciliation rule is that the first schema wins, and all other batches 
are projected into that first schema. In our example, `a, b, c` and `c, b, a` 
are both projected into `a, b, c`.
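
The "first schema wins" rule can be sketched as follows. This is a minimal 
Python model, not Drill's implementation: the `FirstSchemaWins` class and the 
representation of a batch as a column list plus row tuples are assumptions 
made for illustration. It handles only column reordering and rejects batches 
whose set of columns differs from the first one.

```python
class FirstSchemaWins:
    """Hypothetical merging operator that adopts the first batch's column order."""

    def __init__(self):
        self.schema = None  # column order fixed by the first batch seen

    def accept(self, columns, rows):
        """Project an incoming batch into the first schema and return it."""
        if self.schema is None:
            self.schema = list(columns)
        if set(columns) != set(self.schema):
            # A real schema change is out of scope for this sketch.
            raise ValueError("batch has a different set of columns")
        # For each output position, find that column's position in the input.
        index = [columns.index(name) for name in self.schema]
        return self.schema, [tuple(row[i] for i in index) for row in rows]


merger = FirstSchemaWins()
first = merger.accept(["a", "b", "c"], [(10, 20, 30)])
second = merger.accept(["c", "b", "a"], [(40, 50, 60)])
```

Both `first` and `second` come back in the order `a, b, c`; the values of the 
second batch are rearranged to match, not relabeled.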

The PMC has asked that we not discuss design issues in PR reviews. So, could you 
please explain here the approach that this PR takes to solve the 
problem? Do we agree on the description above? Or did this PR take a different 
approach?

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 
> doesn't preserve column order
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-5822
>                 URL: https://issues.apache.org/jira/browse/DRILL-5822
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.11.0
>            Reporter: Prasad Nagaraj Subramanya
>            Assignee: Vitalii Diravka
>             Fix For: 1.12.0
>
>
> Column order is not preserved for a star query with sorting when the query 
> is planned into multiple fragments.
> Repro steps:
> 1) {code}alter session set `planner.slice_target`=1;{code}
> 2) ORDER BY clause in the query.
> Scenarios:
> {code}
> 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.082 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +--------------+----------+--------------+------------------------------------------------------+
> | n_nationkey  |  n_name  | n_regionkey  |                      n_comment                       |
> +--------------+----------+--------------+------------------------------------------------------+
> | 0            | ALGERIA  | 0            |  haggle. carefully final deposits detect slyly agai  |
> +--------------+----------+--------------+------------------------------------------------------+
> 1 row selected (0.141 seconds)
> 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
> +-------+--------------------------------+
> |  ok   |            summary             |
> +-------+--------------------------------+
> | true  | planner.slice_target updated.  |
> +-------+--------------------------------+
> 1 row selected (0.091 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name limit 1;
> +------------------------------------------------------+----------+--------------+--------------+
> |                      n_comment                       |  n_name  | n_nationkey  | n_regionkey  |
> +------------------------------------------------------+----------+--------------+--------------+
> |  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0            | 0            |
> +------------------------------------------------------+----------+--------------+--------------+
> 1 row selected (0.201 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
