[ 
https://issues.apache.org/jira/browse/IMPALA-7844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16725267#comment-16725267
 ] 

ASF subversion and git services commented on IMPALA-7844:
---------------------------------------------------------

Commit d51aa522dd0e636d0de5a8d423cd3b689bdd39cb in impala's branch 
refs/heads/master from Paul Rogers
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=d51aa52 ]

IMPALA-7844: HAVING clause cannot support ordinals

The SELECT statement has two clauses that take lists of columns and/or
aliases: ORDER BY and GROUP BY. Each element is a alias, a table.column
reference or a number, which represents the ordinal number of a column.
Thus, "GROUP BY a, 2, c" is unambiguous.

SELECT also has a number of predicate clauses: WHERE and HAVING. In
these, aliases are possible (though seldom suppored), but ordinals are
ambiguous: is "WHERE 1 = 2" a reference to two constants, two columns by
ordinal, or a combination? No SQL dialect supports ordinals in WHERE or
HAVING for this reason.

Impala seems to have adopted a rather odd convention: if the HAVING
predicate has only one element (no operators), then that one element can
be an ordinal or alias. Thus "HAVING a" and "HAVING 1" are valid, only
if alias a or the column at ordinal 1 are BOOLEAN. However,
"HAVING a = 1" and "HAVING 1 = 2" are not valid.  This is unusual
because HAVING is normally a predicate: "HAVING a = 10".

This fix prepares to remove the attempt to support ordinals in the
HAVING clause, after which Impala will treat HAVING like WHERE, rather
than trying to treat it like ORDER BY or GROUP BY.

Review comments suggest that such a breaking change (even for such a
non-standard, undocumented, odd feature) should occur only in a major
version. So, for now, the no-ordinal support can be enabled via a
constant, but is turned off by default. (Perhaps a later patch can add
a session or runtime option instead of the constant.)

The fix retains the limited form of alias support.

Refactored the "ordinal or alias" code to make alias resolution
optional.

Reworded a few error messages for greater clarity.

Testing:

* Refactored AnalyzeStmtsTest to split apart the alias and ordinals
  cases for easier debugging.
* Tests can validate either the current HAVING ordinal behavior or
  the proposed new behavior. Test path is controlled by the constant
  described above.

Change-Id: Ic2b9f9e8c60fe2b25e20c57c2ffc31d8e59d5861
Reviewed-on: http://gerrit.cloudera.org:8080/11955
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Analysis code incorrectly attempts to support ordinals in HAVING clause
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-7844
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7844
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> SQL defines the idea of ordinals, which is, apparently, an old way to specify 
> columns in the {{ORDER BY}} and {{GROUP BY}} clauses:
> {code:sql}
> SELECT int_col, id
> FROM functional.alltypestiny
> GROUP BY 1
> ORDER BY 2
> {code}
> The use of an ordinal is semi-ambiguous (is it an ordinal or a literal), but 
> DBs (including Impala) usually interpret a single integer as an ordinal, but 
> interpret any expression as a constant. (For example, {{1}} is an ordinal, 
> but {{2 - 1}} is the constant value 1.)
> The use of ordinals works because {{ORDER BY}} and {{GROUP BY}} are lists: it 
> is clear when an integer stands alone as an ordinal.
> The {{HAVING}} (and {{WHERE}}) clauses are expressions. For this reason, DB's 
> do not support ordinals in these clauses. For example, what is the meaning 
> below:
> {code:sql}
> SELECT int_col, id
> FROM functional.alltypestiny
> WHERE 1 = 2
> {code}
> Does this mean that the first column equals 2? That the second column equals 
> 1? The first and second columns are equal? That the constant 1 equals the 
> constant 2?
> To avoid such ambiguity, neither the SQL standard nor any implementations 
> support ordinals in the {{HAVING}} (or {{WHERE}}) clauses.
> Yet, [Impala attempts to do 
> so|https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java#L549]:
> {code:java}
>         havingPred_ = substituteOrdinalOrAlias(havingClause_, "HAVING", 
> analyzer_);
> {code}
> This ticket proposes to remove this code to make it clear that the {{HAVING}} 
> clause cannot contain an ordinal.
> References:
> * [Redshift HAVING 
> clause|https://docs.aws.amazon.com/redshift/latest/dg/r_HAVING_clause.html]
> * [Impala 
> docs|https://impala.apache.org/docs/build3x/html/topics/impala_having.html] 
> are silent on this question.
> * [SQL Standard BNF for 
> HAVING|https://jakewheat.github.io/sql-overview/sql-2011-foundation-grammar.html#_7_10_having_clause]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to