from:"Julian Hyde \(JIRA\)"

[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm

2021-06-20 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366262#comment-17366262
 ] 

Julian Hyde commented on HIVE-25173:


Cc [~cwensel].

> Fix build failure of hive-pre-upgrade due to missing dependency on 
> pentaho-aggdesigner-algorithm
> 
>
> Key: HIVE-25173
> URL: https://issues.apache.org/jira/browse/HIVE-25173
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.2
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {noformat}
> [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve 
> dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: 
> Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm

2021-06-20 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17366261#comment-17366261
 ] 

Julian Hyde commented on HIVE-25173:


It may be possible to bring conjars back online for a few days, but keeping 
it permanently online is not an option. It costs tens of dollars per month to 
host.

If you want to build old versions of Calcite, installing the artifacts in a 
local Maven repo is an option.





[jira] [Commented] (HIVE-25173) Fix build failure of hive-pre-upgrade due to missing dependency on pentaho-aggdesigner-algorithm

2021-06-15 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-25173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17363800#comment-17363800
 ] 

Julian Hyde commented on HIVE-25173:


I made a release of this library under my groupId on Maven Central. I don’t 
recall the coordinates but you can find them in Calcite (Calcite depends on the 
new version). 

If conjars.org is at the root of this problem, let me know. I know the owner of 
that repo. He took it offline to find out who, if anyone, was using it. 

> Fix build failure of hive-pre-upgrade due to missing dependency on 
> pentaho-aggdesigner-algorithm
> 
>
> Key: HIVE-25173
> URL: https://issues.apache.org/jira/browse/HIVE-25173
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.2
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {noformat}
> [ERROR] Failed to execute goal on project hive-pre-upgrade: Could not resolve 
> dependencies for project org.apache.hive:hive-pre-upgrade:jar:4.0.0-SNAPSHOT: 
> Failure to find org.pentaho:pentaho-aggdesigner-algorithm:jar:5.1.5-jhyde in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> {noformat}




[jira] [Commented] (HIVE-24902) Incorrect result due to ReduceExpressionsRule

2021-03-18 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-24902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17304552#comment-17304552
 ] 

Julian Hyde commented on HIVE-24902:


Sorry, no.

> Incorrect result due to ReduceExpressionsRule
> -
>
> Key: HIVE-24902
> URL: https://issues.apache.org/jira/browse/HIVE-24902
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nemon Lou
>Priority: Major
>
> The following sql returns only one record (20210308) but we expect two 
> (20210308, 20210309).
> {code:sql}
> select * from (
> select 
>   case when b.a=1
>  then  
>   cast 
> (from_unixtime(unix_timestamp(cast(20210309 as string),'MMdd') - 
> 86400,'MMdd') as bigint)
> else 
> 20210309 
>  end 
> as col
> from 
> (select stack(2,1,2) as (a))
>  as b
> ) t 
> where t.col is not null;
> {code}
> After debugging, I found that ReduceExpressionsRule changes the expression 
> incorrectly.
> Original expression:
> {code:sql}
> IS NOT NULL(CASE(=($0, 1), 
> CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647)
>  CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", 
> _UTF-16LE'MMdd'), CAST(86400):BIGINT), _UTF-16LE'MMdd')):BIGINT, 
> 20210309))
> {code}
> After reducing expressions:
> {code:sql}
> CASE(=($0, 1), IS NOT 
> NULL(CAST(FROM_UNIXTIME(-(UNIX_TIMESTAMP(CAST(_UTF-16LE'20210309'):VARCHAR(2147483647)
>  CHARACTER SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary", 
> _UTF-16LE'MMdd'), CAST(86400):BIGINT), _UTF-16LE'MMdd')):BIGINT), 
> true)
> {code}
> The query plan in main branch:
> {code:sql}
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
> Fetch Operator
>   limit: -1
>   Processor Tree:
> TableScan
>   alias: _dummy_table
>   Row Limit Per Split: 1
>   Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column 
> stats: COMPLETE
>   Select Operator
> expressions: 2 (type: int), 1 (type: int), 2 (type: int)
> outputColumnNames: _col0, _col1, _col2
> Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE 
> Column stats: COMPLETE
> UDTF Operator
>   Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE 
> Column stats: COMPLETE
>   function name: stack
>   Filter Operator
> predicate: COALESCE((col0 = 1),false) (type: boolean)
> Statistics: Num rows: 1 Data size: 12 Basic stats: COMPLETE 
> Column stats: COMPLETE
> Select Operator
>   expressions: CASE WHEN ((col0 = 1)) THEN (20210308L) ELSE 
> (20210309L) END (type: bigint)
>   outputColumnNames: _col0
>   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE 
> Column stats: COMPLETE
>   ListSink
> Time taken: 0.155 seconds, Fetched: 28 row(s)
> {code}
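
To make the discrepancy concrete, here is a minimal Python sketch (not Hive or Calcite code) that evaluates both filters over the two rows produced by stack(2,1,2). It assumes the constants shown in the plan above (20210308 and 20210309); the helper names `col` and `coalesce` are made up for illustration.

```python
# Rows produced by `select stack(2,1,2) as (a)`.
rows = [1, 2]


def col(a):
    """The CASE expression after constant folding, as shown in the plan."""
    return 20210308 if a == 1 else 20210309


def coalesce(*values):
    """SQL-style COALESCE: first non-NULL (non-None) argument."""
    for value in values:
        if value is not None:
            return value
    return None


# Filter as written in the query: `where t.col is not null`.
expected = [col(a) for a in rows if col(a) is not None]

# Filter as it appears in the plan: `COALESCE((col0 = 1), false)`.
actual = [col(a) for a in rows if coalesce(a == 1, False)]

print(expected)
print(actual)
```

The plan's predicate keeps only the row where a = 1, which is consistent with the single record reported above.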




[jira] [Commented] (HIVE-22455) Union branch removal rule does not kick in.

2019-11-04 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/HIVE-22455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967155#comment-16967155
 ] 

Julian Hyde commented on HIVE-22455:


There are a bunch of rules in Calcite's class PruneEmptyRules that recognize 
empty relational expressions and simplify accordingly.
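
As a rough illustration of the idea only (a toy model in Python, not Calcite's API or its actual rule implementations), pruning can be sketched as a bottom-up rewrite: a limit of 0 becomes an empty relation, and a UNION ALL drops its empty inputs. The class names `Scan`, `Limit`, `UnionAll`, and `Empty` are invented for this sketch.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Empty:
    """A relational expression known to produce no rows."""


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Limit:
    input: object
    fetch: int


@dataclass(frozen=True)
class UnionAll:
    inputs: Tuple[object, ...]


def prune(rel):
    """Rewrite `rel` bottom-up, replacing provably empty subtrees with Empty."""
    if isinstance(rel, Limit):
        child = prune(rel.input)
        if rel.fetch == 0 or isinstance(child, Empty):
            return Empty()
        return Limit(child, rel.fetch)
    if isinstance(rel, UnionAll):
        kept = tuple(c for c in map(prune, rel.inputs) if not isinstance(c, Empty))
        if not kept:
            return Empty()
        # A UNION ALL with a single remaining input is just that input.
        return kept[0] if len(kept) == 1 else UnionAll(kept)
    return rel


plan = UnionAll((Limit(Scan("a"), 0), Scan("b"), Limit(Scan("c"), 0)))
print(prune(plan))
```

In this sketch the two `limit 0` branches disappear and only the scan of `b` remains.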

> Union branch removal rule does not kick in.
> ---
>
> Key: HIVE-22455
> URL: https://issues.apache.org/jira/browse/HIVE-22455
> Project: Hive
>  Issue Type: Improvement
>Reporter: Steve Carlin
>Priority: Major
>
> After the Calcite upgrade to 1.21, there is a rule where 2 branches of a 
> union have limit 0. This can be simplified.
> This can be found in: union_assertion_type.q.out




[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("

2019-04-01 Thread Julian Hyde (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807152#comment-16807152
 ] 

Julian Hyde commented on HIVE-17395:


HIVE-17395 seems to be a duplicate of HIVE-18624. Do others agree?

> HiveServer2 parsing a command with a lot of "("
> ---
>
> Key: HIVE-17395
> URL: https://issues.apache.org/jira/browse/HIVE-17395
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, HiveServer2
>Affects Versions: 2.3.0
>Reporter: dan young
>Priority: Major
>
> Hello,
> We're seeing what appears to be the same issue that was outlined in 
> HIVE-15388 where the query parser spends a lot of time (never returns and I 
> need to kill the beeline process) parsing a command with a lot of "(" .   I 
> tried this in both 2.2 and now 2.3.
> Here's an example query (this is auto generated SQL BTW) in beeline that 
> never completes/parses, I end up just killing the beeline process.
> It looks like something similar was addressed as part of HIVE-15388.   Any 
> ideas on how to address this?  write better SQL? patch?
> Regards,
> Dano
> {noformat}
> Connected to: Apache Hive (version 2.3.0)
> Driver: Hive JDBC (version 2.3.0)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> Beeline version 2.3.0 by Apache Hive
> 0: jdbc:hive2://localhost:1/test_db> SELECT 
> ((UNIX_TIMESTAMP(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 AS STRING), '-', 
> LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), 
> 2, '0'), '-01 00:00:00') AS TIMESTAMP), 
> 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 AS STRING), '-', 
> LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), 
> 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))), 
> -3),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP(CONCAT(ADD_MONTHS(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 AS STRING), '-', 
> LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), 
> 2, '0'), '-01 00:00:00') AS TIMESTAMP), 
> 1),SUBSTRING(CAST(CONCAT(CAST(YEAR(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 AS STRING), '-', 
> LPAD(CAST(((CAST(CEIL(MONTH(TIMESTAMP(CONCAT(ADD_MONTHS(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20
>  00:00:00.0'), 'MM'))), 
> -1),SUBSTRING(TIMESTAMP(DATE(TRUNC(TIMESTAMP('2012-04-20 
> 00:00:00.0'), 'MM'))),11 / 3) AS INT) - 1) * 3) + 1 AS STRING), 
> 2, '0'), '-01 00:00:00') AS TIMESTAMP),11))), 'MM'))),11));
> When I did a jstack on the HiveServer2, it appears to be stuck/running in 
> the HiveParser/antlr.
> "e62658bd-5ea9-43c4-898f-3048d913f192 HiveServer2-Handler-Pool: Thread-96" 
> #96 prio=5 os_prio=0 tid=0x7fb78c366000 nid=0x4476 runnable 
> [0x7fb77d7bb000]
>java.lang.Thread.State: RUNNABLE
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser$DFA36.specialStateTransition(HiveParser_IdentifiersParser.java:31502)
>   at org.antlr.runtime.DFA.predict(DFA.java:80)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.atomExpression(HiveParser_IdentifiersParser.java:6746)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceFieldExpression(HiveParser_IdentifiersParser.java:6988)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceUnaryPrefixExpression(HiveParser_IdentifiersParser.java:7324)
>   at 
>

[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("

2019-03-26 Thread Julian Hyde (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802180#comment-16802180
 ] 

Julian Hyde commented on HIVE-17395:


[~kgyrtkirk] Thanks. I was aware of HIVE-15388 but it does look as if 
HIVE-18624 is a better match. This might be a duplicate. It does match the 
timescale when this issue appeared.


[jira] [Commented] (HIVE-17395) HiveServer2 parsing a command with a lot of "("

2019-03-25 Thread Julian Hyde (JIRA)



[ 
https://issues.apache.org/jira/browse/HIVE-17395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800973#comment-16800973
 ] 

Julian Hyde commented on HIVE-17395:


I'm running into this issue also, in my testing of Looker against Hive. It 
seems to be a regression from earlier versions of Hive. Looker generates deeply 
nested expressions, and so hits this problem hard; we are recommending that our 
customers do not upgrade to Hive 2.2, 2.3 or 3 because of this issue.

I am more of an expert on JavaCC than Antlr, but I agree with [~kgyrtkirk] that 
the problem seems to be lookaheads. The calls to 
{{org.antlr.runtime.DFA.predict}} on the stack are evidence of that. Each call 
to predict will be followed by a call to actually parse, so each call to 
predict doubles the running time. There are 12 calls, which would suggest a 
4096x slowdown.
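
The arithmetic can be sketched with a toy model in Python (this is not ANTLR's algorithm, just the doubling argument; the function name `innermost_visits` is made up). It counts how often the innermost expression is visited if every nesting level first scans its contents to predict and then scans them again to parse.

```python
def innermost_visits(depth: int, predict_then_parse: bool) -> int:
    """Count visits to the innermost expression under `depth` nested levels."""
    if depth == 0:
        return 1
    inner = innermost_visits(depth - 1, predict_then_parse)
    # With a lookahead predicate, the nested text is walked twice at this
    # level: once speculatively, once for real. Without it, only once.
    return 2 * inner if predict_then_parse else inner


print(innermost_visits(12, predict_then_parse=True))   # 2 ** 12
print(innermost_visits(12, predict_then_parse=False))  # 1 ** 12
```

Under this model, 12 nested predictions give 4096 visits, while a parser that checks lookahead as it parses visits the innermost expression once.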

I don't know whether an upgrade to antlr v4 is possible or planned. [A post on 
stackoverflow|https://stackoverflow.com/questions/17054285/is-it-possible-to-lookahead-in-antlr4-without-actually-matching-a-token]
 suggests that "=>" (the lookahead operator) is no longer necessary on antlr 
v4; antlr verifies lookahead as it parses. If true, that 2 ^ 12 number above 
would become 1 ^ 12, a much nicer number!


[jira] [Commented] (HIVE-16924) Support distinct in presence Gby

2018-02-27 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379190#comment-16379190
 ] 

Julian Hyde commented on HIVE-16924:


I added a new patch, HIVE-16924.06.patch, based upon 
https://github.com/julianhyde/hive/tree/16924-distinct-group-by-squashed. It 
has the same content as HIVE-16924.05.patch, except that I have rebased to 
latest master, squashed, and cleaned up (e.g. removed commented out code). Most 
tests are succeeding, but the 7 failures listed above probably still remain. 

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch, 
> HIVE-16924.06.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'
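
For reference, the result these queries would be expected to produce can be sketched in Python. This assumes the standard SQL evaluation order (GROUP BY and aggregates first, then DISTINCT over the resulting rows) and says nothing about how the patch implements it; the helper names `group_by_c1` and `distinct` are invented for the sketch.

```python
from collections import defaultdict

# (c1, c2) rows, as inserted into e011_01 above.
rows = [(1, 1), (2, 2)]


def group_by_c1(rows, agg):
    """Group on c1 and apply `agg` to the list of c2 values in each group."""
    groups = defaultdict(list)
    for c1, c2 in rows:
        groups[c1].append(c2)
    return [(c1, agg(values)) for c1, values in sorted(groups.items())]


def distinct(result):
    """Remove duplicate output rows, keeping first occurrences in order."""
    seen = []
    for row in result:
        if row not in seen:
            seen.append(row)
    return seen


# select distinct c1, count(*) from e011_01 group by c1;
count_result = distinct(group_by_c1(rows, len))

# select distinct c1, avg(c2) from e011_01 group by c1;
avg_result = distinct(group_by_c1(rows, lambda v: sum(v) / len(v)))

print(count_result)
print(avg_result)
```

Because GROUP BY already yields one row per c1, DISTINCT is a no-op for these two queries, which is why they should be accepted rather than rejected.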



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-27 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Attachment: HIVE-16924.06.patch





[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-27 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Open  (was: Patch Available)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch, HIVE-16924.05.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Commented] (HIVE-16924) Support distinct in presence Gby

2018-02-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366125#comment-16366125
 ] 

Julian Hyde commented on HIVE-16924:


As of the latest patch, many tests on [~ashutoshc]'s list have been fixed; the 
following remain:

{noformat}
TestMiniLlapLocalCliDriver.cbo_rp_unionDistinct_2.q
TestMiniLlapLocalCliDriver.cross_prod_1.q
TestMiniLlapLocalCliDriver.cross_prod_3.q
TestMiniLlapLocalCliDriver.cross_prod_4.q
TestMiniLlapLocalCliDriver.selectDistinctStar.q
TestNegativeCliDriver.selectDistinctStarNeg_2.q
TestNegativeCliDriver.udaf_invalid_place.q
{noformat}





[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-14 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Patch Available  (was: Open)





[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-14 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Attachment: HIVE-16924.05.patch





[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-14 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Open  (was: Patch Available)





[jira] [Commented] (HIVE-16924) Support distinct in presence Gby

2018-02-12 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361219#comment-16361219
 ] 

Julian Hyde commented on HIVE-16924:


[~ashutoshc], I have fixed various test failures, as of patch 4. There are 
still test failures/errors but I'm not sure any of them are due to the DISTINCT 
change. Can you please review the latest test output?

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Patch Available  (was: Open)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Attachment: HIVE-16924.04.patch

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch, HIVE-16924.04.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Open  (was: Patch Available)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Attachment: HIVE-16924.03.patch

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Patch Available  (was: Open)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch, 
> HIVE-16924.03.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-11 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Open  (was: Patch Available)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Commented] (HIVE-16924) Support distinct in presence Gby

2018-02-10 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359316#comment-16359316
 ] 

Julian Hyde commented on HIVE-16924:


Still running tests on my local machine. The only substantive test failure I've 
found so far is {{having2.q}} - an invalid query that Hive was correctly 
labeling invalid. The others seem to be stats changes.

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Commented] (HIVE-16924) Support distinct in presence Gby

2018-02-05 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353242#comment-16353242
 ] 

Julian Hyde commented on HIVE-16924:


Test failures so far: wrong_distinct1.q, global_limit.q (statistics), 
llap_smb.q (statistics), mm_all.q, mm_cttas.q, unionDistinct_1.q.

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-05 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Status: Patch Available  (was: Open)

Patch 2, based upon 
[https://github.com/julianhyde/hive/tree/16924-distinct-group-by], fixes 
bit-rot in [~rusanu]'s patch 1. Let's see whether there are any test failures.

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Updated] (HIVE-16924) Support distinct in presence Gby

2018-02-05 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-16924:
---
Attachment: HIVE-16924.02.patch

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
>Priority: Major
> Attachments: HIVE-16924.01.patch, HIVE-16924.02.patch
>
>
> {code:sql}
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> {code}
> These queries should work:
> {code:sql}
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> {code}
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Commented] (HIVE-18423) Hive should support usage of external tables using jdbc

2018-01-18 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-18423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331159#comment-16331159
 ] 

Julian Hyde commented on HIVE-18423:


I'd like to comment on an issue raised in the review comments by [~msydoron] and 
[~jcamachorodriguez] about release numbers.
* The PR into Calcite was based on branch-1.15, but we do not generally make 
branch releases in Calcite, so the PR will end up in master branch, whose 
version will be 1.16.0-SNAPSHOT and then 1.16.0 upon release.
* Hive's master branch never depends upon snapshot versions of other 
components. Thus your PR into Hive will only be committed to master branch 
after calcite-1.16.0 has been released.
* In git we usually work in feature branches, often named after a particular 
jira case. In such branches you are of course at liberty to refer to snapshot 
versions of other components. Those components may or may not be in the Apache 
snapshot repository (I think Calcite snapshots are published on successful 
builds of the master branch, but I'm not sure) and if they're not, you can 
easily publish to your local repository (in ~/.m2) using 'mvn install'.

> Hive should support usage of external tables using jdbc
> ---
>
> Key: HIVE-18423
> URL: https://issues.apache.org/jira/browse/HIVE-18423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Jonathan Doron
>Assignee: Jonathan Doron
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
>
> Hive should support the usage of external JDBC tables (and not only external 
> tables that hold queries), so that a Hive user would be able to use the 
> external table as a Hive internal table.




[jira] [Commented] (HIVE-13094) CBO: Assertion error in Case expression

2017-11-01 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234326#comment-16234326
 ] 

Julian Hyde commented on HIVE-13094:


[~jcamachorodriguez], Did this ever get ported to Calcite? Looks as if 
CALCITE-1502 is similar.

> CBO: Assertion error  in Case expression
> 
>
> Key: HIVE-13094
> URL: https://issues.apache.org/jira/browse/HIVE-13094
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: HIVE-13094.01.patch, HIVE-13094.patch
>
>
> Triggered by a trap case in the case evaluation
> {code}
> CASE WHEN (-2) >= 0  THEN SUBSTRING(str0, 1,CAST((-2) AS INT)) ELSE NULL
> {code}
> {code}
> Exception in thread "b367ad08-d900-4672-8e75-a4e90a52141b 
> b367ad08-d900-4672-8e75-a4e90a52141b main" java.lang.AssertionError: Internal 
> error: Cannot add expression of different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "ISO-8859-1" COLLATE 
> "ISO-8859-1$en_US$primary" $f0, VARCHAR(2147483647) CHARACTER SET 
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" $f1, VARCL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "ISO-8859-1" 
> COLLATE "ISO-8859-1$en_US$primary" $f0, VARCHAR(2147483647) CHARACTER SET 
> "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary" $fL
> set is 
> rel#12408:HiveProject.HIVE.[](input=HepRelVertex#12407,$f0=$0,$f1=$6,$f2=CASE(>=(-(2),
>  0), substring($6, 1, -(2)), null))
> expression is HiveProject#12414
> at org.apache.calcite.util.Util.newInternal(Util.java:774)
> at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:317)
> at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
> at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:224)
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(HiveReduceExpressionsRule.java:208)
> at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:318)
> at 
> org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:514)
> at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:392)
> at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:285)
> at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:72)
> at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:207)
> at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:194)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:1265)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:1125)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:938)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:878)
> at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
> at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:969)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Assigned] (HIVE-16924) Support distinct in presence Gby

2017-08-29 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-16924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde reassigned HIVE-16924:
--

Assignee: Julian Hyde  (was: Remus Rusanu)

> Support distinct in presence Gby 
> -
>
> Key: HIVE-16924
> URL: https://issues.apache.org/jira/browse/HIVE-16924
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Planning
>Reporter: Carter Shanklin
>Assignee: Julian Hyde
> Attachments: HIVE-16924.01.patch
>
>
> create table e011_01 (c1 int, c2 smallint);
> insert into e011_01 values (1, 1), (2, 2);
> These queries should work:
> select distinct c1, count(*) from e011_01 group by c1;
> select distinct c1, avg(c2) from e011_01 group by c1;
> Currently, you get : 
> FAILED: SemanticException 1:52 SELECT DISTINCT and GROUP BY can not be in the 
> same query. Error encountered near token 'c1'




[jira] [Commented] (HIVE-12923) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure

2017-06-08 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16043122#comment-16043122
 ] 

Julian Hyde commented on HIVE-12923:


I have created a pull request for CALCITE-1069. The idea is that if you want 
GROUPING or GROUPING__ID, you add it to Aggregate as an aggregate function. If 
you don't, you omit it. Either way, you get the same columns regardless of 
whether it is a simple GROUP BY, with one grouping set, or any other number of 
grouping sets. Please review.
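
As a rough SQL-level illustration of that idea (the pull request itself concerns Calcite's Aggregate operator rather than SQL syntax), the sketch below reuses table T1 and columns a and b from the issue description. The aliases c, ga and gb are invented for the example, and it assumes a single-argument GROUPING function is available:

{code:sql}
-- No indicator requested: the output columns are just a, b and the count,
-- whether the query has one grouping set or several.
select a, b, count(*) as c
from T1
group by a, b with cube;

-- Indicator requested explicitly, written like any other aggregate call.
select a, b, count(*) as c, grouping(a) as ga, grouping(b) as gb
from T1
group by a, b with cube;
{code}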

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_grouping_sets4.q failure
> 
>
> Key: HIVE-12923
> URL: https://issues.apache.org/jira/browse/HIVE-12923
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-06-01 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033545#comment-16033545
 ] 

Julian Hyde commented on HIVE-15996:


I was mistaken about {{GROUP_ID()}} being equivalent to {{GROUPING(x, y, z)}} 
(assuming the query has {{GROUP BY x, y, z}}).
In fact, {{GROUP_ID()}} should return zero unless you have a query with 
duplicate grouping sets (which is very unusual).
Furthermore, {{GROUP_ID()}} is Oracle-specific and non-standard.
See CALCITE-1824 for more details.
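
A small sketch of the difference, using Oracle-style syntax and a hypothetical table t with columns x and y (neither comes from this issue). The grouping set (x) is deliberately listed twice:

{code:sql}
select x, y,
       grouping(x, y) as g,   -- 2 * grouping(x) + grouping(y): which columns are rolled up
       group_id()     as gid  -- tells apart copies of a duplicated grouping set
from t
group by grouping sets ((x, y), (x), (x));
{code}

Under the standard's definition, g is 0 for rows from (x, y) and 1 for rows from either copy of (x). In Oracle, gid should be 0 everywhere except for rows from the second copy of (x), where it is 1.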

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Fix For: 3.0.0
>
> Attachments: HIVE-15996.01.patch, HIVE-15996.02.patch, 
> HIVE-15996.03.patch, HIVE-15996.04.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);




[jira] [Commented] (HIVE-12923) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure

2017-03-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15926833#comment-15926833
 ] 

Julian Hyde commented on HIVE-12923:


[~hsubramaniyan], [~jcamachorodriguez], Any thoughts on this?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_grouping_sets4.q failure
> 
>
> Key: HIVE-12923
> URL: https://issues.apache.org/jira/browse/HIVE-12923
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at 
> org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
> at 
> org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}




[jira] [Commented] (HIVE-12923) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure

2017-02-22 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879135#comment-15879135
 ] 

Julian Hyde commented on HIVE-12923:


I'm thinking of an alternative solution to CALCITE-1069. Currently, as you 
know, an Aggregate with more than one grouping set returns more columns than 
one with only one grouping set. We have been arguing about whether there should 
be 1 extra column (Hive's preference) or N extra columns (Calcite's preference).

My new proposal is that there should be no extra columns. We make GROUPING into 
an aggregate function, and if you want those extra columns you can add calls to 
GROUPING.

If the row type of Aggregate is the same regardless of the number of grouping sets, 
it will simplify a bunch of things. For example, it would be easier to write a 
rule that pushes down the Filter "group_id = 2", because we wouldn't have to 
worry about disappearing columns, and whether they are used.
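
To make the filter example concrete at the SQL level, here is a sketch that reuses T1, a and b from the issue description. It assumes a multi-argument GROUPING as defined in the SQL standard (the subject of HIVE-15996), reads "group_id = 2" as a filter on that value, and assumes a is an int; the aliases c, g and s are invented for illustration:

{code:sql}
-- Ask for the grouping indicator explicitly, then filter on it.
select a, b, c
from (
  select a, b, count(*) as c, grouping(a, b) as g
  from T1
  group by a, b with cube
) s
where g = 2;

-- g = 2 means grouping(a) = 1 and grouping(b) = 0, so only the grouping
-- set (b) survives. A planner could therefore answer with a plain GROUP BY:
select cast(null as int) as a, b, count(*) as c
from T1
group by b;
{code}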

[~hsubramaniyan], [~jcamachorodriguez], Would the new proposal be acceptable to 
Hive?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver 
> groupby_grouping_sets4.q failure
> 
>
> Key: HIVE-12923
> URL: https://issues.apache.org/jira/browse/HIVE-12923
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
> at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
> at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
> at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
> at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
> at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
> at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
> at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
> at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
> at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-02-22 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15879040#comment-15879040
 ] 

Julian Hyde commented on HIVE-15996:


[~jcamachorodriguez] [~cartershanklin], Yes, I meant to say {{GROUPING_ID}}. 
The authors of the SQL standard were smart to extend GROUPING rather than 
create another function name, as Oracle did. The namespace (GROUPING_ID, 
GROUPING, GROUP_ID not to mention GROUP BY and GROUPING SETS) is over-full and 
confusing.

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-15996.patch
>
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);
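
The recursive definition quoted above can be checked with a small sketch. This is plain Python, not Hive or Calcite code; the helper names `grouping` and `rollup_grouping_bits` are made up for illustration. It assumes the standard's convention that GROUPING(c) is 1 when column c is aggregated away in the current grouping set and 0 otherwise.

```python
def grouping(*bits):
    """Combine per-column GROUPING bits (each 0 or 1) using the recursive rule
    GROUPING(c1..cn) = 2 * GROUPING(c1..cn-1) + GROUPING(cn)."""
    if not bits:
        raise ValueError("GROUPING needs at least one argument")
    if len(bits) == 1:
        return bits[0]
    return 2 * grouping(*bits[:-1]) + bits[-1]


def rollup_grouping_bits(n):
    """Hypothetical helper: the bit vectors produced by rollup(c1..cn).
    In the k-th grouping set the last k columns are aggregated away (bit 1)."""
    return [tuple(0 if i < n - k else 1 for i in range(n)) for k in range(n + 1)]


# The multi-argument form agrees with the weighted sum from the description.
for g1, g2, g3 in rollup_grouping_bits(3):
    assert grouping(g1, g2, g3) == 4 * g1 + 2 * g2 + g3

ids = [grouping(*bits) for bits in rollup_grouping_bits(3)]
print(ids)
```

For a three-column rollup the four grouping sets map to distinct integers, which is what makes the multi-argument form usable as a grouping-set identifier.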




[jira] [Commented] (HIVE-15996) Implement multiargument GROUPING function

2017-02-21 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876628#comment-15876628
 ] 

Julian Hyde commented on HIVE-15996:


Is this behavior identical to the GROUP_ID function?

> Implement multiargument GROUPING function
> -
>
> Key: HIVE-15996
> URL: https://issues.apache.org/jira/browse/HIVE-15996
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.2.0
>Reporter: Carter Shanklin
>Assignee: Jesus Camacho Rodriguez
>
> Per the SQL standard section 6.9:
> GROUPING ( CR1, ..., CRN-1, CRN )
> is equivalent to:
> CAST ( ( 2 * GROUPING ( CR1, ..., CRN-1 ) + GROUPING ( CRN ) ) AS IDT )
> So for example:
> select c1, c2, c3, grouping(c1, c2, c3) from e011_02 group by rollup(c1, c2, 
> c3);
> Should be allowed and equivalent to:
> select c1, c2, c3, 4*grouping(c1) + 2*grouping(c2) + grouping(c3) from 
> e011_02 group by rollup(c1, c2, c3);




[jira] [Commented] (HIVE-5873) SubQuery: In subquery Count Bug

2017-02-13 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864552#comment-15864552
 ] 

Julian Hyde commented on HIVE-5873:
---

This is fixed in Calcite as part of CALCITE-365. It's worth re-evaluating when 
Hive picks up Calcite 1.12.

By the way, it's ambiguous in [~rhbutani]'s description above, but the query 
should return two rows (yes, including PNum=8). I confirmed on Postgres:

{noformat}
> create table Part (PNum int, OrderOnHand int);
CREATE TABLE
> insert into Part values (3,6),(10,1),(8,0);
INSERT 0 3
> create table Supply (PNum int, Qty int);
CREATE TABLE
> insert into Supply values (3,4),(3,2),(10,1);
INSERT 0 3
> select pnum  
from Part p
where orderOnHand
 in (select count(*) from Supply s
  where s.pnum = p.pnum
 );
 pnum 
------
   10
    8
(2 rows)
{noformat}
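
The same check can be repeated without a Postgres instance. The sketch below uses Python's built-in `sqlite3` module, with SQLite standing in for Postgres (an assumption; it is not Hive). It also runs a naive join-based rewrite, written here only for illustration, to show how the PNum=8 row gets lost when the subquery is turned into a join against a grouped aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Part (PNum int, OrderOnHand int);
    insert into Part values (3,6),(10,1),(8,0);
    create table Supply (PNum int, Qty int);
    insert into Supply values (3,4),(3,2),(10,1);
""")

# Correlated IN subquery: count(*) over zero matching rows is 0,
# so the part with OrderOnHand = 0 and no supply rows qualifies.
correct = [row[0] for row in conn.execute("""
    select pnum
    from Part p
    where orderOnHand in (select count(*) from Supply s
                          where s.pnum = p.pnum)
    order by pnum
""")]

# Naive rewrite (illustrative only): joining against a grouped aggregate
# drops parts that have no rows in Supply at all.
naive = [row[0] for row in conn.execute("""
    select p.pnum
    from Part p
    join (select pnum, count(*) as c from Supply group by pnum) sq
      on p.pnum = sq.pnum and p.orderOnHand = sq.c
    order by p.pnum
""")]

print(correct)
print(naive)
```

The first query returns both 8 and 10; the naive rewrite returns only 10, which is the "count bug" the issue describes.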

> SubQuery: In subquery Count Bug
> ---
>
> Key: HIVE-5873
> URL: https://issues.apache.org/jira/browse/HIVE-5873
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Harish Butani
>
> This is from the Optimization of Nested SQL Queries Revisited paper: 
> http://dl.acm.org/citation.cfm?id=38723
> Consider Part table having:
> {noformat}
> PNum  OrderOnHand
> ----  -----------
> 3     6
> 10    1
> 8     0
> {noformat}
> Supply table having:
> {noformat}
> PNum  Qty
> 3     4
> 3     2
> 10    1
> {noformat}
> The query:
> {noformat}
> select pnum
> from parts p
> where orderOnHand
>  in (select count(*) from supply s
>   where s.pnum = p.pnum
>  )
> {noformat}
> should return the row with PNum=8.
> But a transformation to a semi-join would eliminate this row, as there are no 
> rows in supply table with PNum=8.
> As shown in the paper, the solution is to transform to:
> {noformat}
> select pnum
> from parts p semijoin
> (select p1.pnum, count(*) as c
>   from (select distinct pnum from parts) p1 join supply s
>   where s.pnum = p1.pnum
>  ) sq on p.pnum = sq.pnum and p.orderOnHand = sq.c
> {noformat}
> The additional distinct query within the SubQuery is to handle duplicates in 
> the outer query on the joining columns.




[jira] [Commented] (HIVE-15708) Upgrade calcite version to 1.11

2017-01-27 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-15708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843800#comment-15843800
 ] 

Julian Hyde commented on HIVE-15708:


[~elserj], Do you agree or disagree that a factor here was that 
ConnectionConfig depends on types that are not simple -- specifically 
Service.Factory and AvaticaHttpClientFactory? I think the config API should 
have minimal dependencies, because everyone depends on it, and we don't want to 
drag optional stuff like protobuf and kerberos into the whole of Avatica. (Mea 
culpa, I started it, by adding {{Service.Factory factory()}} a long time ago.) 
But in Calcite's sub-interface, CalciteConnectionConfig, for similar config 
methods, I use things like {{<T> T factory(Class<T> factoryClass)}} and let the 
client specify the particular class.

> Upgrade calcite version to 1.11
> ---
>
> Key: HIVE-15708
> URL: https://issues.apache.org/jira/browse/HIVE-15708
> Project: Hive
>  Issue Type: Task
>  Components: CBO, Logical Optimizer
>Affects Versions: 2.2.0
>Reporter: Ashutosh Chauhan
>Assignee: Remus Rusanu
> Attachments: HIVE-15708.01.patch, HIVE-15708.02.patch, 
> HIVE-15708.03.patch, HIVE-15708.04.patch
>
>
> Currently we are on Calcite 1.10. We need to upgrade to 1.11.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HIVE-11165) Calcite planner might have a thread-safety issue compiling in parallel

2016-05-03 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15269751#comment-15269751
 ] 

Julian Hyde commented on HIVE-11165:


I don't have an update. It's not obviously a thread-safety issue; the graph 
which is blowing up in that call stack is not shared between threads. More 
likely, the planner is firing rules over and over again until the graph of 
RelNodes gets really large. Thread-safety is one of several possible causes of 
that.

> Calcite planner might have a thread-safety issue compiling in parallel
> --
>
> Key: HIVE-11165
> URL: https://issues.apache.org/jira/browse/HIVE-11165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 2.0.0
>Reporter: Gopal V
>Assignee: Jesus Camacho Rodriguez
> Attachments: RunJar-2015-06-30.snapshot
>
>
> After about 6 minutes trying to plan a query, the HiveServer2 was killed to 
> restore functionality to a test run.
> The HEP planner is stuck on a TopologicalOrder traversal and there were no 
> queries being fed into the HiveServer2 after it got stuck.
> TPC-DS query13 was the query in question, at 4 way parallel, which triggered 
> the issue.




[jira] [Commented] (HIVE-12839) Upgrade Hive to Calcite 1.6

2016-02-05 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133973#comment-15133973
 ] 

Julian Hyde commented on HIVE-12839:


It's not essential, but I recommend storing RelMetadataQuery.instance() in a 
variable if you are going to use it more than once in a method or rule 
invocation. We plan (as part of CALCITE-604) to cache results in the 
RelMetadataQuery instance, so if you keep the same instance you'll have to do 
less work. You should *definitely* use the same instance when one metadata 
method calls another.

Of course we'd like to cache results BETWEEN calls but I don't know how much we 
can safely cache when the graph is mutating.

Also, very minor, but I have a convention of naming RelMetadataQuery variables 
{{mq}}. They crop up a lot, so you don't want a long name, and a consistent 
short name helps you find one if it is around in the same method.
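
The benefit of keeping one instance can be sketched in a language-neutral way. The Python below is not Calcite's API: `MetadataQuery`, `row_count`, and `selectivity` are hypothetical stand-ins, and the per-instance cache is an assumption modelled on the plan described above.

```python
class MetadataQuery:
    """Hypothetical stand-in for a metadata query object with a per-instance cache."""

    def __init__(self):
        self._cache = {}
        self.computations = 0  # counts how often the "expensive" work really runs

    def row_count(self, node):
        key = ("row_count", node)
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = float(len(node))  # pretend this is expensive
        return self._cache[key]

    def selectivity(self, node):
        # One metadata method calling another goes through the same instance,
        # so it can reuse whatever row_count has already cached.
        return 1.0 / max(self.row_count(node), 1.0)


node = ("a", "b", "c", "d")  # any hashable value works as a fake plan node

# Same instance (the "mq" convention): row_count is computed once.
mq = MetadataQuery()
mq.row_count(node)
mq.selectivity(node)
shared_work = mq.computations

# A fresh instance per call: nothing is shared, so the work is repeated.
fresh_work = 0
for call in ("row_count", "selectivity"):
    q = MetadataQuery()
    getattr(q, call)(node)
    fresh_work += q.computations

print(shared_work, fresh_work)
```

With a shared instance the underlying computation runs once; with a new instance per call it runs once per call.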

> Upgrade Hive to Calcite 1.6
> ---
>
> Key: HIVE-12839
> URL: https://issues.apache.org/jira/browse/HIVE-12839
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12839.01.patch, HIVE-12839.02.patch, 
> HIVE-12839.03.patch, HIVE-12839.04.patch
>
>
> CLEAR LIBRARY CACHE
> Upgrade Hive to Calcite 1.6.0-incubating.




[jira] [Commented] (HIVE-12839) Upgrade Hive to Calcite 1.6

2016-02-02 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15128815#comment-15128815
 ] 

Julian Hyde commented on HIVE-12839:


[~pxiong], I was pretty sure I didn't add anything breaking after you validated 
the 1.6 RC, but it looks as if I screwed up. Sorry about that. I think you can 
add a HiveValuesFactory to the context used to create a RelBuilder, make it do 
something different for 0 rows (e.g. create a Sort(limit 0) or a null scan) and 
that will solve the problem without needing any changes to Calcite.

As an aside, if you could change Calcite, what would you want 
{{RelBuilder.empty}} to do? I presume it is useful to convert {{LIMIT 0}} and 
{{WHERE FALSE}} to an empty relation, so how would you propose to represent it 
in Hive?

> Upgrade Hive to Calcite 1.6
> ---
>
> Key: HIVE-12839
> URL: https://issues.apache.org/jira/browse/HIVE-12839
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-12839.01.patch, HIVE-12839.02.patch, 
> HIVE-12839.03.patch
>
>
> CLEAR LIBRARY CACHE
> Upgrade Hive to Calcite 1.6.0-incubating.




[jira] [Commented] (HIVE-12640) Allow StatsOptimizer to optimize the query for Constant GroupBy keys

2015-12-10 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-12640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15051586#comment-15051586
 ] 

Julian Hyde commented on HIVE-12640:


If {{src}} is empty, according to the SQL standard, should {code} select 
count('1') from src group by '1'{code} and {code} select count('1') from 
src{code} return the same result? My understanding is that the first should 
return 0 rows (an empty input produces no groups) and the second 1 row (an 
aggregate without GROUP BY always returns exactly one row).
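
The difference is easy to observe on an empty table. This is a minimal sketch using Python's built-in `sqlite3` module, with SQLite standing in for Hive (an assumption; Hive's own behavior is what the issue is about). The column layout of `src` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the table is left empty on purpose.
conn.execute("create table src (key text, value text)")

# GROUP BY over an empty input produces no groups, hence no rows.
grouped = conn.execute(
    "select count('1') from src group by '1'").fetchall()

# An aggregate without GROUP BY returns exactly one row, here a count of 0.
ungrouped = conn.execute(
    "select count('1') from src").fetchall()

print(grouped)
print(ungrouped)
```

On SQLite the grouped query yields an empty result while the ungrouped query yields a single row containing 0, so the two queries are only interchangeable when the table is known to be non-empty.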

> Allow StatsOptimizer to optimize the query for Constant GroupBy keys 
> -
>
> Key: HIVE-12640
> URL: https://issues.apache.org/jira/browse/HIVE-12640
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-12640.1.patch
>
>
> {code}
> hive> select count('1') from src group by '1';
> {code}
> In the above query, while performing StatsOptimizer optimization we can 
> safely ignore the group by on the constant key '1' since the above query will 
> return the same result as "select count('1') from src".




[jira] [Commented] (HIVE-11918) Implement/Enable constant related optimization rules in Calcite

2015-09-30 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939053#comment-14939053
 ] 

Julian Hyde commented on HIVE-11918:


Slight correction. If upgrading is hard, it might be possible to create a 
copy-paste rule that goes around RexBuilder.

> Implement/Enable constant related optimization rules in Calcite
> ---
>
> Key: HIVE-11918
> URL: https://issues.apache.org/jira/browse/HIVE-11918
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now, the Hive optimizer (Calcite) lacks constant-related 
> optimization rules, for example constant folding, constant propagation, and 
> constant transitive rules. Although Hive later provides those rules in the 
> logical optimizer, we would like to implement those inside Calcite. This will 
> benefit the current optimization as well as the optimization based on return 
> path that we are planning to use in the future. This JIRA is the umbrella 
> JIRA to implement/enable those rules.




[jira] [Commented] (HIVE-11918) Implement/Enable constant related optimization rules in Calcite

2015-09-29 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14935624#comment-14935624
 ] 

Julian Hyde commented on HIVE-11918:


[~pxiong], I have a tentative fix for CALCITE-902 in 
https://github.com/julianhyde/incubator-calcite/commit/9adf259763a9c52a6db8d4a19425722fbedcaa6c.
 See whether this fix improves things in Hive.

> Implement/Enable constant related optimization rules in Calcite
> ---
>
> Key: HIVE-11918
> URL: https://issues.apache.org/jira/browse/HIVE-11918
> Project: Hive
>  Issue Type: Improvement
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>
> Right now, the Hive optimizer (Calcite) lacks constant-related 
> optimization rules, for example constant folding, constant propagation, and 
> constant transitive rules. Although Hive later provides those rules in the 
> logical optimizer, we would like to implement those inside Calcite. This will 
> benefit the current optimization as well as the optimization based on return 
> path that we are planning to use in the future. This JIRA is the umbrella 
> JIRA to implement/enable those rules.




[jira] [Updated] (HIVE-11383) Upgrade Hive to Calcite 1.4

2015-07-27 Thread Julian Hyde (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated HIVE-11383:
---
Attachment: HIVE-11383.1.patch

> Upgrade Hive to Calcite 1.4
> ---
>
> Key: HIVE-11383
> URL: https://issues.apache.org/jira/browse/HIVE-11383
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11383.1.patch
>
>
> Upgrade Hive to Calcite 1.4.0-incubating.
> There is currently a snapshot release, which is close to what will be in 1.4. 
> I have checked that Hive compiles against the new snapshot, fixing one issue. 
> The patch is attached.
> Next step is to validate that Hive runs against the new Calcite, and post any 
> issues to the Calcite list or log Calcite Jira cases. [~jcamachorodriguez], 
> can you please do that.
> [~pxiong], I gather you are dependent on CALCITE-814, which will be fixed in 
> the new Calcite version.




[jira] [Commented] (HIVE-11383) Upgrade Hive to Calcite 1.4

2015-07-27 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643270#comment-14643270
 ] 

Julian Hyde commented on HIVE-11383:


Agreed. We will not check in the patch as is. It was just a way of me sharing 
work with [~jcamachorodriguez].

> Upgrade Hive to Calcite 1.4
> ---
>
> Key: HIVE-11383
> URL: https://issues.apache.org/jira/browse/HIVE-11383
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11383.1.patch
>
>
> Upgrade Hive to Calcite 1.4.0-incubating.
> There is currently a snapshot release, which is close to what will be in 1.4. 
> I have checked that Hive compiles against the new snapshot, fixing one issue. 
> The patch is attached.
> Next step is to validate that Hive runs against the new Calcite, and post any 
> issues to the Calcite list or log Calcite Jira cases. [~jcamachorodriguez], 
> can you please do that.
> [~pxiong], I gather you are dependent on CALCITE-814, which will be fixed in 
> the new Calcite version.




[jira] [Commented] (HIVE-11383) Upgrade Hive to Calcite 1.4

2015-07-27 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643235#comment-14643235
 ] 

Julian Hyde commented on HIVE-11383:


By the way, the issue that needed to be fixed was that the signature of 
createSort(RelTraitSet traits, RelNode input, RelCollation collation, RexNode 
offset, RexNode fetch) changed to createSort(RelNode input, RelCollation 
collation, RexNode offset, RexNode fetch). The old method still exists, 
deprecated, but HiveSortFactory needs to implement both.

> Upgrade Hive to Calcite 1.4
> ---
>
> Key: HIVE-11383
> URL: https://issues.apache.org/jira/browse/HIVE-11383
> Project: Hive
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11383.1.patch
>
>
> Upgrade Hive to Calcite 1.4.0-incubating.
> There is currently a snapshot release, which is close to what will be in 1.4. 
> I have checked that Hive compiles against the new snapshot, fixing one issue. 
> The patch is attached.
> Next step is to validate that Hive runs against the new Calcite, and post any 
> issues to the Calcite list or log Calcite Jira cases. [~jcamachorodriguez], 
> can you please do that.
> [~pxiong], I gather you are dependent on CALCITE-814, which will be fixed in 
> the new Calcite version.




[jira] [Commented] (HIVE-10636) CASE comparison operator rotation optimization

2015-05-18 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549594#comment-14549594
 ] 

Julian Hyde commented on HIVE-10636:


If you wrap every WHERE clause predicate (and other predicates such as HAVING) 
in ... IS TRUE you don't need to pass context.
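
The reasoning is that a WHERE or HAVING clause keeps a row only when its predicate is TRUE, so wrapping the predicate in IS TRUE changes nothing about which rows survive, while making the expression itself two-valued. A small sketch in plain Python, with `None` standing in for SQL NULL/UNKNOWN; the helper names `sql_gt` and `is_true` are made up for illustration.

```python
def sql_gt(a, b):
    """SQL '>' under three-valued logic; None models NULL / UNKNOWN."""
    if a is None or b is None:
        return None
    return a > b


def is_true(p):
    """SQL 'p IS TRUE': never UNKNOWN, maps both FALSE and NULL to False."""
    return p is True


rows = [5, None, 1]

# A filter keeps a row only when the predicate evaluates to TRUE.
where_plain = [r for r in rows if sql_gt(r, 2) is True]

# Wrapping in IS TRUE selects exactly the same rows...
where_wrapped = [r for r in rows if is_true(sql_gt(r, 2))]

# ...but the wrapped predicate never yields UNKNOWN, so a rewrite rule
# does not need to be told that it is sitting in a filter position.
wrapped_values = [is_true(sql_gt(r, 2)) for r in rows]

print(where_plain, where_wrapped, wrapped_values)
```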

> CASE comparison operator rotation optimization
> --
>
> Key: HIVE-10636
> URL: https://issues.apache.org/jira/browse/HIVE-10636
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 1.2.0
>
> Attachments: HIVE-10636.1.patch, HIVE-10636.2.patch, 
> HIVE-10636.3.patch, HIVE-10636.patch
>
>
> Step 1 as outlined in description of HIVE-9644




[jira] [Commented] (HIVE-10332) Use SortExchange rather than LogicalExchange for HiveOpConverter

2015-04-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-10332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14495820#comment-14495820
 ] 

Julian Hyde commented on HIVE-10332:


No reason to remove LogicalExchange from Calcite, even if Hive doesn't use it.

We could create a LogicalSortExchange in Calcite if Hive would find it useful. 
(Unlike HiveSortExchange, it would have logical convention.)

> Use SortExchange rather than LogicalExchange for HiveOpConverter
> ---
>
> Key: HIVE-10332
> URL: https://issues.apache.org/jira/browse/HIVE-10332
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: cbo-branch
>
> Attachments: HIVE-10332.01.patch
>
>
> Right now HiveSortExchange extends SortExchange extends Exchange. 
> LogicalExchange extends Exchange. LogicalExchange is expected in 
> HiveOpConverter but HiveSortExchange is created. After discussion, we plan to 
> change LogicalExchange to HiveSortExchange.



