[jira] [Updated] (CALCITE-2204) Volcano Planner may not choose the cheapest cost plan

2018-03-08 Thread LeoWangLZ (JIRA)

 [ 
https://issues.apache.org/jira/browse/CALCITE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LeoWangLZ updated CALCITE-2204:
---
Summary: Volcano Planner may not choose the cheapest cost plan   (was: 
Volcano Planner may not choose the cheapest cost plan wrongly)

> Volcano Planner may not choose the cheapest cost plan 
> --
>
> Key: CALCITE-2204
> URL: https://issues.apache.org/jira/browse/CALCITE-2204
> Project: Calcite
>  Issue Type: Bug
>Reporter: LeoWangLZ
>Assignee: Julian Hyde
>Priority: Major
>
> Volcano Planner propagates a cost improvement to the parents of a relational 
> expression when one of its inputs finds a better plan. But it does not 
> propagate to all parents first; it propagates to one parent and then recurses 
> into that parent's parents. As a result, if one parent could in turn improve 
> another parent on the same level, yielding a lower cost than the one already 
> recorded there, the improvement is never propagated again, because that 
> parent's cost has already been calculated in this pass and is assumed not to 
> be lower than its own recorded cost.
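The ordering problem described above can be modeled with a toy cost DAG. This is an illustrative sketch only; the node names, cost functions, and propagation routines are invented, not Calcite's VolcanoPlanner internals:

```java
import java.util.*;

/**
 * Toy model of cost-improvement propagation over a plan DAG.  Subset S finds a
 * cheaper plan; parent A reads S; parent B can read either S directly or A.
 * All names and cost functions are invented for illustration.
 */
public class CostPropagation {

  // Recompute a node's cost from the current costs of its inputs.
  static double recompute(String node, Map<String, Double> cost) {
    switch (node) {
      case "A": return cost.get("S") + 1;                 // A reads S
      case "B": return Math.min(cost.get("S") + 5,        // B reads S directly...
                                cost.get("A") + 1);       // ...or reads A
      default: throw new IllegalArgumentException(node);
    }
  }

  // Parent edges: both A and B consume S, and B also consumes A.
  static final Map<String, List<String>> PARENTS = Map.of(
      "S", List.of("B", "A"),   // B happens to be visited before A
      "A", List.of("B"),
      "B", List.of());

  /** Depth-first, recompute-once propagation: B is costed while A is still
   *  stale, and is never revisited after A improves. */
  static void dfsOnce(String node, Map<String, Double> cost, Set<String> done) {
    for (String p : PARENTS.get(node)) {
      if (!done.add(p)) continue;            // already calculated -> skipped
      cost.put(p, recompute(p, cost));
      dfsOnce(p, cost, done);
    }
  }

  /** Fixpoint propagation: keep recomputing while any cost still improves. */
  static void fixpoint(Map<String, Double> cost) {
    boolean changed = true;
    while (changed) {
      changed = false;
      for (String n : List.of("A", "B")) {
        double c = recompute(n, cost);
        if (c < cost.get(n)) { cost.put(n, c); changed = true; }
      }
    }
  }

  /** Costs before the improvement: S = 10, so A = 11 and B = min(15, 12) = 12. */
  static Map<String, Double> initial() {
    return new HashMap<>(Map.of("S", 10.0, "A", 11.0, "B", 12.0));
  }
}
```

With S improved to 1, the depth-first pass leaves B at 6 (it was costed against A's stale cost of 11 and never revisited), while the fixpoint pass reaches the true best B = 3 via the improved A.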



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CALCITE-2204) Volcano Planner may not choose the cheapest cost plan wrongly

2018-03-08 Thread LeoWangLZ (JIRA)

 [ 
https://issues.apache.org/jira/browse/CALCITE-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LeoWangLZ updated CALCITE-2204:
---
Summary: Volcano Planner may not choose the cheapest cost plan wrongly  
(was: Volcano Planner may not choose the cheapest cost of plan wrongly)

> Volcano Planner may not choose the cheapest cost plan wrongly
> -
>
> Key: CALCITE-2204
> URL: https://issues.apache.org/jira/browse/CALCITE-2204
> Project: Calcite
>  Issue Type: Bug
>Reporter: LeoWangLZ
>Assignee: Julian Hyde
>Priority: Major
>
> Volcano Planner propagates a cost improvement to the parents of a relational 
> expression when one of its inputs finds a better plan. But it does not 
> propagate to all parents first; it propagates to one parent and then recurses 
> into that parent's parents. As a result, if one parent could in turn improve 
> another parent on the same level, yielding a lower cost than the one already 
> recorded there, the improvement is never propagated again, because that 
> parent's cost has already been calculated in this pass and is assumed not to 
> be lower than its own recorded cost.





[jira] [Comment Edited] (CALCITE-2202) Aggregate Join Push-down on a Single Side

2018-03-08 Thread Zhong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392289#comment-16392289
 ] 

Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:29 AM:
---

Everything is moot if I cannot prove my formula. But suppose it is correct –

I do think that COVAR_POP can be pushed down; it can be calculated from 
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through 
table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule 
(i.e. single-sided) can be opt-in for cases where metadata is missing and 
stats show that the grouping columns are unique or nearly unique.
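The claim that COVAR_POP decomposes into splittable sums can be checked numerically: COVAR_POP(x, y) = SUM(x*y)/n - SUM(x)/n * SUM(y)/n, and each of those sums merges across partitions by simple addition. The helper names below are invented for illustration; this is not a Calcite API:

```java
/**
 * Numeric check that COVAR_POP decomposes into splittable sums:
 * COVAR_POP(x, y) = SUM(x*y)/n - SUM(x)/n * SUM(y)/n.
 */
public class CovarPop {

  /** Partial aggregate {SUM(x*y), SUM(x), SUM(y), COUNT} over rows of {x, y}. */
  static double[] partial(double[][] rows) {
    double[] p = new double[4];
    for (double[] r : rows) {
      p[0] += r[0] * r[1]; p[1] += r[0]; p[2] += r[1]; p[3] += 1;
    }
    return p;
  }

  /** Merging partials is component-wise addition, so the aggregate splits
   *  across a UNION ALL of partitions. */
  static double[] merge(double[] a, double[] b) {
    return new double[] {a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
  }

  /** COVAR_POP reassembled from the partial sums. */
  static double covarPop(double[] p) {
    return p[0] / p[3] - (p[1] / p[3]) * (p[2] / p[3]);
  }
}
```

Computing the partials on two halves of a data set and merging them gives the same COVAR_POP as a single pass over all rows, which is what makes the push-down possible.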


was (Author: zhong.j.yu):
Everything is moot if I cannot prove my formula. But suppose it is correct –

I do think that COVAR_POP can be pushed down; it can be calculated from 
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through 
table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule 
(i.e. single-sided) can be opted into in cases where metadata is missing, or 
the grouping columns are nearly unique.

> Aggregate Join Push-down on a Single Side
> -
>
> Key: CALCITE-2202
> URL: https://issues.apache.org/jira/browse/CALCITE-2202
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: next
>Reporter: Zhong Yu
>Assignee: Julian Hyde
>Priority: Major
> Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it 
> became apparent that aggregation can be pushed down on a single side (either 
> side), leaving the other side non-aggregated, regardless of whether the 
> grouping columns are unique on the other side. My analysis – 
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when the metadata is insufficient; in any case, we may try 
> to provide all 3 possible transformations (aggregate on left only; right 
> only; both sides) to the cost-based optimizer, so that the cheapest one can 
> be chosen based on stats.
> Does this make any sense, anybody? If it sounds good, I'll implement it and 
> offer a PR.
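The "aggregate on left only" transformation can be illustrated on a toy join. The helper names below are invented, and this is not a Calcite rule: a real planner rule must also split the aggregate functions themselves, as the linked analysis discusses. The key shape is that the pushed-down variant keeps a re-aggregate on top of the join:

```java
import java.util.*;

/**
 * SELECT k, SUM(v) FROM L JOIN R USING (k) GROUP BY k -- computed directly,
 * and with the aggregate pushed below the join on the left side only.
 */
public class AggJoinPushDown {

  /** Join, then aggregate: the reference result. */
  static Map<Integer, Double> direct(List<int[]> left, List<Integer> rightKeys) {
    Map<Integer, Double> out = new TreeMap<>();
    for (int[] l : left)
      for (int rk : rightKeys)
        if (l[0] == rk) out.merge(l[0], (double) l[1], Double::sum);
    return out;
  }

  /** Pre-aggregate L per key, join the (now single) row per key against the
   *  non-aggregated R, then aggregate again on top.  The top aggregate
   *  restores the multiplicity contributed by duplicate keys in R. */
  static Map<Integer, Double> pushedLeft(List<int[]> left, List<Integer> rightKeys) {
    Map<Integer, Double> pre = new TreeMap<>();
    for (int[] l : left) pre.merge(l[0], (double) l[1], Double::sum);
    Map<Integer, Double> out = new TreeMap<>();
    for (Map.Entry<Integer, Double> e : pre.entrySet())
      for (int rk : rightKeys)
        if (e.getKey() == rk) out.merge(e.getKey(), e.getValue(), Double::sum);
    return out;
  }
}
```

Both plans produce the same grouped sums even when the right side has duplicate join keys, which is why the grouping columns need not be unique on the non-aggregated side.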





[jira] [Comment Edited] (CALCITE-2202) Aggregate Join Push-down on a Single Side

2018-03-08 Thread Zhong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392289#comment-16392289
 ] 

Zhong Yu edited comment on CALCITE-2202 at 3/9/18 2:27 AM:
---

Everything is moot if I cannot prove my formula. But suppose it is correct –

I do think that COVAR_POP can be pushed down; it can be calculated from 
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through 
table union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule 
(i.e. single-sided) can be opted into in cases where metadata is missing, or 
the grouping columns are nearly unique.


was (Author: zhong.j.yu):
Everything is moot if I cannot prove my formula. But suppose it is correct --

I do think that COVAR_POP can be pushed down; it can be calculated from 
SUM(x*y), SUM(x), SUM(y), COUNT(x,y), all of which can be split through table 
union and cross product, and can therefore be pushed down over a join.

Producing more candidate plans may be bad for the CBO; but the extra rule 
(i.e. single-sided) can be opted into in cases where metadata is missing, or 
the grouping columns are nearly unique.

> Aggregate Join Push-down on a Single Side
> -
>
> Key: CALCITE-2202
> URL: https://issues.apache.org/jira/browse/CALCITE-2202
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: next
>Reporter: Zhong Yu
>Assignee: Julian Hyde
>Priority: Major
> Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it 
> became apparent that aggregation can be pushed down on a single side (either 
> side), leaving the other side non-aggregated, regardless of whether the 
> grouping columns are unique on the other side. My analysis – 
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when the metadata is insufficient; in any case, we may try 
> to provide all 3 possible transformations (aggregate on left only; right 
> only; both sides) to the cost-based optimizer, so that the cheapest one can 
> be chosen based on stats.
> Does this make any sense, anybody? If it sounds good, I'll implement it and 
> offer a PR.





[jira] [Commented] (CALCITE-1806) UnsupportedOperationException accessing Druid through Spark JDBC

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392285#comment-16392285
 ] 

ASF GitHub Bot commented on CALCITE-1806:
-

Github user risdenk commented on the issue:

https://github.com/apache/calcite-avatica/pull/28
  
```
Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 7.373 sec 
<<< FAILURE! - in org.apache.calcite.avatica.remote.SparkClientTest
testSpark[JSON](org.apache.calcite.avatica.remote.SparkClientTest)  Time 
elapsed: 6.864 sec  <<< FAILURE!
java.lang.AssertionError: expected:<1> but was:<0>
at 
org.apache.calcite.avatica.remote.SparkClientTest.testSpark(SparkClientTest.java:108)
testSpark[PROTOBUF](org.apache.calcite.avatica.remote.SparkClientTest)  
Time elapsed: 0.48 sec  <<< FAILURE!
java.lang.AssertionError: expected:<1> but was:<0>
at 
org.apache.calcite.avatica.remote.SparkClientTest.testSpark(SparkClientTest.java:108)
```


> UnsupportedOperationException accessing Druid through Spark JDBC
> 
>
> Key: CALCITE-1806
> URL: https://issues.apache.org/jira/browse/CALCITE-1806
> Project: Calcite
>  Issue Type: Bug
>  Components: avatica
>Affects Versions: avatica-1.9.0
> Environment: Spark 1.6, scala 10, CDH 5.7
>Reporter: Benjamin Vogan
>Priority: Major
>
> I am interested in querying Druid via Spark.  I realize that there are 
> several mechanisms for doing this, but I was curious about using the JDBC 
> batch offered by the latest release as it is familiar to our analysts and 
> seems like it should be a well supported path going forward.
> My first attempt has failed with an UnsupportedOperationException.  I ran 
> spark-shell with the --jars option to add the avatica 1.9.0 jdbc driver jar.
> {noformat}
> scala> val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> 
> "jdbc:avatica:remote:url=http://jarvis-druid-query002:8082/druid/v2/sql/avatica/",
>  "dbtable" -> "sor_business_events_all", "driver" -> 
> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"1")).load()
> java.lang.UnsupportedOperationException
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:275)
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:121)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:122)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
>   at 
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>   at $iwC$$iwC$$iwC.<init>(<console>:38)
>   at $iwC$$iwC.<init>(<console>:40)
>   at $iwC.<init>(<console>:42)
>   at <init>(<console>:44)
>   at .<init>(<console>:48)
>   at .<clinit>()
>   at .<init>(<console>:7)
>   at .<clinit>()
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
>   at 
> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
>   at 
> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
>   at 
> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>   at 
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>   at 
> 

[jira] [Commented] (CALCITE-2207) Enforce Java version via maven-enforcer-plugin

2018-03-08 Thread Kevin Risden (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392188#comment-16392188
 ] 

Kevin Risden commented on CALCITE-2207:
---

Should remove openjdk7 from .travis.yml as well. 

> Enforce Java version via maven-enforcer-plugin
> --
>
> Key: CALCITE-2207
> URL: https://issues.apache.org/jira/browse/CALCITE-2207
> Project: Calcite
>  Issue Type: Task
>  Components: avatica
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: avatica-1.12.0
>
>
> Now that jdk7 support has been dropped, we should add some logic to the build 
> so that it fails clearly when an unsupported version of Java is used.
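A typical maven-enforcer-plugin configuration for this looks like the sketch below. The Java version range is an assumption for illustration, not taken from the project's actual pom:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-enforcer-plugin</artifactId>
  <executions>
    <execution>
      <id>enforce-java</id>
      <goals>
        <goal>enforce</goal>
      </goals>
      <configuration>
        <rules>
          <!-- Fail the build on any JDK older than 1.8 -->
          <requireJavaVersion>
            <version>[1.8,)</version>
          </requireJavaVersion>
        </rules>
      </configuration>
    </execution>
  </executions>
</plugin>
```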





[jira] [Commented] (CALCITE-1806) UnsupportedOperationException accessing Druid through Spark JDBC

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392187#comment-16392187
 ] 

ASF GitHub Bot commented on CALCITE-1806:
-

Github user risdenk commented on the issue:

https://github.com/apache/calcite-avatica/pull/28
  
I'm expecting the build to fail with 0 rows being returned when there is 
actually a row there. Debug logging is turned on for the SparkSession to try to 
get some more information for this. 


> UnsupportedOperationException accessing Druid through Spark JDBC
> 
>
> Key: CALCITE-1806
> URL: https://issues.apache.org/jira/browse/CALCITE-1806
> Project: Calcite
>  Issue Type: Bug
>  Components: avatica
>Affects Versions: avatica-1.9.0
> Environment: Spark 1.6, scala 10, CDH 5.7
>Reporter: Benjamin Vogan
>Priority: Major
>
> I am interested in querying Druid via Spark.  I realize that there are 
> several mechanisms for doing this, but I was curious about using the JDBC 
> batch offered by the latest release as it is familiar to our analysts and 
> seems like it should be a well supported path going forward.
> My first attempt has failed with an UnsupportedOperationException.  I ran 
> spark-shell with the --jars option to add the avatica 1.9.0 jdbc driver jar.
> {noformat}
> scala> val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> 
> "jdbc:avatica:remote:url=http://jarvis-druid-query002:8082/druid/v2/sql/avatica/",
>  "dbtable" -> "sor_business_events_all", "driver" -> 
> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"1")).load()
> java.lang.UnsupportedOperationException
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:275)
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:121)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:122)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
>   at 
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>   at $iwC$$iwC$$iwC.<init>(<console>:38)
>   at $iwC$$iwC.<init>(<console>:40)
>   at $iwC.<init>(<console>:42)
>   at <init>(<console>:44)
>   at .<init>(<console>:48)
>   at .<clinit>()
>   at .<init>(<console>:7)
>   at .<clinit>()
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
>   at 
> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
>   at 
> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
>   at 
> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>   at 
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>   at 
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1064)
>   at org.apache.spark.repl.Main$.main(Main.scala:31)
>   at 

[jira] [Commented] (CALCITE-1806) UnsupportedOperationException accessing Druid through Spark JDBC

2018-03-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392184#comment-16392184
 ] 

ASF GitHub Bot commented on CALCITE-1806:
-

GitHub user risdenk opened a pull request:

https://github.com/apache/calcite-avatica/pull/28

[WIP][CALCITE-1806] Add Apache Spark JDBC test to Avatica server

This branch is a work in progress to show that Apache Spark and Avatica 
don't seem to play nicely together. Spark JDBC against Avatica 
returns an empty result even though it determines the correct schema.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/risdenk/calcite-avatica CALCITE-1806

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/calcite-avatica/pull/28.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #28


commit 311ed286cdf2807e8daf84b52b5956a4b54d0a74
Author: Kevin Risden 
Date:   2018-03-09T00:41:54Z

[CALCITE-1806] Add Apache Spark JDBC test to Avatica server




> UnsupportedOperationException accessing Druid through Spark JDBC
> 
>
> Key: CALCITE-1806
> URL: https://issues.apache.org/jira/browse/CALCITE-1806
> Project: Calcite
>  Issue Type: Bug
>  Components: avatica
>Affects Versions: avatica-1.9.0
> Environment: Spark 1.6, scala 10, CDH 5.7
>Reporter: Benjamin Vogan
>Priority: Major
>
> I am interested in querying Druid via Spark.  I realize that there are 
> several mechanisms for doing this, but I was curious about using the JDBC 
> batch offered by the latest release as it is familiar to our analysts and 
> seems like it should be a well supported path going forward.
> My first attempt has failed with an UnsupportedOperationException.  I ran 
> spark-shell with the --jars option to add the avatica 1.9.0 jdbc driver jar.
> {noformat}
> scala> val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> 
> "jdbc:avatica:remote:url=http://jarvis-druid-query002:8082/druid/v2/sql/avatica/",
>  "dbtable" -> "sor_business_events_all", "driver" -> 
> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"1")).load()
> java.lang.UnsupportedOperationException
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:275)
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:121)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:122)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
>   at 
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>   at $iwC$$iwC$$iwC.<init>(<console>:38)
>   at $iwC$$iwC.<init>(<console>:40)
>   at $iwC.<init>(<console>:42)
>   at <init>(<console>:44)
>   at .<init>(<console>:48)
>   at .<clinit>()
>   at .<init>(<console>:7)
>   at .<clinit>()
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
>   at 
> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
>   at 
> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
>   at 
> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>   at 
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>   at 
> 

[jira] [Commented] (CALCITE-1806) UnsupportedOperationException accessing Druid through Spark JDBC

2018-03-08 Thread Kevin Risden (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392157#comment-16392157
 ] 

Kevin Risden commented on CALCITE-1806:
---

So I have good news and bad news on this. I can't reproduce the error from the 
original reporter. However, no data seems to be returned even though there is 
no error. I'm working on a test case now to show this.

> UnsupportedOperationException accessing Druid through Spark JDBC
> 
>
> Key: CALCITE-1806
> URL: https://issues.apache.org/jira/browse/CALCITE-1806
> Project: Calcite
>  Issue Type: Bug
>  Components: avatica
>Affects Versions: avatica-1.9.0
> Environment: Spark 1.6, scala 10, CDH 5.7
>Reporter: Benjamin Vogan
>Priority: Major
>
> I am interested in querying Druid via Spark.  I realize that there are 
> several mechanisms for doing this, but I was curious about using the JDBC 
> batch offered by the latest release as it is familiar to our analysts and 
> seems like it should be a well supported path going forward.
> My first attempt has failed with an UnsupportedOperationException.  I ran 
> spark-shell with the --jars option to add the avatica 1.9.0 jdbc driver jar.
> {noformat}
> scala> val dw2 = sqlContext.read.format("jdbc").options(Map("url" -> 
> "jdbc:avatica:remote:url=http://jarvis-druid-query002:8082/druid/v2/sql/avatica/",
>  "dbtable" -> "sor_business_events_all", "driver" -> 
> "org.apache.calcite.avatica.remote.Driver", "fetchSize"->"1")).load()
> java.lang.UnsupportedOperationException
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:275)
>   at 
> org.apache.calcite.avatica.AvaticaConnection.prepareStatement(AvaticaConnection.java:121)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:122)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
>   at 
> org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
>   at 
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:30)
>   at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:32)
>   at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:34)
>   at $iwC$$iwC$$iwC$$iwC.<init>(<console>:36)
>   at $iwC$$iwC$$iwC.<init>(<console>:38)
>   at $iwC$$iwC.<init>(<console>:40)
>   at $iwC.<init>(<console>:42)
>   at <init>(<console>:44)
>   at .<init>(<console>:48)
>   at .<clinit>()
>   at .<init>(<console>:7)
>   at .<clinit>()
>   at $print(<console>)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1045)
>   at 
> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1326)
>   at 
> org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:821)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:852)
>   at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:800)
>   at 
> org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
>   at 
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
>   at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
>   at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
>   at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>   at 
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
>   at 
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>   at 
> org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
>   at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1064)
>   at org.apache.spark.repl.Main$.main(Main.scala:31)
>   at org.apache.spark.repl.Main.main(Main.scala)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   

[jira] [Comment Edited] (CALCITE-2194) Ability to hide a schema

2018-03-08 Thread Piotr Bojko (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392092#comment-16392092
 ] 

Piotr Bojko edited comment on CALCITE-2194 at 3/8/18 11:33 PM:
---

Thanks for the quick feedback.
* reformatting is my bad, will try to fix
* periods - to be fixed
* RexImpTable - it was hocus pocus to me, but suddenly the file just blew up 
with errors. I totally forgot to look at it after the PoC was done. My bad - I 
will investigate this further
* Full access instead of the hidden feature - you're the boss, win-win :)
* as for Fairy - I'm not attached to the names, I can rename it. A 
ThreadLocal-like implementation was the quickest way to carry the principal 
through almost all of the Calcite stack - Spring implements such features in 
the same ThreadLocal way. Any suggestion on how to propagate the principal and 
I will try to ship it.
* System.getProperty - the previous implementation of 
USER/CURRENT_USER/SESSION_USER used the "user.name" property. I've added a 
fallback to the previous behaviour in case an end user has not provided a user 
on the connection.
* the boolean indirect and INDIRECT_SELECT were added to implement the case I 
need. If you map a user to INDIRECT_SELECT on a schema, that user can use the 
schema only through views from another schema. If you map a user to SELECT on 
a schema (but not INDIRECT_SELECT), that user can use the schema only through 
direct selects, not through views from other schemas.
* .gitignore - will fix

What do you think about the change in general? Would you consider accepting 
the change into Calcite once I address your doubts? For me, only the 
"SqlAccessEnum.INDIRECT_SELECT" change is something more than an 
implementation detail, and it is your call whether to accept another access 
type or to redesign my change.
If the change is headed in the right direction in general, I would know how to 
design my project in which Calcite plays a role. Thanks in advance :)
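The SELECT vs. INDIRECT_SELECT split described above boils down to a check like the following toy sketch. This is illustrative only; the class and method names are invented, and Calcite's actual SqlAccessEnum and validateAccess differ:

```java
import java.util.*;

/** Toy model of the proposed access split: SELECT allows direct queries on a
 *  schema's objects, while INDIRECT_SELECT allows reaching them only through
 *  views defined in another schema. */
public class AccessCheck {
  enum Access { SELECT, INDIRECT_SELECT }

  /** viaView = the object is being reached through a view in another schema. */
  static boolean allowed(Set<Access> granted, boolean viaView) {
    return viaView ? granted.contains(Access.INDIRECT_SELECT)
                   : granted.contains(Access.SELECT);
  }
}
```

A user granted only INDIRECT_SELECT passes the check when coming through a view but fails on a direct select, and vice versa for a user granted only SELECT.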


was (Author: ptrbojko):
Thanks for the quick feedback.
* reformatting is my bad, will try to fix
* periods - to be fixed
* RexImpTable - it was hocus pocus to me, but suddenly the file just blew up 
with errors. I totally forgot to look at it after the PoC was done. My bad - I 
will investigate this further
* Full access instead of the hidden feature - you're the boss, win-win :)
* as for Fairy - I'm not attached to the names, I can rename it. A 
ThreadLocal-like implementation was the quickest way to carry the principal 
through almost all of the Calcite stack - Spring implements such features in 
the same ThreadLocal way. Any suggestion on how to propagate the principal and 
I will try to ship it.
* System.getProperty - the previous implementation of 
USER/CURRENT_USER/SESSION_USER used the "user.name" property. I've added a 
fallback to the previous behaviour in case an end user has not provided a user 
on the connection.
* the boolean indirect and INDIRECT_SELECT were added to implement the case I 
need. If you map a user to INDIRECT_SELECT on a schema, that user can use the 
schema only through views from another schema. If you map a user to SELECT on 
a schema (but not INDIRECT_SELECT), that user can use the schema only through 
direct selects, not through views from other schemas.
* .gitignore - will fix

What do you think about the change in general? Would you consider accepting 
the change into Calcite once I address your doubts? For me, only the 
"SqlAccessEnum.INDIRECT_SELECT" change is something more than an 
implementation detail, and it is your call whether to accept another access 
type or to redesign my change.

> Ability to hide a schema
> 
>
> Key: CALCITE-2194
> URL: https://issues.apache.org/jira/browse/CALCITE-2194
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Piotr Bojko
>Assignee: Piotr Bojko
>Priority: Minor
>
> See: 
> [https://mail-archives.apache.org/mod_mbox/calcite-dev/201711.mbox/ajax/%3C6F6E52D4-6860-4384-A1CB-A2301D05394D%40apache.org%3E]
> I've looked into the core, and the notion of a user would be hard to achieve 
> now.
> Though, I am able to implement the "hidden schema" feature through the 
> following changes:
>  # JsonSchema - add a holder for the feature, boolean flag or flags field 
> with enum (CACHED which now exists as a separate flag - some deprecation 
> could be needed, HIDDEN)
>  # CalciteSchema - pass through of a flag
>  # RelOptSchema - pass through of a flag
>  # CalciteCatalogReader - pass through of a flag
>  # Other derivatives of RelOptSchema - mocked value, false
>  # RelOptTable and impl - pass through of a flag
>  # SqlValidatorImpl - validation of whether an object from a hidden schema 
> is used (in the same places as validateAccess)
>  # ViewTableMacro.apply ->  Schemas.analyzeView -> 
> CalcitePrepareImpl.analyzeView -> 

[jira] [Commented] (CALCITE-2194) Ability to hide a schema

2018-03-08 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392110#comment-16392110
 ] 

Julian Hyde commented on CALCITE-2194:
--

At a very high level it looks good - the "AuthorizationGuard" and 
"CalcitePrincipal" classes are exactly what I would have hoped for. However, I 
can't comment further until I take the time to grok the implementation.

> Ability to hide a schema
> 
>
> Key: CALCITE-2194
> URL: https://issues.apache.org/jira/browse/CALCITE-2194
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Piotr Bojko
>Assignee: Piotr Bojko
>Priority: Minor
>
> See: 
> [https://mail-archives.apache.org/mod_mbox/calcite-dev/201711.mbox/ajax/%3C6F6E52D4-6860-4384-A1CB-A2301D05394D%40apache.org%3E]
> I've looked into the core, and the notion of a user would be hard to achieve 
> now. 
> However, I am able to implement the "hidden schema" feature through the 
> following changes:
>  # JsonSchema - add a holder for the feature: a boolean flag, or a flags 
> field with an enum (CACHED, which now exists as a separate flag - some 
> deprecation may be needed; HIDDEN)
>  # CalciteSchema - pass the flag through
>  # RelOptSchema - pass the flag through
>  # CalciteCatalogReader - pass the flag through
>  # Other derivatives of RelOptSchema - mocked value, false
>  # RelOptTable and its implementation - pass the flag through
>  # SqlValidatorImpl - validate whether an object from a hidden schema is 
> used (in the same places as validateAccess)
>  # ViewTableMacro.apply -> Schemas.analyzeView -> 
> CalcitePrepareImpl.analyzeView -> CalcitePrepareImpl.parse_ -> 
> CalcitePrepareImpl.CalcitePrepareImpl - this path of execution should build 
> a SqlValidatorImpl with the check from point 7 disabled.
> Such a feature could be useful for end users. 
> If the solution is OK, I can contribute it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2208) MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case sensitivity for end user

2018-03-08 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392095#comment-16392095
 ] 

Julian Hyde commented on CALCITE-2208:
--

The materialized view doesn't belong to the current connection, nor does it 
necessarily "belong to the schema". The current implementation is 
quick-and-dirty, but it shouldn't be replaced with a different quick-and-dirty 
one.

Materialized views are not exactly the same as views, but I'll remark that for 
a view, the right thing is to remember the environment where the view was 
created (e.g. the lexical convention in use, and the access control 
environment) and apply the same environment when the view is used (i.e. 
expanded in a query). Maybe for materialized views we need to capture the 
environment when they were created. MATERIALIZATION_CONNECTION should be a 
connection factory (perhaps pooling behind the scenes, perhaps not) that can 
create connections with a particular environment. Perhaps they're not even 
full-blown connections, just a context in which a query can be prepared.
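
A rough sketch of that idea - capturing the creation-time environment and 
replaying it whenever the view is prepared - might look like this (a toy 
Python model; names such as `Environment` and `ContextFactory` are 
illustrative, not Calcite's actual API):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Environment:
    """Settings captured when the (materialized) view is created."""
    lex: str              # lexical convention, e.g. "MYSQL" or "ORACLE"
    case_sensitive: bool  # identifier case sensitivity
    user: str             # access-control context

class ContextFactory:
    """Hands out preparation contexts that replay a captured environment,
    instead of one shared, hard-wired connection."""
    def __init__(self, env: Environment):
        self.env = env

    def connection_properties(self) -> dict:
        # Properties a real factory would apply when preparing the view SQL.
        return {
            "lex": self.env.lex,
            "caseSensitive": str(self.env.case_sensitive).lower(),
            "user": self.env.user,
        }

# Capture at creation time; replay at use time.
env = Environment(lex="MYSQL", case_sensitive=False, user="piotr")
factory = ContextFactory(env)
props = factory.connection_properties()
```

The point being that the view is always expanded under the environment in 
which it was defined, not under whatever the current connection happens to 
use.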

> MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case 
> sensitivity for end user
> -
>
> Key: CALCITE-2208
> URL: https://issues.apache.org/jira/browse/CALCITE-2208
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Piotr Bojko
>Assignee: Julian Hyde
>Priority: Major
>
> MaterializedViewTable.MATERIALIZATION_CONNECTION, which is used for 
> validating views, uses the ORACLE lex by default. Calcite expands the view 
> SQL to uppercase, so when the schemas used in the view SQL are declared in 
> lowercase, Calcite does not find the objects needed to resolve and validate 
> the view SQL.
> It does not work even when the end user creates a connection with 
> lex=oracle but uses uppercase for the names of its tables. 
> It would be best if MaterializedViewTable.MATERIALIZATION_CONNECTION were 
> replaced by the end user's connection, or by a dynamically created 
> connection with the lex passed from the end user's connection. 
> A quick and dirty solution is to create 
> MaterializedViewTable.MATERIALIZATION_CONNECTION with caseSensitive=false.





[jira] [Commented] (CALCITE-2194) Ability to hide a schema

2018-03-08 Thread Piotr Bojko (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392092#comment-16392092
 ] 

Piotr Bojko commented on CALCITE-2194:
--

Thanks for the quick feedback.
* the reformatting is my bad - I will try to fix it
* periods - to be fixed
* RexImpTable - it was hocus-pocus to me, but suddenly the file just blows up 
with errors. I totally forgot to look at it after the PoC was done. My bad - I 
will investigate this further
* full access control instead of the hidden feature - you're the boss, 
win-win :)
* as for Fairy - I'm not attached to the name and can rename it. A 
ThreadLocal-like implementation was the quickest way to carry the principal 
through almost all of the Calcite stack - Spring implements such features in 
the same thread-local way. Any suggestion on how to propagate the principal 
and I will try to ship it.
* System.getProperty - the previous implementation of 
USER/CURRENT_USER/SESSION_USER used the "user.name" property. I've added a 
fallback to the previous behaviour in case the end user has not provided a 
user on the connection.
* the "boolean indirect" parameter and INDIRECT_SELECT were provided to 
implement my use case. If you map a user to INDIRECT_SELECT on a schema, that 
user can use the schema only through views from another schema. If you map a 
user to SELECT on a schema (but not INDIRECT_SELECT), that user can use the 
schema only through direct selects, not through views from other schemas.
* .gitignore - will fix

What do you think about the change in general? Would you consider accepting 
it into Calcite once your doubts are addressed? To me, only the 
"SqlAccessEnum.INDIRECT_SELECT" change is more than an implementation detail, 
and it is your call whether to accept another access type or to look at 
redesigning my change.
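
The proposed SELECT vs. INDIRECT_SELECT semantics can be sketched as a toy 
access check (illustrative Python, not Calcite's actual SqlAccessEnum 
machinery; the users and grant table are made up):

```python
from enum import Enum, auto

class Access(Enum):
    SELECT = auto()           # direct selects against the schema
    INDIRECT_SELECT = auto()  # access only via views in another schema

# (user, schema) -> granted access types (illustrative data)
grants = {
    ("alice", "hr"): {Access.INDIRECT_SELECT},
    ("bob", "hr"): {Access.SELECT},
}

def may_select(user: str, schema: str, via_foreign_view: bool) -> bool:
    """INDIRECT_SELECT permits reads only through a view defined in another
    schema; SELECT permits only direct reads."""
    needed = Access.INDIRECT_SELECT if via_foreign_view else Access.SELECT
    return needed in grants.get((user, schema), set())
```

Under this model, alice can reach hr only through views from another schema, 
while bob can only query hr directly.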

> Ability to hide a schema
> 
>
> Key: CALCITE-2194
> URL: https://issues.apache.org/jira/browse/CALCITE-2194
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Piotr Bojko
>Assignee: Piotr Bojko
>Priority: Minor
>
> See: 
> [https://mail-archives.apache.org/mod_mbox/calcite-dev/201711.mbox/ajax/%3C6F6E52D4-6860-4384-A1CB-A2301D05394D%40apache.org%3E]
> I've looked into the core, and the notion of a user would be hard to achieve 
> now. 
> However, I am able to implement the "hidden schema" feature through the 
> following changes:
>  # JsonSchema - add a holder for the feature: a boolean flag, or a flags 
> field with an enum (CACHED, which now exists as a separate flag - some 
> deprecation may be needed; HIDDEN)
>  # CalciteSchema - pass the flag through
>  # RelOptSchema - pass the flag through
>  # CalciteCatalogReader - pass the flag through
>  # Other derivatives of RelOptSchema - mocked value, false
>  # RelOptTable and its implementation - pass the flag through
>  # SqlValidatorImpl - validate whether an object from a hidden schema is 
> used (in the same places as validateAccess)
>  # ViewTableMacro.apply -> Schemas.analyzeView -> 
> CalcitePrepareImpl.analyzeView -> CalcitePrepareImpl.parse_ -> 
> CalcitePrepareImpl.CalcitePrepareImpl - this path of execution should build 
> a SqlValidatorImpl with the check from point 7 disabled.
> Such a feature could be useful for end users. 
> If the solution is OK, I can contribute it.





[jira] [Commented] (CALCITE-2202) Aggregate Join Push-down on a Single Side

2018-03-08 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392086#comment-16392086
 ] 

Julian Hyde commented on CALCITE-2202:
--

Maybe I've been out of academia and in engineering for too long, but I prefer a 
few well-chosen examples to a formal proof. Some examples stake out the corner 
cases (e.g. an empty GROUP BY) and other examples are easily generalized (e.g. 
if something applies to MIN it applies to MAX and similar functions, and if 
something applies to AVG it applies to STDDEV and similar functions).

Are you claiming that it is better to push down on only one side? (I could see 
how it would be simpler, and therefore better, if you could push down to the 
left first, then push down to the right.) I contend that the desired result is
{code}select sum(e.s * d.c)
from (select deptno, sum(sal) as s from emp group by deptno) as e
join (select deptno, count(*) as c from dept group by deptno) as d
on e.deptno = d.deptno
group by e.deptno{code}
(I mistakenly omitted the "group by deptno" in the two inner queries last 
time), and I wonder whether you could get to that by applying your methods.
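
That rewrite can be checked numerically on toy data. The snippet below (plain 
Python standing in for the SQL, with made-up emp/dept rows) evaluates sum(sal) 
per deptno both directly over the join and via the pushed-down 
sum(e.s * d.c) form:

```python
from collections import defaultdict

emp = [("e1", 10, 100), ("e2", 10, 200), ("e3", 20, 300)]  # (name, deptno, sal)
dept = [(10, "sales"), (10, "sales-eu"), (20, "hr")]       # (deptno, dname)

# Direct form: join first, then aggregate sal per deptno.
direct = defaultdict(int)
for _, d1, sal in emp:
    for d2, _ in dept:
        if d1 == d2:
            direct[d1] += sal

# Pushed-down form: aggregate each side below the join, then sum(e.s * d.c).
s = defaultdict(int)  # sum(sal) per deptno on emp
for _, d, sal in emp:
    s[d] += sal
c = defaultdict(int)  # count(*) per deptno on dept
for d, _ in dept:
    c[d] += 1
pushed = {d: s[d] * c[d] for d in s if d in c}
```

The dept-side count(*) re-applies the duplication that the join would have 
produced, which is why both forms agree.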

Most aggregate functions we are familiar with have one argument, but would your 
methods apply to aggregate functions with more than one? Let's consider COUNT 
and COVAR_POP, both of which can take two arguments, and ask whether they can 
be pushed down if they have one argument from each side of a join.

> Aggregate Join Push-down on a Single Side
> -
>
> Key: CALCITE-2202
> URL: https://issues.apache.org/jira/browse/CALCITE-2202
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: next
>Reporter: Zhong Yu
>Assignee: Julian Hyde
>Priority: Major
> Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it 
> became apparent that aggregation can be pushed down on a single side (either 
> side), leaving the other side non-aggregated, regardless of whether the 
> grouping columns are unique on the other side. My analysis: 
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when metadata is insufficient; in any case, we may try 
> to provide all three possible transformations (aggregate on the left only, 
> the right only, or both sides) to the cost-based optimizer, so that the 
> cheapest one can be chosen based on statistics. 
> Does this make sense to anybody? If it sounds good, I'll implement it and 
> offer a PR. 





[jira] [Commented] (CALCITE-2194) Ability to hide a schema

2018-03-08 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16392066#comment-16392066
 ] 

Julian Hyde commented on CALCITE-2194:
--

I did a quick pass; observations:
* Quite a lot of re-formatting. House style is '=' at end of line, indent 2 or 
4. Please stick to it.
* Periods at the ends of sentences, please.
* Why exclude RexImpTable.java from checkstyle?
* Why is SqlAccessEnum.INDIRECT_SELECT necessary?
* Thanks for implementing full access control rather than just "hidden", which 
I know was your original preference.
* You have made RelOptTableImpl no longer immutable.
* I don't like the name Fairy, and I don't like the general practice of putting 
config in thread-locals. Are there any alternatives?
* There are a couple of places that call {{System.getProperty("user.name")}}. 
That is not appropriate for a server/library.
* The "boolean indirect" parameter is added to several methods but never 
explained.
* The changes to .gitignore are a mess.


> Ability to hide a schema
> 
>
> Key: CALCITE-2194
> URL: https://issues.apache.org/jira/browse/CALCITE-2194
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Piotr Bojko
>Assignee: Piotr Bojko
>Priority: Minor
>
> See: 
> [https://mail-archives.apache.org/mod_mbox/calcite-dev/201711.mbox/ajax/%3C6F6E52D4-6860-4384-A1CB-A2301D05394D%40apache.org%3E]
> I've looked into the core, and the notion of a user would be hard to achieve 
> now. 
> However, I am able to implement the "hidden schema" feature through the 
> following changes:
>  # JsonSchema - add a holder for the feature: a boolean flag, or a flags 
> field with an enum (CACHED, which now exists as a separate flag - some 
> deprecation may be needed; HIDDEN)
>  # CalciteSchema - pass the flag through
>  # RelOptSchema - pass the flag through
>  # CalciteCatalogReader - pass the flag through
>  # Other derivatives of RelOptSchema - mocked value, false
>  # RelOptTable and its implementation - pass the flag through
>  # SqlValidatorImpl - validate whether an object from a hidden schema is 
> used (in the same places as validateAccess)
>  # ViewTableMacro.apply -> Schemas.analyzeView -> 
> CalcitePrepareImpl.analyzeView -> CalcitePrepareImpl.parse_ -> 
> CalcitePrepareImpl.CalcitePrepareImpl - this path of execution should build 
> a SqlValidatorImpl with the check from point 7 disabled.
> Such a feature could be useful for end users. 
> If the solution is OK, I can contribute it.





[jira] [Commented] (CALCITE-2194) Ability to hide a schema

2018-03-08 Thread Piotr Bojko (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391394#comment-16391394
 ] 

Piotr Bojko commented on CALCITE-2194:
--

I've shipped the contribution here https://github.com/apache/calcite/pull/647

> Ability to hide a schema
> 
>
> Key: CALCITE-2194
> URL: https://issues.apache.org/jira/browse/CALCITE-2194
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.16.0
>Reporter: Piotr Bojko
>Assignee: Piotr Bojko
>Priority: Minor
>
> See: 
> [https://mail-archives.apache.org/mod_mbox/calcite-dev/201711.mbox/ajax/%3C6F6E52D4-6860-4384-A1CB-A2301D05394D%40apache.org%3E]
> I've looked into the core, and the notion of a user would be hard to achieve 
> now. 
> However, I am able to implement the "hidden schema" feature through the 
> following changes:
>  # JsonSchema - add a holder for the feature: a boolean flag, or a flags 
> field with an enum (CACHED, which now exists as a separate flag - some 
> deprecation may be needed; HIDDEN)
>  # CalciteSchema - pass the flag through
>  # RelOptSchema - pass the flag through
>  # CalciteCatalogReader - pass the flag through
>  # Other derivatives of RelOptSchema - mocked value, false
>  # RelOptTable and its implementation - pass the flag through
>  # SqlValidatorImpl - validate whether an object from a hidden schema is 
> used (in the same places as validateAccess)
>  # ViewTableMacro.apply -> Schemas.analyzeView -> 
> CalcitePrepareImpl.analyzeView -> CalcitePrepareImpl.parse_ -> 
> CalcitePrepareImpl.CalcitePrepareImpl - this path of execution should build 
> a SqlValidatorImpl with the check from point 7 disabled.
> Such a feature could be useful for end users. 
> If the solution is OK, I can contribute it.





[jira] [Commented] (CALCITE-2208) MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case sensitivity for end user

2018-03-08 Thread Piotr Bojko (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391309#comment-16391309
 ] 

Piotr Bojko commented on CALCITE-2208:
--

See https://github.com/apache/calcite/pull/647

> MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case 
> sensitivity for end user
> -
>
> Key: CALCITE-2208
> URL: https://issues.apache.org/jira/browse/CALCITE-2208
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Piotr Bojko
>Assignee: Julian Hyde
>Priority: Major
>
> MaterializedViewTable.MATERIALIZATION_CONNECTION, which is used for 
> validating views, uses the ORACLE lex by default. Calcite expands the view 
> SQL to uppercase, so when the schemas used in the view SQL are declared in 
> lowercase, Calcite does not find the objects needed to resolve and validate 
> the view SQL.
> It does not work even when the end user creates a connection with 
> lex=oracle but uses uppercase for the names of its tables. 
> It would be best if MaterializedViewTable.MATERIALIZATION_CONNECTION were 
> replaced by the end user's connection, or by a dynamically created 
> connection with the lex passed from the end user's connection. 
> A quick and dirty solution is to create 
> MaterializedViewTable.MATERIALIZATION_CONNECTION with caseSensitive=false.





[jira] [Commented] (CALCITE-2202) Aggregate Join Push-down on a Single Side

2018-03-08 Thread Zhong Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391279#comment-16391279
 ] 

Zhong Yu commented on CALCITE-2202:
---

Thanks, Julian. Your last example actually works in my favor. My argument is 
that the original query is equivalent to the following two, where the 
aggregate is pushed down on only one side:
{code:java}
select sum(e.s * d.c)
from (select deptno, sal as s from emp) as e
join (select deptno, count(*) as c from dept group by deptno) as d
on e.deptno = d.deptno
group by e.deptno

select sum(e.s * d.c)
from (select deptno, sum(sal) as s from emp group by deptno) as e
join (select deptno, 1 as c from dept) as d
on e.deptno = d.deptno
group by e.deptno{code}
The "cross-multiplier" effect is still there, because the join multiplies the 
side that doesn't aggregate.
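
The single-side claim can likewise be checked numerically on toy data (plain 
Python standing in for the SQL; the emp/dept rows are made up): only the dept 
side is aggregated, and each raw emp row is multiplied by the dept-side count.

```python
from collections import defaultdict

emp = [("e1", 10, 100), ("e2", 10, 200), ("e3", 20, 300)]  # (name, deptno, sal)
dept = [(10, "sales"), (10, "sales-eu"), (20, "hr")]       # (deptno, dname)

# Direct form: join first, then sum(sal) per deptno.
direct = defaultdict(int)
for _, d1, sal in emp:
    for d2, _ in dept:
        if d1 == d2:
            direct[d1] += sal

# Single-side push-down: aggregate only dept (count(*) per deptno); emp rows
# stay raw, and sum(e.s * d.c) re-applies the join's cross-multiplier.
c = defaultdict(int)
for d, _ in dept:
    c[d] += 1
single = defaultdict(int)
for _, d, sal in emp:
    if d in c:
        single[d] += sal * c[d]
```

Both forms agree because the dept-side count carries exactly the duplication 
the non-aggregated side would have seen from the join.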

So I'm probably on the right track. However, the paper I wrote is full of 
holes, obviously the work of an amateur :) Please ignore it. I'll try to write 
up a new one over the weekend. Your reminders about vacuous cases and about 
null handling (which differs between GROUP BY and equijoin) are all good 
points. I'll also focus on a narrower proof that works only on inner 
equijoins.

> Aggregate Join Push-down on a Single Side
> -
>
> Key: CALCITE-2202
> URL: https://issues.apache.org/jira/browse/CALCITE-2202
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: next
>Reporter: Zhong Yu
>Assignee: Julian Hyde
>Priority: Major
> Fix For: next
>
>
> While investigating https://issues.apache.org/jira/browse/CALCITE-2195, it 
> became apparent that aggregation can be pushed down on a single side (either 
> side), leaving the other side non-aggregated, regardless of whether the 
> grouping columns are unique on the other side. My analysis: 
> [http://zhong-j-yu.github.io/aggregate-join-push-down.pdf] .
> This may be useful when metadata is insufficient; in any case, we may try 
> to provide all three possible transformations (aggregate on the left only, 
> the right only, or both sides) to the cost-based optimizer, so that the 
> cheapest one can be chosen based on statistics. 
> Does this make sense to anybody? If it sounds good, I'll implement it and 
> offer a PR. 





[jira] [Commented] (CALCITE-2208) MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case sensitivity for end user

2018-03-08 Thread Piotr Bojko (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391266#comment-16391266
 ] 

Piotr Bojko commented on CALCITE-2208:
--

I am planning to add the mentioned workaround - disabling case sensitivity on 
MaterializedViewTable.MATERIALIZATION_CONNECTION - when working on 
CALCITE-2194.

> MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case 
> sensitivity for end user
> -
>
> Key: CALCITE-2208
> URL: https://issues.apache.org/jira/browse/CALCITE-2208
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Piotr Bojko
>Assignee: Julian Hyde
>Priority: Major
>
> MaterializedViewTable.MATERIALIZATION_CONNECTION, which is used for 
> validating views, uses the ORACLE lex by default. Calcite expands the view 
> SQL to uppercase, so when the schemas used in the view SQL are declared in 
> lowercase, Calcite does not find the objects needed to resolve and validate 
> the view SQL.
> It does not work even when the end user creates a connection with 
> lex=oracle but uses uppercase for the names of its tables. 
> It would be best if MaterializedViewTable.MATERIALIZATION_CONNECTION were 
> replaced by the end user's connection, or by a dynamically created 
> connection with the lex passed from the end user's connection. 
> A quick and dirty solution is to create 
> MaterializedViewTable.MATERIALIZATION_CONNECTION with caseSensitive=false.





[jira] [Updated] (CALCITE-2208) MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case sensitivity for end user

2018-03-08 Thread Piotr Bojko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CALCITE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Bojko updated CALCITE-2208:
-
Issue Type: Bug  (was: New Feature)

> MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case 
> sensitivity for end user
> -
>
> Key: CALCITE-2208
> URL: https://issues.apache.org/jira/browse/CALCITE-2208
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Piotr Bojko
>Assignee: Julian Hyde
>Priority: Major
>
> MaterializedViewTable.MATERIALIZATION_CONNECTION, which is used for 
> validating views, uses the ORACLE lex by default. Calcite expands the view 
> SQL to uppercase, so when the schemas used in the view SQL are declared in 
> lowercase, Calcite does not find the objects needed to resolve and validate 
> the view SQL.
> It does not work even when the end user creates a connection with 
> lex=oracle but uses uppercase for the names of its tables. 
> It would be best if MaterializedViewTable.MATERIALIZATION_CONNECTION were 
> replaced by the end user's connection, or by a dynamically created 
> connection with the lex passed from the end user's connection. 
> A quick and dirty solution is to create 
> MaterializedViewTable.MATERIALIZATION_CONNECTION with caseSensitive=false.





[jira] [Created] (CALCITE-2208) MaterializedViewTable.MATERIALIZATION_CONNECTION breaks lex and case sensitivity for end user

2018-03-08 Thread Piotr Bojko (JIRA)
Piotr Bojko created CALCITE-2208:


 Summary: MaterializedViewTable.MATERIALIZATION_CONNECTION breaks 
lex and case sensitivity for end user
 Key: CALCITE-2208
 URL: https://issues.apache.org/jira/browse/CALCITE-2208
 Project: Calcite
  Issue Type: New Feature
  Components: core
Affects Versions: 1.15.0, 1.16.0
Reporter: Piotr Bojko
Assignee: Julian Hyde


MaterializedViewTable.MATERIALIZATION_CONNECTION, which is used for validating 
views, uses the ORACLE lex by default. Calcite expands the view SQL to 
uppercase, so when the schemas used in the view SQL are declared in lowercase, 
Calcite does not find the objects needed to resolve and validate the view SQL.

It does not work even when the end user creates a connection with lex=oracle 
but uses uppercase for the names of its tables. 

It would be best if MaterializedViewTable.MATERIALIZATION_CONNECTION were 
replaced by the end user's connection, or by a dynamically created connection 
with the lex passed from the end user's connection. 

A quick and dirty solution is to create 
MaterializedViewTable.MATERIALIZATION_CONNECTION with caseSensitive=false.
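
The failure mode can be illustrated with a toy name-resolution check 
(illustrative Python; the schema contents are made up): Oracle lex uppercases 
the unquoted identifiers in the view SQL, so a case-sensitive lookup against 
lowercase-declared schemas misses, while a case-insensitive lookup (the 
caseSensitive=false workaround) still resolves.

```python
# Schemas as an end user might declare them, in lowercase (made-up contents).
schemas = {"sales": {"emp", "dept"}}

def resolve(schema_name: str, table_name: str, case_sensitive: bool) -> bool:
    """Toy lookup: exact match when case-sensitive, folded match otherwise."""
    if case_sensitive:
        return table_name in schemas.get(schema_name, set())
    for s, tables in schemas.items():
        if s.casefold() == schema_name.casefold():
            return any(t.casefold() == table_name.casefold() for t in tables)
    return False

# Oracle lex expands unquoted identifiers in the view SQL to uppercase:
hit_cs = resolve("SALES", "EMP", case_sensitive=True)   # lookup misses
hit_ci = resolve("SALES", "EMP", case_sensitive=False)  # workaround resolves
```

This also shows why the workaround is lossy: with case folding, genuinely 
distinct mixed-case names could collide.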





[jira] [Commented] (CALCITE-971) Add "NEXT n VALUES FOR sequence" expression

2018-03-08 Thread zhen wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391059#comment-16391059
 ] 

zhen wang commented on CALCITE-971:
---

This one is done and could be closed, per PHOENIX-2383.

> Add "NEXT n VALUES FOR sequence" expression
> ---
>
> Key: CALCITE-971
> URL: https://issues.apache.org/jira/browse/CALCITE-971
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>  Labels: phoenix
>
> Add "NEXT n VALUES FOR sequence" expression. Allows more than one value to be 
> grabbed at a time.
> Note that this departs from standard SQL.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-1862) StackOverflowException in RelMdUtil.estimateFilteredRows

2018-03-08 Thread zhen wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CALCITE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391001#comment-16391001
 ] 

zhen wang commented on CALCITE-1862:


Some findings:

The estimate-row-count method itself is actually okay; before planner 
optimization it derives the correct row count.

The problem lies in the optimization. As rules get applied to the relational 
algebra, the plan becomes deeper and deeper; by the time the stack overflows, 
the algebra has grown to a depth > 100, which is essentially what leads to the 
stack overflow.

I haven't yet located which rule is causing the growth of the relational 
algebra.


> StackOverflowException in RelMdUtil.estimateFilteredRows
> 
>
> Key: CALCITE-1862
> URL: https://issues.apache.org/jira/browse/CALCITE-1862
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>
> The query
> {code}select *
> from (
>   select *
>   from (
> select cast(null as integer) as d
> from "scott".emp)
>   where d is null and d is null)
> where d is null;{code}
> gives
> {noformat}
> java.lang.StackOverflowError
> > at 
> > org.apache.calcite.adapter.clone.ArrayTable.getStatistic(ArrayTable.java:76)
> > at 
> > org.apache.calcite.prepare.RelOptTableImpl.getRowCount(RelOptTableImpl.java:224)
> > at 
> > org.apache.calcite.rel.core.TableScan.estimateRowCount(TableScan.java:75)
> > at 
> > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:206)
> > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > at 
> > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:236)
> > at 
> > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:71)
> > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > at 
> > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:236)
> > at 
> > org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:718)
> > at 
> > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123)
> > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > at 
> > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:236)
> > at 
> > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:71)
> > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
> > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
> > at 
> > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:236)
> > at 
> > org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:718){noformat}
> For a test case, add the query to misc.iq and run QuidemTest.


