[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890784#comment-15890784
 ] 

Julian Hyde edited comment on SOLR-8593 at 3/1/17 6:49 PM:
---

[~risdenk], regarding the Turkish locale issue: we have to explicitly pass 
user.timezone from Maven into Surefire (see 
[pom.xml|https://github.com/apache/calcite/blob/0372d23b847d4d145917dd786d1c9e3570cb8041/pom.xml#L733]),
 so I suspect we'd have to do the same with the locale. Can you log a Calcite 
case please? Even if we can't reproduce, I'd rather that we tracked it.
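
One way to try reproducing a locale-dependent failure without relying on what the build forwards to the test JVM is to switch the JVM default locale inside the test itself. The following is a minimal sketch, not Calcite code: the class name `LocaleOverride` and the method `withDefaultLocale` are hypothetical, and it assumes the failure comes from code that uses the default locale implicitly.

```java
import java.util.Locale;
import java.util.function.Supplier;

final class LocaleOverride {
    private LocaleOverride() {}

    /** Runs body with the JVM default locale set to locale, then restores the previous default. */
    static <T> T withDefaultLocale(Locale locale, Supplier<T> body) {
        Locale previous = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            return body.get();
        } finally {
            // Always restore, so later tests are not affected.
            Locale.setDefault(previous);
        }
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // String.toUpperCase() with no argument uses the default locale.
        String upper = withDefaultLocale(turkish, () -> "quit".toUpperCase());
        // Under tr-TR, 'i' maps to the dotted capital I (U+0130), not 'I'.
        System.out.println(upper.equals("QUIT")); // prints false
    }
}
```

Changing the default locale is process-wide, so a helper like this only suits tests that do not run concurrently in the same JVM.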


was (Author: julianhyde):
[~risdenk], Regarding the Turkish locale issue. We have to explicitly pass 
user.timezone from maven into surefire, so I suspect we'd have to do the same 
with the locale. Can you log a Calcite case please? Even if we can't reproduce, 
I'd rather that we tracked it.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
> Issue Type: Improvement
> Components: Parallel SQL
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890784#comment-15890784
 ] 

Julian Hyde commented on SOLR-8593:
---

[~risdenk], Regarding the Turkish locale issue. We have to explicitly pass 
user.timezone from maven into surefire, so I suspect we'd have to do the same 
with the locale. Can you log a Calcite case please? Even if we can't reproduce, 
I'd rather that we tracked it.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-27 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886312#comment-15886312
 ] 

Julian Hyde commented on SOLR-8593:
---

Oh wow. I18n never fails to surprise. Please log a Calcite issue. We should 
ensure that Calcite runs correctly if {{user.locale=tr}}.
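
The thread does not say what actually failed under the Turkish locale. Assuming it is the usual case-mapping problem, the sketch below shows why keyword or identifier handling can break: in Turkish, upper-case `I` lower-cases to a dotless `ı`. The class and method names are hypothetical. Note also that the standard JVM properties for this are `user.language` and `user.country`.

```java
import java.util.Locale;

final class KeywordCase {
    private KeywordCase() {}

    /** Locale-sensitive lower-casing: the result depends on the locale passed in. */
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    /** Locale-independent lower-casing, suitable for keywords and identifiers. */
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // 'I' becomes dotless 'ı' (U+0131) in Turkish, so the comparison fails.
        System.out.println(lower("LIMIT", turkish).equals("limit")); // prints false
        System.out.println(lowerRoot("LIMIT").equals("limit"));      // prints true
    }
}
```

Passing `Locale.ROOT` explicitly wherever code normalizes keywords makes the result independent of the machine's settings.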




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-03 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851767#comment-15851767
 ] 

Julian Hyde commented on SOLR-8593:
---

This shouldn't be due to cost differences. The plan without the sort (limit) is 
incorrect, so should never be chosen, regardless of cost.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849344#comment-15849344
 ] 

Julian Hyde commented on SOLR-8593:
---

In the case where there is LIMIT but no ORDER BY, is a LogicalSort created? 
(There should be.) Is a SolrSort created, and is its offset field set? (It 
should be.) If so, how/why does the SolrSort get dropped? (Does the planner 
find that it is equivalent to something cheaper? It shouldn't.)




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849145#comment-15849145
 ] 

Julian Hyde edited comment on SOLR-8593 at 2/2/17 12:05 AM:


Not sure I understand. The query {{select a from b limit 10}} will have a 
{{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will 
be translated to a {{SolrSort}} with similar attributes. The sort is trivial - 
that is, you don't need to do any work to sort on 0 fields - but you do need to 
apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but 
maybe convert into a {{SolrLimit}} if you have such a thing.

You may be wondering why we combine sort and limit into the same operator. But 
remember that relational data sets are inherently unordered, so we have to do 
them at the same time. Sort with an empty key has reasonable semantics, just as 
-- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) 
from emp}}, which is equivalent to {{select count(*) from emp group by ()}}) 
is a reasonable generalization of Aggregate.
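
The point about a sort with an empty key can be sketched in a few lines. This is an illustration only, not Calcite's `Sort` or Solr's `SolrSort`: the class `SortWithFetch` and its `apply` method are hypothetical, and `fetch = -1` is assumed to mean "no limit".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class SortWithFetch {
    private SortWithFetch() {}

    /** keys: comparators in priority order, possibly empty. fetch: max rows, or -1 for no limit. */
    static <T> List<T> apply(List<T> rows, List<Comparator<T>> keys, int fetch) {
        List<T> out = new ArrayList<>(rows);
        if (!keys.isEmpty()) {
            Comparator<T> c = keys.get(0);
            for (int i = 1; i < keys.size(); i++) {
                c = c.thenComparing(keys.get(i));
            }
            out.sort(c);
        }
        // The limit applies whether or not there were any sort keys.
        if (fetch >= 0 && fetch < out.size()) {
            out = new ArrayList<>(out.subList(0, fetch));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(3, 1, 2);
        List<Comparator<Integer>> noKeys = Collections.emptyList();
        List<Comparator<Integer>> byValue = Collections.singletonList(Comparator.naturalOrder());
        System.out.println(apply(rows, noKeys, 2));  // prints [3, 1]
        System.out.println(apply(rows, byValue, 2)); // prints [1, 2]
    }
}
```

With empty keys the sorting step is a no-op, but dropping the whole operator would also drop the limit, which is the bug being discussed.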


was (Author: julianhyde):
Not sure I understand. The query {{select a from b limit 10}} will have a 
{{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will 
be translated to a {{SolrSort}} with similar attributes. The sort is trivial - 
that is, you don't need to do any work to sort on 0 fields - but you do need to 
apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but 
maybe convert into a {{SolrLimit}} if you have such a thing.

You may be wondering why we combine sort and limit into the same operator. But 
remember that relational data sets are inherently unordered, so we have to do 
them at the same time. Sort with an empty key has reasonable semantics, just as 
-- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from 
emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a 
reasonable generalization of Aggregate.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849145#comment-15849145
 ] 

Julian Hyde commented on SOLR-8593:
---

Not sure I understand. The query {{select a from b limit 10}} will have a 
{{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will 
be translated to a {{SolrSort}} with similar attributes. The sort is trivial - 
that is, you don't need to do any work to sort on 0 fields - but you do need to 
apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but 
maybe convert into a {{SolrLimit}} if you have such a thing.

You may be wondering why we combine sort and limit into the same operator. But 
remember that relational data sets are inherently unordered, so we have to do 
them at the same time. Sort with an empty key has reasonable semantics, just as 
-- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from 
emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a 
reasonable generalization of Aggregate.
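
The Aggregate analogy can be illustrated the same way. In this sketch (hypothetical class `EmptyKeyAggregate`, rows modeled as maps), grouping on an empty key list puts every row into one group, so `count(*)` returns exactly one row. The special case for empty input is an assumption made to match `group by ()`, which yields one row even when the table is empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EmptyKeyAggregate {
    private EmptyKeyAggregate() {}

    /** Counts rows per group; the group key is the list of values of the given key fields. */
    static Map<List<Object>, Long> countBy(List<Map<String, Object>> rows, List<String> keys) {
        Map<List<Object>, Long> counts = new LinkedHashMap<>();
        if (keys.isEmpty()) {
            // "group by ()" always produces its single group, even for empty input.
            counts.put(Collections.emptyList(), 0L);
        }
        for (Map<String, Object> row : rows) {
            List<Object> key = new ArrayList<>();
            for (String k : keys) {
                key.add(row.get(k));
            }
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    /** Builds a one-column row; a helper for the example only. */
    static Map<String, Object> row(String field, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(field, value);
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> emp =
            Arrays.asList(row("deptno", 10), row("deptno", 10), row("deptno", 20));
        System.out.println(countBy(emp, Collections.singletonList("deptno"))); // prints {[10]=2, [20]=1}
        System.out.println(countBy(emp, Collections.emptyList()));             // prints {[]=3}
    }
}
```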




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-01-27 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843313#comment-15843313
 ] 

Julian Hyde commented on SOLR-8593:
---

If you have any "linking" issues with protobuf, you might check out HIVE-15708, 
which was caused because Hive used both avatica-core (which shades protobuf) 
and avatica (which does not).
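
When chasing this kind of linking problem, it can help to check which jar a class was actually loaded from. The sketch below uses only standard JDK calls. The class name `ClassOrigin` is hypothetical, and `com.google.protobuf.Message` is just an example of an unshaded protobuf class. A shaded copy would live under a relocated package whose name the thread does not give.

```java
import java.security.CodeSource;

final class ClassOrigin {
    private ClassOrigin() {}

    /** Returns the location (typically a jar URL) that the named class was loaded from. */
    static String locate(String className) {
        try {
            // initialize=false: we only want to find the class, not run its static initializers.
            Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "bootstrap or unknown" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Prints a jar location if protobuf is on the classpath, otherwise a "not loadable" message.
        System.out.println(locate("com.google.protobuf.Message"));
    }
}
```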




[jira] [Commented] (SOLR-9893) EasyMock/Mockito no longer works with Java 9 b148+

2017-01-09 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812465#comment-15812465
 ] 

Julian Hyde commented on SOLR-9893:
---

[~thetaphi], Thanks for replying. I agree with your strategy. I've disabled our 
offending tests using Assume, and we can still claim that Avatica works on 
JDK9, albeit with less coverage.
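
A guard of this kind needs to know the running Java version. The sketch below is an assumption about how that check could look, not the actual Avatica change: `JavaVersion` is a hypothetical helper that parses the `java.specification.version` system property, and its result could be passed to JUnit 4's `Assume.assumeTrue(...)`. It cannot distinguish individual early-access builds such as b148.

```java
final class JavaVersion {
    private JavaVersion() {}

    /** Parses "1.7" or "1.8" (legacy scheme) and "9", "11", ... (new scheme) to a major version. */
    static int major(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** Major version of the running JVM. */
    static int current() {
        return major(System.getProperty("java.specification.version"));
    }

    public static void main(String[] args) {
        // A test could skip itself with: Assume.assumeTrue(JavaVersion.current() < 9);
        System.out.println("Java " + current() + ", mocking tests enabled: " + (current() < 9));
    }
}
```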

I am concerned that the Mockito/Cglib community seem to think that JDK9 support 
== adding support for new JDK9 features. Whereas we just want the same old 
functionality to run on a JDK9 runtime. (We can't use JDK9 features until we 
drop support for JDK1.7 and JDK1.8.) I'll weigh in on 
https://github.com/cglib/cglib/issues/93 and until then I guess we'll have to 
be patient.

> EasyMock/Mockito no longer works with Java 9 b148+
> --
>
> Key: SOLR-9893
> URL: https://issues.apache.org/jira/browse/SOLR-9893
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Tests
> Affects Versions: 6.x, master (7.0)
> Reporter: Uwe Schindler
> Priority: Blocker
>
> EasyMock does not work anymore with latest Java 9, because it uses cglib 
> behind that is trying to access a protected method inside the runtime using 
> setAccessible. This is no longer allowed by Java 9.
> Actually this is really stupid. Instead of forcefully making the protected 
> defineClass method available to the outside, it is much more correct to just 
> subclass ClassLoader (like the Lucene expressions module does).
> I tried updating to easymock/mockito, but all that does not work, approx 25 
> tests fail. The only way is to disable all Mocking tests in Java 9. The 
> underlying issue in cglib is still not solved, master's code is here: 
> https://github.com/cglib/cglib/blob/master/cglib/src/main/java/net/sf/cglib/core/ReflectUtils.java#L44-L62
> As we use an old stone-aged version of mockito (1.x), a fix is not expected 
> to happen, although cglib might fix this!
> What should we do? This stupid issue prevents us from testing Java 9 with 
> Solr completely! 






[jira] [Commented] (SOLR-9893) EasyMock/Mockito no longer works with Java 9 b148+

2017-01-07 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807982#comment-15807982
 ] 

Julian Hyde commented on SOLR-9893:
---

We are running into the same issue in Calcite/Avatica: CALCITE-1567. Do you 
know if there is a Mockito bug logged for this? Somewhere in 
https://github.com/cglib/cglib/issues/93 someone suggests that it is fixed in a 
later version of Mockito. If so I would like to upgrade to that version of 
Mockito.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-20 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764709#comment-15764709
 ] 

Julian Hyde commented on SOLR-8593:
---

Yes, early January. I've logged CALCITE-1547 to track the release.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-16 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755661#comment-15755661
 ] 

Julian Hyde commented on SOLR-8593:
---

A list of GROUP BY fields would be fine. But it must be in a sub-class of 
Aggregate. Everyone else who is using Aggregate wants "Aggregate([x, y])" to be 
identical to "Aggregate([y, x])".
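
The distinction can be sketched with a set-valued group key in the base class and an ordered list only in the sub-class. The classes `SketchAggregate` and `OrderedAggregate` below are hypothetical and only model the idea. They are not Calcite's `Aggregate`, and `java.util.BitSet` stands in for whatever set type the real operator uses.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

/** Base operator: the group key is a set, so field order is not part of its identity. */
class SketchAggregate {
    final BitSet groupSet = new BitSet();

    SketchAggregate(int... fields) {
        for (int f : fields) {
            groupSet.set(f);
        }
    }
}

/** Sub-class that additionally remembers the order in which the fields were requested. */
final class OrderedAggregate extends SketchAggregate {
    final List<Integer> groupOrder;

    OrderedAggregate(int... fields) {
        super(fields);
        List<Integer> order = new ArrayList<>();
        for (int f : fields) {
            order.add(f);
        }
        groupOrder = Collections.unmodifiableList(order);
    }
}

final class AggregateOrderDemo {
    public static void main(String[] args) {
        // Same set regardless of the order the fields were given in.
        System.out.println(new SketchAggregate(0, 1).groupSet.equals(new SketchAggregate(1, 0).groupSet)); // prints true
        // The sub-class keeps the requested order alongside the set.
        System.out.println(new OrderedAggregate(1, 0).groupOrder); // prints [1, 0]
    }
}
```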




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753387#comment-15753387
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/16/16 4:21 AM:
-

Would it be correct to say that you have a physical operator which is a 
combination of Aggregate and TopN? This physical operator would have a sorted 
list of grouping fields and also a parameter N (which affects the cost 
estimate). Maybe it's a sub-class of Aggregate with some extra fields. It could 
be created by a planner rule that matches a Sort (with limit) on top of an 
Aggregate and also looks at estimated cardinality of the fields in order to 
sort them.
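
As a rough illustration of such a rule, the sketch below matches a Sort with a limit on top of an Aggregate and replaces the pair with a combined operator. Everything here is hypothetical: the node classes are toy stand-ins rather than Calcite's planner API, cardinalities are passed in as a plain map, and ordering the group fields by ascending estimated cardinality is an assumption, since the comment does not say which direction is wanted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class TopNAggregateRuleSketch {
    static class Node {}

    static final class Scan extends Node {
        final String table;
        Scan(String table) { this.table = table; }
    }

    static final class Aggregate extends Node {
        final Node input;
        final List<String> groupFields;
        Aggregate(Node input, List<String> groupFields) { this.input = input; this.groupFields = groupFields; }
    }

    static final class Sort extends Node {
        final Node input;
        final List<String> sortKeys;
        final int fetch; // -1 means no limit
        Sort(Node input, List<String> sortKeys, int fetch) { this.input = input; this.sortKeys = sortKeys; this.fetch = fetch; }
    }

    /** Combined physical operator: aggregate, then keep the top n rows by sortKeys. */
    static final class TopNAggregate extends Node {
        final Node input;
        final List<String> orderedGroupFields;
        final List<String> sortKeys;
        final int n;
        TopNAggregate(Node input, List<String> orderedGroupFields, List<String> sortKeys, int n) {
            this.input = input; this.orderedGroupFields = orderedGroupFields; this.sortKeys = sortKeys; this.n = n;
        }
    }

    /** Rewrites Sort(fetch=N) over Aggregate into TopNAggregate; any other node is returned unchanged. */
    static Node apply(Node node, Map<String, Long> estimatedCardinality) {
        if (!(node instanceof Sort)) return node;
        Sort sort = (Sort) node;
        if (sort.fetch < 0 || !(sort.input instanceof Aggregate)) return node;
        Aggregate agg = (Aggregate) sort.input;
        List<String> ordered = new ArrayList<>(agg.groupFields);
        // Fields with unknown cardinality sort last.
        ordered.sort(Comparator.comparingLong(f -> estimatedCardinality.getOrDefault(f, Long.MAX_VALUE)));
        return new TopNAggregate(agg.input, ordered, sort.sortKeys, sort.fetch);
    }
}
```

The sort keys are carried into the combined operator so that the rewrite does not change which rows count as the "top" N.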


was (Author: julianhyde):
Would it be correct to say that you have a physical operator which is a 
combination of Aggregate and TopN? This physical operator would have a sorted 
list of grouping fields and also a parameter N (which affects the cost 
estimate). Maybe it's a sub-class of Aggregate with some extra fields.




[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753387#comment-15753387
 ] 

Julian Hyde commented on SOLR-8593:
---

Would it be correct to say that you have a physical operator which is a 
combination of Aggregate and TopN? This physical operator would have a sorted 
list of grouping fields and also a parameter N (which affects the cost 
estimate). Maybe it's a sub-class of Aggregate with some extra fields.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:15 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{0, 1}} represents {{a, b}} because that is the 
physical order of the columns.
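
To make the plan above concrete, the sketch below derives the Project indexes from the requested GROUP BY order. It is an illustration under assumptions rather than Calcite's implementation: the class `GroupKeyProjection` is hypothetical, `java.util.BitSet` stands in for the group set, and the aggregate output is assumed to be the group columns in ascending ordinal order followed by the aggregate calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class GroupKeyProjection {
    private GroupKeyProjection() {}

    /**
     * requested: field ordinals in the order written in the query (b, a is 1, 0).
     * Returns the Project indexes over the aggregate's output columns.
     */
    static List<Integer> projection(List<Integer> requested, int aggCallCount) {
        BitSet groupSet = new BitSet();
        for (int ordinal : requested) {
            groupSet.set(ordinal);
        }
        // A set has no order, so the aggregate emits group columns by ascending ordinal.
        List<Integer> sorted = new ArrayList<>();
        for (int i = groupSet.nextSetBit(0); i >= 0; i = groupSet.nextSetBit(i + 1)) {
            sorted.add(i);
        }
        List<Integer> project = new ArrayList<>();
        for (int ordinal : requested) {
            project.add(sorted.indexOf(ordinal));
        }
        // Aggregate calls follow the group columns.
        for (int i = 0; i < aggCallCount; i++) {
            project.add(sorted.size() + i);
        }
        return project;
    }

    public static void main(String[] args) {
        // T(a, b, c, d): group by b, a with one aggregate call, count(*).
        System.out.println(projection(Arrays.asList(1, 0), 1)); // prints [1, 0, 2]
    }
}
```

The result corresponds to `Project($1, $0, $2)`: the requested order survives only in the Project, not in the Aggregate.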

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{ \{0, 1\} }} represents {{ \{a, b\} }} because that 
is the physical order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:14 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{ \{0, 1\} }} represents {{ \{a, b\} }} because that 
is the physical order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde commented on SOLR-8593:
---

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:13 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710183#comment-15710183
 ] 

Julian Hyde commented on SOLR-8593:
---

Calcite's operators are logical. A 'Filter' operator might turn into operator 
instances running on multiple nodes or threads, each processing a partition of 
the data.
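
As a rough sketch of what "logical" means here: one Filter in the plan can be realized as the same predicate applied to every partition, possibly on different threads. The class name and the use of Java parallel streams are assumptions for illustration only; neither Calcite nor Solr is involved.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical illustration of one logical Filter realized as several physical instances.
final class PartitionedFilterSketch {
  /** Runs the same predicate over every partition, possibly on different threads. */
  static <T> List<T> filter(List<List<T>> partitions, Predicate<T> condition) {
    return partitions.parallelStream()
        .flatMap(partition -> partition.stream().filter(condition))
        .collect(Collectors.toList());
  }
}
```

The caller sees a single filtered result. How many instances ran, and where, is an execution decision that the logical plan does not record.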

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-28 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702473#comment-15702473
 ] 

Julian Hyde commented on SOLR-8593:
---

Calcite is an algebra, not an executor. When it converts a HAVING clause to a 
SolrFilter you are more than welcome to run those filters in parallel. I 
suppose it would mean SolrAggregate producing parallel output streams.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-28 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15702457#comment-15702457
 ] 

Julian Hyde commented on SOLR-8593:
---

Calcite rewrites {{SELECT DISTINCT ...}} to {{SELECT ... GROUP BY ...}}. So if 
you just deal with {{GROUP BY}} (i.e. Calcite's Aggregate operator) you should 
be fine.
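
The equivalence can be seen in a small sketch: a DISTINCT over some columns is just a GROUP BY on those columns with the aggregate values thrown away. The names below are hypothetical and the code only models the idea; it is not how Calcite performs the rewrite.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Calcite code.
final class DistinctAsGroupBySketch {
  /** GROUP BY on the given key columns, computing COUNT(*) per group. */
  static Map<List<Object>, Long> groupBy(List<List<Object>> rows, int... keys) {
    Map<List<Object>, Long> groups = new LinkedHashMap<>();
    for (List<Object> row : rows) {
      List<Object> key = new ArrayList<>();
      for (int k : keys) {
        key.add(row.get(k));
      }
      groups.merge(key, 1L, Long::sum);
    }
    return groups;
  }

  /** SELECT DISTINCT on the given columns: the group keys, ignoring the counts. */
  static List<List<Object>> distinct(List<List<Object>> rows, int... keys) {
    return new ArrayList<>(groupBy(rows, keys).keySet());
  }
}
```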

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664701#comment-15664701
 ] 

Julian Hyde commented on SOLR-8593:
---

CALCITE-1306 covers this. It's not standard SQL but could be enabled via an 
extension.

I disagree that "Solr will run this filter faster than Calcite". With query 
optimization, both queries will produce identical plans. This issue is not 
about performance. It is about syntactic sugar (not that there's anything wrong 
with that).

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664595#comment-15664595
 ] 

Julian Hyde commented on SOLR-8593:
---

You're making a mistake I see a lot of people making: trying to do complex 
semantic transformations on the AST (SqlNode). That's an anti-pattern, because 
SQL's complex rules for name-resolution make the AST very brittle. You should 
do those kinds of transformations on the relational algebra tree (RelNode).

In fact, Calcite will convert the query into a {{Scan -> Filter -> Aggregate -> 
Filter -> Project}} logical plan (the first Filter is the WHERE clause, the 
second Filter is the HAVING clause), so I don't think you need to do any tricky 
processing looking for aliases.
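
To make the shape of that plan concrete, here is a minimal sketch that evaluates a made-up query, "SELECT deptno FROM emp WHERE sal > 1000 GROUP BY deptno HAVING count(*) > 1", as the same five steps using plain Java streams. The Emp class, the sal column and the query are assumptions for illustration; no Calcite classes are used.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration of the Scan -> Filter -> Aggregate -> Filter -> Project shape.
final class LogicalPlanSketch {
  /** One row of a hypothetical emp table. */
  static final class Emp {
    final int deptno;
    final int sal;

    Emp(int deptno, int sal) {
      this.deptno = deptno;
      this.sal = sal;
    }
  }

  /** Evaluates the query step by step, one stream stage per logical operator. */
  static List<Integer> run(List<Emp> emp) {
    Map<Integer, Long> aggregate = emp.stream()          // Scan
        .filter(e -> e.sal > 1000)                       // Filter (WHERE)
        .collect(Collectors.groupingBy(                  // Aggregate (GROUP BY deptno, COUNT(*))
            e -> e.deptno, TreeMap::new, Collectors.counting()));
    return aggregate.entrySet().stream()
        .filter(g -> g.getValue() > 1)                   // Filter (HAVING)
        .map(Map.Entry::getKey)                          // Project (deptno)
        .collect(Collectors.toList());
  }
}
```

The second filter works on the aggregate's output columns by position, so no alias lookup is needed at that point.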

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664576#comment-15664576
 ] 

Julian Hyde edited comment on SOLR-8593 at 11/14/16 6:11 PM:
-

Regarding the alias for "count(\*)". I guess one approach is to extend Calcite 
to allow a pluggable alias derivation (it has to be pluggable because you can't 
please everyone). Another approach is to leave the aliases as they are but 
generate field names for the JSON result set. Note that if you call 
SqlNode.getParserPosition() on each item in the select clause it will tell you 
the start and end point of that expression in the original SQL string, so you 
can extract the "count(\*)" using that information.
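
A minimal sketch of that extraction step, assuming the parser position is reported as 1-based line and column numbers with an inclusive end. The helper below takes plain integers rather than a Calcite object; its name and signature are hypothetical, and the position convention should be checked against the Calcite version in use.

```java
// Hypothetical helper; positions are assumed to be 1-based with an inclusive end.
final class ParserPositionSketch {
  /** Returns the text of the original SQL between the start and end positions. */
  static String extract(String sql, int line, int column, int endLine, int endColumn) {
    int start = offset(sql, line, column);
    int end = offset(sql, endLine, endColumn);
    return sql.substring(start, end + 1);
  }

  /** Converts a 1-based line and column into a 0-based character offset. */
  private static int offset(String sql, int line, int column) {
    int offset = 0;
    for (int i = 1; i < line; i++) {
      // Skip past the newline that ends the previous line.
      offset = sql.indexOf('\n', offset) + 1;
    }
    return offset + column - 1;
  }
}
```

The extracted text can then be used as the field name in the JSON result set while the planner keeps its own generated alias.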

I don't think the following should be valid, but under your proposed change 
it would be:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*)
  FROM emp
  GROUP BY deptno) AS t
WHERE t."count(*)" > 3
{code}

Note that "count(\*)" is not an expression; it is a reference to a "column" 
produced by the sub-query. In my opinion, using a textual expression is very 
confusing, and we should not do it. Derived alias of {{count(\*)}} should be 
something not easily guessable, which will encourage users to use an alias:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*) AS c
  FROM emp
  GROUP BY deptno) AS t
WHERE t.c > 3
{code}


was (Author: julianhyde):
Regarding the alias for "count(*)". I guess one approach is to extend Calcite 
to allow a pluggable alias derivation (it has to be pluggable because you can't 
please everyone). Another approach is to leave the aliases as they are but 
generate field names for the JSON result set. Note that if you call 
SqlNode.getParserPosition() on each item in the select clause it will tell you 
the start and end point of that expression in the original SQL string, so you 
can extract the "count(*)" using that information.

I don't think the following should be valid, but under your proposed change 
it would be:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(\*)
  FROM emp
  GROUP BY deptno) AS t
WHERE t."count(*)" > 3
{code}

Note that "count(\*)" is not an expression; it is a reference to a "column" 
produced by the sub-query. In my opinion, using a textual expression is very 
confusing, and we should not do it. Derived alias of {{count(\*)}} should be 
something not easily guessable, which will encourage users to use an alias:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(\*) AS c
  FROM emp
  GROUP BY deptno) AS t
WHERE t.c > 3
{code}

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664576#comment-15664576
 ] 

Julian Hyde commented on SOLR-8593:
---

Regarding the alias for "count(*)". I guess one approach is to extend Calcite 
to allow a pluggable alias derivation (it has to be pluggable because you can't 
please everyone). Another approach is to leave the aliases as they are but 
generate field names for the JSON result set. Note that if you call 
SqlNode.getParserPosition() on each item in the select clause it will tell you 
the start and end point of that expression in the original SQL string, so you 
can extract the "count(*)" using that information.

I don't think the following should be valid, but under your proposed change 
it would be:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*)
  FROM emp
  GROUP BY deptno) AS t
WHERE t."count(*)" > 3
{code}

Note that "count(\*)" is not an expression; it is a reference to a "column" 
produced by the sub-query. In my opinion, using a textual expression is very 
confusing, and we should not do it. Derived alias of {{count(\*)}} should be 
something not easily guessable, which will encourage users to use an alias:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*) AS c
  FROM emp
  GROUP BY deptno) AS t
WHERE t.c > 3
{code}

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-13 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15662280#comment-15662280
 ] 

Julian Hyde commented on SOLR-8593:
---

"count(\*)" is not a good derived column name, because it contains 
non-alphanumeric characters and is therefore not a valid identifier unless you 
enclose it in double-quotes. Therefore Calcite generates an alias that is a 
valid identifier.

I believe quite a few other databases do this.
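
A small sketch of the idea, under loud assumptions: the identifier rule below is a simplification (a letter or underscore followed by letters, digits or underscores), and the generated "expr_N" naming is invented for illustration. Calcite's real rules and generated names may differ.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; neither the rule nor the naming scheme is taken from Calcite.
final class DerivedAliasSketch {
  // Simplified rule for an identifier that can be written without double-quotes.
  private static final Pattern REGULAR_IDENTIFIER =
      Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  /** Returns true if the name could be used unquoted under the simplified rule. */
  static boolean isRegularIdentifier(String name) {
    return REGULAR_IDENTIFIER.matcher(name).matches();
  }

  /** Keeps the expression text if it is usable unquoted, else generates a name from the ordinal. */
  static String deriveAlias(String expressionText, int ordinal) {
    return isRegularIdentifier(expressionText) ? expressionText : "expr_" + ordinal;
  }
}
```

Under these assumptions a plain column reference such as deptno keeps its name, while count(*) receives a generated alias that needs no quoting.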

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-11 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658025#comment-15658025
 ] 

Julian Hyde edited comment on SOLR-8593 at 11/11/16 8:01 PM:
-

By the way, when you're ready, please add Solr to the [powered by 
Calcite|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 
for details.


was (Author: julianhyde):
By the way, when you're ready, please add Solr to the [powered 
by|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 for 
details.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-11 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658025#comment-15658025
 ] 

Julian Hyde commented on SOLR-8593:
---

By the way, when you're ready, please add Solr to the [powered 
by|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 for 
details.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-10-28 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616986#comment-15616986
 ] 

Julian Hyde commented on SOLR-8593:
---

Ah, I think I see what's going on. You're using avatica-1.9-SNAPSHOT with 
calcite-1.10. calcite-1.10 requires avatica-1.8, so you should use that. (Or is 
there a good reason why you need avatica-1.9?)

By the way, avatica-1.9 is less than a week from release. calcite-1.11 is maybe 
a month to six weeks away. The exact compatibility issues you describe are 
covered in CALCITE-1270 (and see the PR attached to that case).

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-10-28 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15616963#comment-15616963
 ] 

Julian Hyde commented on SOLR-8593:
---

Is there a Calcite issue logged for the AbstractMethodError relating to 
CalciteConnectionProperty? I see [others are running into the same 
problem|http://stackoverflow.com/questions/39318653/create-a-streaming-example-with-calcite-using-csv]
 and I want to document the solution (or fix the bug in Calcite/Avatica if it 
is a bug).

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378687#comment-15378687
 ] 

Julian Hyde commented on SOLR-8593:
---

The trickiest thing about CoGroup is that it aggregates (i.e. groups together) 
rows without collapsing them. So you need to be able to represent a nested set 
of rows. If Solr's evaluator can't handle nested rows then CoGroup will be 
tricky.

If you already have join and aggregate I'd stick with them.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.






[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-13 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376266#comment-15376266
 ] 

Julian Hyde commented on SOLR-8593:
---

You should probably model your join and aggregate operators as sub-classes of 
Join and Aggregate that understand the "distribution" trait. If you are doing, 
say, "group by x" then you will need your input either to be singleton (i.e. 
only one input stream) or partitioned on x. Calcite will be able to ensure that 
the input is partitioned appropriately, either because it is stored in 
partitions, or by applying a shuffle/exchange. 
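
A small Python sketch of why "group by x" needs its input partitioned on x (illustrative only; the helper names are hypothetical and nothing here is Calcite API). Running the aggregate independently in each partition gives the right answer when rows are hash-partitioned on x, and duplicate partial groups when they are spread round-robin.

```python
from collections import Counter

def hash_partition(rows, key, n):
    """Send each row to the partition chosen by hashing the grouping key."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts

def round_robin(rows, n):
    """Spread rows evenly, ignoring the grouping key."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def local_group_count(parts, key):
    """Run 'group by key, count(*)' in each partition and concatenate the results."""
    out = []
    for part in parts:
        out.extend(Counter(r[key] for r in part).items())
    return sorted(out)

rows = [{"x": v} for v in [1, 2, 1, 3, 2, 1]]
expected = sorted(Counter(r["x"] for r in rows).items())

partitioned = local_group_count(hash_partition(rows, "x", 2), "x")
unpartitioned = local_group_count(round_robin(rows, 2), "x")
# partitioned matches the global answer; unpartitioned contains split groups.
```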

There is the regular Exchange operator that changes the distribution (i.e. 
re-partitions) and there is SortExchange that changes the distribution and also 
sorts within each partition. SortExchange models what the shuffle does in 
MapReduce.
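
The difference between the two operators can be sketched in a few lines of Python (an illustration under assumed names, not the Calcite classes themselves): `exchange` only re-partitions, while `sort_exchange` re-partitions and then sorts inside each partition.

```python
def exchange(partitions, key, n):
    """Re-partition: route every row to the partition chosen by hashing `key`."""
    out = [[] for _ in range(n)]
    for part in partitions:
        for row in part:
            out[hash(row[key]) % n].append(row)
    return out

def sort_exchange(partitions, key, n):
    """Re-partition, then sort within each partition (what a MapReduce shuffle does)."""
    return [sorted(p, key=lambda r: r[key]) for p in exchange(partitions, key, n)]

# Two hypothetical input partitions with an integer column "a".
inputs = [[{"a": 3}, {"a": 0}], [{"a": 2}, {"a": 1}, {"a": 4}]]
shuffled = exchange(inputs, "a", 2)
sorted_shuffled = sort_exchange(inputs, "a", 2)
```

Both results hold the same rows in the same partitions; only the order within each partition differs.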

After you have a plan like

{noformat}
MyJoin[left.a = right.b]
  Exchange[a]
    MyAggregate
      Exchange
        Scan[T1]
  Exchange[b]
    Scan[T2]
{noformat}

you can turn it into MapReduce by making the consumer of each Exchange a 
reduce task, and the input to each Exchange a map task.
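
As a rough sketch of that cut, the Python below walks a plan tree and starts a new stage at every Exchange. The `Node` class and `split_stages` function are hypothetical, and the nesting of the plan (join over Exchange[a] over MyAggregate over Exchange over Scan[T1], plus Exchange[b] over Scan[T2]) is my reading of the plan above. The stage below an Exchange plays the map role for the stage above it.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    inputs: list = field(default_factory=list)

def split_stages(root):
    """Cut the plan at every Exchange; return stages as lists of operator names."""
    stages = []

    def visit(node, stage):
        if node.name.startswith("Exchange"):
            # Boundary: whatever feeds the Exchange runs in its own upstream stage.
            for child in node.inputs:
                upstream = []
                stages.append(upstream)
                visit(child, upstream)
        else:
            stage.append(node.name)
            for child in node.inputs:
                visit(child, stage)

    top = []
    stages.append(top)
    visit(root, top)
    return stages

plan = Node("MyJoin[left.a = right.b]", [
    Node("Exchange[a]", [
        Node("MyAggregate", [Node("Exchange", [Node("Scan[T1]")])]),
    ]),
    Node("Exchange[b]", [Node("Scan[T2]")]),
])
stages = split_stages(plan)
```

Here the join stage consumes two Exchanges, the aggregate stage consumes one and feeds another, and the two scans are pure map-side stages.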

I asked [~ashutoshc] how he would generate Hive MapReduce plans in Calcite 
(most Hive plans these days are Tez) and he said you should consider writing a 
CoGroup operator (like the one in Pig). CoGroup is powerful enough to implement 
both join and aggregate, so it might save you some effort.
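
To illustrate how one operator can cover both cases, here is a hedged Python sketch (the function names are invented; this is neither Pig's nor Calcite's implementation). `cogroup` emits, for each key, the bag of matching rows from each input. An inner join is then the per-key cross product of the two bags, and an aggregate is a cogroup with an empty second input whose bags are collapsed.

```python
from collections import defaultdict

def cogroup(left, right, left_key, right_key):
    """For each key, emit (key, bag of left rows, bag of right rows)."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[row[left_key]][0].append(row)
    for row in right:
        groups[row[right_key]][1].append(row)
    return [(k, ls, rs) for k, (ls, rs) in sorted(groups.items())]

def inner_join(left, right, left_key, right_key):
    """Join = cogroup, then cross product of the two bags for each key."""
    return [
        {**l, **r}
        for _, ls, rs in cogroup(left, right, left_key, right_key)
        for l in ls
        for r in rs
    ]

def count_by(rows, key):
    """Aggregate = cogroup with an empty second input, then collapse each bag."""
    return [(k, len(ls)) for k, ls, _ in cogroup(rows, [], key, key)]

# Hypothetical sample tables.
t1 = [{"a": 1, "v": "x"}, {"a": 2, "v": "y"}, {"a": 1, "v": "z"}]
t2 = [{"b": 1, "w": "p"}, {"b": 3, "w": "q"}]
joined = inner_join(t1, t2, "a", "b")
counts = count_by(t1, "a")
```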







[jira] [Commented] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-11 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371912#comment-15371912
 ] 

Julian Hyde commented on SOLR-8593:
---

Hi everyone! I'm VP of Apache Calcite. I only just noticed this JIRA case. I am 
excited that you are considering using Calcite. Please let me know if I can 
help.



