[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-06 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897978#comment-15897978
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/6/17 8:12 PM:
--

Ok, I added the assumption for the Turkish locale. I'm planning on resolving 
this ticket. If other issues come up with this the Apache Calcite integration 
we can open up a new issue.


was (Author: joel.bernstein):
Ok, I added the assumption for the Turkish locale. I'm planning on resolving 
this ticket. If other issue come up we can up a new issue.

> Integrate Apache Calcite into the SQLHandler
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
> Issue Type: Improvement
> Components: Parallel SQL
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Fix For: 6.5, master (7.0)
>
> Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
>
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
>
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-06 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897978#comment-15897978
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/6/17 8:12 PM:
--

Ok, I added the assumption for the Turkish locale. I'm planning on resolving 
this ticket. If other issues come up with the Apache Calcite integration we can 
open up a new issue.


was (Author: joel.bernstein):
Ok, I added the assumption for the Turkish locale. I'm planning on resolving 
this ticket. If other issues come up with this the Apache Calcite integration 
we can open up a new issue.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-04 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895795#comment-15895795
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/4/17 6:02 PM:
--

Ok, I left this ticket open until I fixed this. I'll tackle this shortly.


was (Author: joel.bernstein):
Ok, I left this ticket until I fixed this. I'll tackle this shortly.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-03 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894686#comment-15894686
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/3/17 4:41 PM:
--

This has now been backported for release in Solr 6.5.

The jira number was not on the first backport commit so I'll link to it here:
https://github.com/apache/lucene-solr/commit/3370dbed2e3e247a40012ab76aca059d640dfc80
This is a squash of the commits from master. Master still has the commit 
history from the feature branch. This squashed commit will be much easier to 
revert if need be.


was (Author: joel.bernstein):
This has now been backported for release in Solr 6.5.

The jira number was not on the first backport commit so I'll link to it here:
https://github.com/apache/lucene-solr/commit/3370dbed2e3e247a40012ab76aca059d640dfc80
This is a squash of the commits from master. Master still has the commit 
history from the feature branch.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-01 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890784#comment-15890784
 ] 

Julian Hyde edited comment on SOLR-8593 at 3/1/17 6:49 PM:
---

[~risdenk], regarding the Turkish locale issue: we have to explicitly pass 
user.timezone from Maven into Surefire (see 
[pom.xml|https://github.com/apache/calcite/blob/0372d23b847d4d145917dd786d1c9e3570cb8041/pom.xml#L733]),
so I suspect we'd have to do the same with the locale. Can you log a Calcite 
case, please? Even if we can't reproduce it, I'd rather that we tracked it.


was (Author: julianhyde):
[~risdenk], Regarding the Turkish locale issue. We have to explicitly pass 
user.timezone from maven into surefire, so I suspect we'd have to do the same 
with the locale. Can you log a Calcite case please? Even if we can't reproduce, 
I'd rather that we tracked it.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-03-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890431#comment-15890431
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/1/17 3:56 PM:
--

The backport is turning out to be trickier than I thought.

The reason is that jira/solr-8593 has lots of master commits in it that are not 
in branch_6x. So merging is not going to work.

Also getting an isolated list of the 80+ commits done in jira/solr-8593 for the 
Calcite integration is tricky because the original pull request is closed. 
Creating a new pull request against branch_6x includes all the master commits.

So I'm looking at ways to isolate all the commits to cherry-pick.


was (Author: joel.bernstein):
The backport is turning out be trickier then I thought.

The reason is that jira/solr-8593 has lost of master commits in it that are not 
in branch_6x. So merging is not going to work.

Also getting an isolated list of the 80+ commits done in jira/solr-8593 for the 
Calcite integration is tricky because the original pull request is closed. 
Creating a new pull request against branch_6x includes all the master commits.

So I'm looking at ways to isolate all the commits to cherry-pick.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-27 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886352#comment-15886352
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 7:19 PM:
---

Ok great, I will add this assumption to the SQL tests.
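
For reference, a minimal sketch of what such an assumption could check. This is 
not the code that was committed for this ticket; the class LocaleAssumption and 
the helper hasAsciiCasing are hypothetical names used only for illustration. The 
check is behavioural: it asks whether the locale maps 'I' and 'i' onto each 
other the ASCII way, which the Turkish locale does not.

```java
import java.util.Locale;

public class LocaleAssumption {

    // Hypothetical helper: true when the locale converts 'I' <-> 'i' the ASCII way.
    // Turkish casing rules map 'I' to dotless 'ı' and 'i' to dotted 'İ' instead.
    static boolean hasAsciiCasing(Locale locale) {
        return "I".toLowerCase(locale).equals("i")
            && "i".toUpperCase(locale).equals("I");
    }

    public static void main(String[] args) {
        Locale current = Locale.getDefault();
        if (!hasAsciiCasing(current)) {
            // Stand-in for a test-framework assumption: skip instead of failing.
            System.out.println("Skipping SQL tests under locale " + current);
            return;
        }
        System.out.println("Running SQL tests under locale " + current);
    }
}
```

In a JUnit 4 test the same boolean could be handed to 
org.junit.Assume.assumeTrue, so that the test is reported as skipped rather 
than failed when the randomized locale is Turkish.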


was (Author: joel.bernstein):
Ok, great I will add this assumption to the SQL tests.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-27 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886352#comment-15886352
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 7:19 PM:
---

Ok, great I will add this assumption to the SQL tests.


was (Author: joel.bernstein):
Ok, great I will this assumption to the SQL tests.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-27 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886263#comment-15886263
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 6:20 PM:
---

It appears that this issue is due to the Turkish locale.

https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/


I believe this is due to how Calcite is handling locales. [~risdenk], I'm curious 
whether you see the same thing, namely that this is in the Calcite code.

[~steve_rowe], if it does turn out this is a Calcite issue I'll log an issue on 
the Calcite site and submit a patch. In the meantime is there a way to suppress 
the Turkish locale in the tests?
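
To make the lowercase problem concrete, here is a small standalone sketch. It is 
not taken from the Calcite or Solr code; the class name TurkishLowercaseDemo, 
the helper lower, and the keyword LIMIT are arbitrary choices for illustration. 
It only demonstrates the standard Java behaviour described in the linked post: 
under Turkish casing rules an uppercase 'I' lowercases to the dotless 'ı' 
(U+0131), so a locale-sensitive toLowerCase no longer yields the ASCII string.

```java
import java.util.Locale;

public class TurkishLowercaseDemo {

    // Lowercases a SQL-style keyword using the casing rules of the given locale.
    static String lower(String keyword, Locale locale) {
        return keyword.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String keyword = "LIMIT";

        // Turkish rules turn each 'I' into dotless 'ı', so this comparison fails.
        System.out.println(lower(keyword, turkish).equals("limit"));

        // Locale.ROOT applies locale-neutral rules and produces plain ASCII.
        System.out.println(lower(keyword, Locale.ROOT).equals("limit"));
    }
}
```

Code that compares or looks up identifiers after a locale-sensitive case 
conversion can therefore behave differently when the JVM default locale is 
Turkish, which would be consistent with the test failures discussed here.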





was (Author: joel.bernstein):
It appears that this issue is due to the Turkish locale.

https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/


I believe this is due to how Calcite is handling locales. [~risdenk], curious 
is you see the same thing, which is that this in the Calcite code.

[~steve_rowe], if it does turn out this is a Calcite issue I'll log an issue on 
the Calcite site and submit a patch. In the meantime is there a way to suppress 
the Turkish locale in the tests?





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-21 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876162#comment-15876162
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/21/17 3:38 PM:
---

I was thinking about merging 
https://github.com/apache/lucene-solr/tree/jira/solr-8593 into branch_6x rather 
than cherry-picking from master. There is one commit that will need to be 
reverted because it's only valid in master, but that should be fairly easy to 
do, I think.



was (Author: joel.bernstein):
I was thinking about merging  
https://github.com/apache/lucene-solr/tree/jira/solr-8593 into branch_6x rather 
then cherry picking from master. There is one commit that will need to be 
reverted because it's only valid in master,  but that should fairly easy to do 
I think.



[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-08 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:16 PM:
--

It was a bit of an odyssey but I was able to push down the HAVING clause. I 
pushed the commits out with my latest work to:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I 
pushed the commits out with my latest work to:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-08 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:10 PM:
--

It was a bit of an odyssey but the I was able to push down the HAVING clause. I 
pushed the commits out with my latest work to:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I 
pushed the commits out with my latest work:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-08 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:08 PM:
--

It was a bit of an odyssey but the I was able to push down the HAVING clause. I 
pushed the commits out with my latest work:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I 
pushed the commits out to with my latest work:
https://github.com/apache/lucene-solr/tree/jira/solr-8593


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-06 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:59 PM:
--

After adjusting the SolrFilterRule to allow filters on expressions, the having 
filter now shows up above the SolrAggregate:

SolrFilter
  SolrAggregate
    SolrFilter
      SolrTableScan

The first SolrFilter listed is the HAVING clause and the second SolrFilter is the 
WHERE clause.

The next step is to recognize the different types of filters and build a 
HavingStream for the having filter. I'll be working on plugging this in.
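
One way to picture "recognizing the different types of filters" is the rough 
sketch below. It is only an illustration under stated assumptions: the Node 
class and the classify and examplePlan methods are hypothetical and do not use 
Calcite's RelNode API or the actual Solr rule code. The node names are the ones 
from the plan in this comment. The heuristic assumed here is that a filter 
whose direct input is the aggregate acts as the HAVING filter, while any other 
filter is treated as a WHERE filter.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterKinds {

    // Hypothetical stand-in for a plan node: a name plus at most one input.
    static final class Node {
        final String name;
        final Node input;

        Node(String name, Node input) {
            this.name = name;
            this.input = input;
        }
    }

    // Walks the plan from the root down and labels every filter node.
    // Assumption: a filter sitting directly above an aggregate is the HAVING filter.
    static List<String> classify(Node root) {
        List<String> labels = new ArrayList<>();
        for (Node n = root; n != null; n = n.input) {
            if (!n.name.equals("SolrFilter")) {
                continue;
            }
            boolean aboveAggregate =
                n.input != null && n.input.name.equals("SolrAggregate");
            labels.add(aboveAggregate ? "HAVING" : "WHERE");
        }
        return labels;
    }

    // Builds the four-node plan quoted in the comment, root first.
    static Node examplePlan() {
        return new Node("SolrFilter",
            new Node("SolrAggregate",
                new Node("SolrFilter",
                    new Node("SolrTableScan", null))));
    }

    public static void main(String[] args) {
        // Prints the labels for the two filters, root-most first.
        System.out.println(classify(examplePlan()));
    }
}
```

For the plan quoted here this labels the upper filter HAVING and the lower one 
WHERE. A real implementation would also have to deal with plans where the 
optimizer merges or reorders filters, which this sketch ignores.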





was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions, the having 
filter now shows up above the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed is the Having clause and second SolrFilter is the 
WHERE clause.

The next step is to recognize the different types of filters and build a 
HavingStream for having filter. I'll be working on plugging this in.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-06 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:58 PM:
--

After adjusting the SolrFilterRule to allow filters on expressions, the having 
filter now shows up above the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed in the Having clause and second SolrFilter is WHERE 
clause.

The next step is to recognize the different types of filters and build a 
HavingStream for having filter. I'll be working on plugging this in.





was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions the having 
filter now shows up following the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed in the Having clause and second SolrFilter is WHERE 
clause.

The next step is to recognize the different types of filters and build a 
HavingStream for having filter. I'll be working on plugging this in.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-06 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:58 PM:
--

After adjusting the SolrFilterRule to allow filters on expressions, the having 
filter now shows up above the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed is the Having clause and second SolrFilter is the 
WHERE clause.

The next step is to recognize the different types of filters and build a 
HavingStream for having filter. I'll be working on plugging this in.





was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions, the having 
filter now shows up above the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed in the Having clause and second SolrFilter is WHERE 
clause.

The next step is to recognize the different types of filters and build a 
HavingStream for having filter. I'll be working on plugging this in.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849160#comment-15849160
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/2/17 12:16 AM:
---

Hi, [~julianhyde],

The exact problem I'm seeing is that the SolrSort is not included in the query 
plan unless an ORDER BY is used in the query.

With the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrSort
org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 


Without the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 



was (Author: joel.bernstein):
Hi, [~julianhyde],

The exact problem I'm seeing is that the SolrSort is not included in the query 
plan unless an ORDER BY is used in the query.

With the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrSort
org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 


Without the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 



[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849160#comment-15849160
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/2/17 12:17 AM:
---

Hi [~julianhyde],

The exact problem I'm seeing is that the SolrSort is not included in the query 
plan unless an ORDER BY is used in the query.

With the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrSort
org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 


Without the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 



was (Author: joel.bernstein):
Hi, [~julianhyde],

The exact problem I'm seeing is that the SolrSort is not included in the query 
plan unless an ORDER BY is used in the query.

With the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrSort
org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 


Without the ORDER BY our tree looks like this:

org.apache.solr.handler.sql.SolrProject
org.apache.solr.handler.sql.SolrTableScan 



[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849145#comment-15849145
 ] 

Julian Hyde edited comment on SOLR-8593 at 2/2/17 12:05 AM:


Not sure I understand. The query {{select a from b limit 10}} will have a 
{{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will 
be translated to a {{SolrSort}} with similar attributes. The sort is trivial - 
that is, you don't need to do any work to sort on 0 fields - but you do need to 
apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but 
maybe convert it into a {{SolrLimit}} if you have such a thing.

You may be wondering why we combine sort and limit into the same operator. But 
remember that relational data sets are inherently unordered, so we have to do 
them at the same time. Sort with an empty key has reasonable semantics, just as 
-- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from 
emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a 
reasonable generalization of Aggregate.
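
A minimal sketch of the point about empty sort keys, assuming a sort node is modeled as a list of key fields plus an optional fetch. The names SortNode and apply_sort are hypothetical and only mirror the idea, not Calcite's Sort class:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class SortNode:
    keys: List[str]        # sort fields; empty for a bare LIMIT
    fetch: Optional[int]   # LIMIT value, or None when no limit was given


def apply_sort(rows: List[Dict[str, Any]], node: SortNode) -> List[Dict[str, Any]]:
    """Apply a sort node to rows; an empty key list skips sorting but keeps the limit."""
    result = list(rows)
    if node.keys:
        result.sort(key=lambda r: tuple(r[k] for k in node.keys))
    if node.fetch is not None:
        # The limit applies whether or not any sorting happened.
        result = result[: node.fetch]
    return result


rows = [{"a": 3}, {"a": 1}, {"a": 2}]
print(apply_sort(rows, SortNode(keys=[], fetch=2)))     # bare LIMIT
print(apply_sort(rows, SortNode(keys=["a"], fetch=2)))  # ORDER BY a LIMIT 2
```

The design choice here is that the node is never dropped just because its key list is empty; only the sorting step is skipped.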


was (Author: julianhyde):
Not sure I understand. The query {{select a from b limit 10}} will have a 
{{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will 
be translated to a {{SolrSort}} with similar attributes. The sort is trivial - 
that is, you don't need to do any work to sort on 0 fields - but you do need to 
apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but 
maybe convert into a {{SolrLimit}} if you have such a thing.

You may be wondering why we combine sort and limit into the same operator. But 
remember that relational data sets are inherently unordered, so we have to do 
them at the same time. Sort with an empty key has reasonable semantics, just as 
-- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from 
emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a 
reasonable generalization of Aggregate.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 8:16 PM:
--

We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

The issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?

Here is the code where we create the rule:
https://github.com/apache/lucene-solr/blob/jira/solr-8593/solr/core/src/java/org/apache/solr/handler/sql/SolrRules.java#L184
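
A small diagnostic sketch for the two scenarios above, assuming the plan is modeled as a root-to-leaf list of dicts. The dict layout and the find_fetch helper are hypothetical; the second plan reflects what we currently observe, where no sort node reaches our code:

```python
def find_fetch(plan):
    """Return the fetch (LIMIT) carried by the first sort node in the plan, else None."""
    for node in plan:
        if node["op"] == "sort" and node.get("fetch") is not None:
            return node["fetch"]
    return None


# select a from b order by a limit 10
with_order_by = [
    {"op": "sort", "keys": ["a"], "fetch": 10},
    {"op": "project"},
    {"op": "scan"},
]

# select a from b limit 10 (as observed: the sort node is missing)
without_order_by = [
    {"op": "project"},
    {"op": "scan"},
]

print(find_fetch(with_order_by))
print(find_fetch(without_order_by))
```

A check like this in a test would flag any query whose LIMIT never makes it into the translated plan.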






was (Author: joel.bernstein):
We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

The issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:51 PM:
--

We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

The issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




was (Author: joel.bernstein):
We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

This issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:50 PM:
--

We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

The issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




was (Author: joel.bernstein):
We've run into a bug in with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

This issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:50 PM:
--

We've run into a bug with pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

The issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




was (Author: joel.bernstein):
We've run into a bug iwith pushing down of the SQL LIMIT clause. Currently we 
have code that handles the limit in the SolrSort class which extends 
org.apache.calcite.rel.core.Sort.

This issue is that the SolrSort rule is only executed if an ORDER BY clause is 
included. So if LIMIT is used without an ORDER BY our code does not see the 
LIMIT.

So limit works in this scenario:

*select a from b order by a limit 10*

but not this scenario:

*select a from b limit 10*

[~julianhyde], any ideas on how to resolve this?




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848522#comment-15848522
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 3:36 PM:
--

I pulled the most recent work on this branch and precommit is passing. I'll be 
doing manual testing today and also running the full test suite. It seems like 
we are getting very close to committing this. I haven't seen the protobuf issue 
that [~julianhyde] mentioned above. But I'll keep testing and see if anything 
pops up. 

[~risdenk], any preference on who does the merge/commit to master on this?


was (Author: joel.bernstein):
I pulled the most recent work on this branch and precommit is passing. I'll be 
doing manual testing today and also running the full test suite. It seems like 
we are getting very close to committing this. I haven't seen the protobuf issue 
that [~julianhyde] mentioned above. But I'll keep testing and see if anything 
pops up. 

[~risdenk], any preference on who does the commit on this?

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
> Attachments: SOLR-8593.patch, SOLR-8593.patch
>
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-02-01 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848522#comment-15848522
 ] 

Joel Bernstein edited comment on SOLR-8593 at 2/1/17 3:35 PM:
--

I pulled the most recent work on this branch and precommit is passing. I'll be 
doing manual testing today and also running the full test suite. It seems like 
we are getting very close to committing this. I haven't seen the protobuf issue 
that [~julianhyde] mentioned above. But I'll keep testing and see if anything 
pops up. 

[~risdenk], any preference on who does the commit on this?


was (Author: joel.bernstein):
I pulled the most recent work on this branch and precommit is passing. I'll be 
doing manual testing today and also running the full test suite. It seems like 
we getting very close to committing this. I haven't seen the issue that 
[~julianhyde] mentioned above. But I'll keep testing and see if anything pops 
up. 

[~risdenk], any preference on who does the commit on this?


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-01-03 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796107#comment-15796107
 ] 

Joel Bernstein edited comment on SOLR-8593 at 1/3/17 8:33 PM:
--

It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently 
planning to take the HavingStream from this branch and commit it with SOLR-8530 
for Solr 6.4.

This is going to cause merge conflicts between jira/solr-8593 and master. As 
soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 
to master and work out the conflicts. So Calcite will be in master at the very 
beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code 
in master and perform the backport to branch_6x when ready.


was (Author: joel.bernstein):
It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently 
planning to take the HavingStream from this branch and commit it with SOLR-8530 
for Solr 6.4

This is going to cause merge conflicts between jira/solr-8593  and master. As 
soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593  
to master and work out the conflicts. So Calcite will be in master at very 
beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code 
more in master and perform the backport to branch_6x when ready.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2017-01-03 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796107#comment-15796107
 ] 

Joel Bernstein edited comment on SOLR-8593 at 1/3/17 8:32 PM:
--

It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently 
planning to take the HavingStream from this branch and commit it with SOLR-8530 
for Solr 6.4.

This is going to cause merge conflicts between jira/solr-8593 and master. As 
soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 
to master and work out the conflicts. So Calcite will be in master at the very 
beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code 
more in master and perform the backport to branch_6x when ready.


was (Author: joel.bernstein):
It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently 
planning to take the HavingStream from this branch an commit it with SOLR-8530 
for Solr 6.4

This is going to cause merge conflicts between jira/solr-8593  and master. As 
soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593  
to master and work out the conflicts. So Calcite will be in master at very 
beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code 
more in master and perform the backport to branch_6x when ready.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-23 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773344#comment-15773344
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/23/16 5:41 PM:


I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically, with Calcite performing the 
functions if we returned data in the fields.

I now believe that is not the case, because we've pushed down the *projection*. 
As a result, we'll need to implement the arithmetic and string functions using 
the *SelectStream*. What Calcite provides in the project rule is access to the 
parse tree, so we have enough information to implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.
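
A rough sketch of what implementing these functions over a pushed-down projection could look like, assuming each projected column comes with a small expression tree. The tuple-based tree shape, the function tables, and the evaluate/project names are all hypothetical and are not the SelectStream or Calcite API:

```python
import operator

# Hypothetical function tables; a real implementation would cover far more.
ARITHMETIC = {"+": operator.add, "-": operator.sub, "*": operator.mul}
STRING = {"upper": str.upper, "lower": str.lower,
          "concat": lambda *parts: "".join(parts)}


def evaluate(expr, row):
    """Evaluate one expression tree against a row (a dict of field values)."""
    kind = expr[0]
    if kind == "field":
        return row[expr[1]]
    if kind == "literal":
        return expr[1]
    if kind == "call":
        _, name, args = expr
        fn = ARITHMETIC.get(name) or STRING.get(name)
        if fn is None:
            raise ValueError("unsupported function: " + name)
        return fn(*[evaluate(arg, row) for arg in args])
    raise ValueError("unknown expression kind: " + str(kind))


def project(rows, projections):
    """Yield one output row per input row; projections maps output name to expression."""
    for row in rows:
        yield {name: evaluate(expr, row) for name, expr in projections.items()}


rows = [{"name": "solr", "price": 2, "qty": 3}]
projections = {
    "total": ("call", "*", [("field", "price"), ("field", "qty")]),
    "label": ("call", "upper", [("field", "name")]),
}
print(list(project(rows, projections)))
```

The point of the sketch is only that the expression tree carries enough information to compute each output field per tuple once the projection is no longer evaluated by Calcite.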




was (Author: joel.bernstein):
I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically and Calcite would perform the 
functions if we returned data in the fields.

I now believe that is not the case, because we've pushed down the projection. 
Because we've pushed down the *projection* we'll need to implement the 
arithmetic and string functions using the *SelectStream*. What Calcite provides 
in the project rule is access to the parse tree so we have enough information 
to implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-23 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773344#comment-15773344
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/23/16 5:40 PM:


I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically, with Calcite applying the 
functions as long as we returned data in the fields.

I now believe that is not the case, because we've pushed down the projection. 
With the *projection* pushed down, we'll need to implement the arithmetic 
and string functions ourselves using the *SelectStream*. What Calcite provides 
in the project rule is access to the parse tree, so we have enough information 
to implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.




was (Author: joel.bernstein):
I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically and Calcite would perform the 
functions if we returned data in the fields.

I now believe that is not the case, because we've pushed down the projection. 
Because we've pushed down the projection we'll need to implement the arithmetic 
and string functions using the SelectStream. What Calcite provides in the 
project rule is access to the parse tree so we have enough information to 
implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-23 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773344#comment-15773344
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/23/16 5:38 PM:


I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically, with Calcite applying the 
functions as long as we returned data in the fields.

I now believe that is not the case, because we've pushed down the projection. 
With the projection pushed down, we'll need to implement the arithmetic 
and string functions ourselves using the SelectStream. What Calcite provides 
in the project rule is access to the parse tree, so we have enough information 
to implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.




was (Author: joel.bernstein):
I think I have a handle now on how the *string* and *arithmetic* functions 
work. I was expecting them to work automatically and Calcite would perform the 
functions if we returned the fields.

I now believe that is not the case, because we've pushed down the projection. 
Because we've pushed down the projection we'll need to implement the arithmetic 
and string functions using the SelectStream. What Calcite provides in the 
project rule is access to the parse tree so we have enough information to 
implement the functions.

Since this ticket was mainly about getting parity with the current SQL 
functionality, I think it makes sense to tackle the string and arithmetic 
functions in a separate ticket. I will create that ticket.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-22 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15770997#comment-15770997
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/22/16 8:18 PM:


[~risdenk], I see you've added some special logic to get the *cast* function 
working. It's not clear to me though what the code is actually doing. Can you 
explain what's happening in the SolrRules.RexToSolrTranslator class with the 
cast function?

If possible I'd like to hook up the other string and arithmetic functions.


was (Author: joel.bernstein):
[~risdenk], I see you've added some special logic to get the *cast* function 
working. It's not clear to though exactly what's doing. Can you explain what's 
happening in the SolrRules.RexToSolrTranslator class with the cast function?

If possible I'd like to hook up the other string and arithmetic functions.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-19 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15761872#comment-15761872
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/19/16 6:13 PM:


It wasn't clear to me that we were using Avatica yet. It seemed we were using 
org.apache.calcite imports. Are there dependencies within Calcite core on 
Avatica?


was (Author: joel.bernstein):
It wasn't clear to me that we were using Avatica yet?  It seemed we were using 
org.apache.calcite imports. Are their dependencies within Caclite core on 
Avatica?


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-19 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15761834#comment-15761834
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/19/16 5:57 PM:


Are we blocked until Calcite 0.11.0 releases or is there precedent for using a 
snapshot release?

With the protobuf upgrade I think as long as all tests are passing it should be 
ok to upgrade. 


was (Author: joel.bernstein):
 Are blocked until Calcite 0.11.0 releases or is there president for using a 
snapshot release?

With the protobuf upgrade I think as long all tests are passing it should be ok 
to upgrade. 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753387#comment-15753387
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/16/16 4:21 AM:
-

Would it be correct to say that you have a physical operator which is a 
combination of Aggregate and TopN? This physical operator would have a sorted 
list of grouping fields and also a parameter N (which affects the cost 
estimate). Maybe it's a sub-class of Aggregate with some extra fields. It could 
be created by a planner rule that matches a Sort (with limit) on top of an 
Aggregate and also looks at estimated cardinality of the fields in order to 
sort them.
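
A minimal sketch of such a rule, written in Python rather than against Calcite's planner API. All node classes here (`Scan`, `Aggregate`, `Sort`, `AggregateTopN`) and the function `fuse_sort_limit_aggregate` are hypothetical, and ordering the grouping fields by ascending estimated cardinality is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass(frozen=True)
class Scan:
    table: str


@dataclass(frozen=True)
class Aggregate:
    input: "Node"
    group_fields: Tuple[str, ...]
    agg_calls: Tuple[str, ...]


@dataclass(frozen=True)
class Sort:
    input: "Node"
    sort_keys: Tuple[str, ...]
    limit: Optional[int] = None


@dataclass(frozen=True)
class AggregateTopN:
    """Hypothetical physical operator: an Aggregate that also keeps only the top N rows."""
    input: "Node"
    group_fields: Tuple[str, ...]  # ordered list of grouping fields
    agg_calls: Tuple[str, ...]
    sort_keys: Tuple[str, ...]
    n: int                         # row limit, which would feed the cost estimate


Node = Union[Scan, Aggregate, Sort, AggregateTopN]


def fuse_sort_limit_aggregate(node: Node, cardinality: Dict[str, int]) -> Node:
    """Rewrite Sort(limit=N) over Aggregate into a single AggregateTopN.

    Any other plan shape is returned unchanged, as a rule that does not match.
    """
    if (
        isinstance(node, Sort)
        and node.limit is not None
        and isinstance(node.input, Aggregate)
    ):
        agg = node.input
        # Assumption: order grouping fields by ascending estimated cardinality;
        # fields without an estimate sort last, ties keep their original order.
        ordered = tuple(
            sorted(agg.group_fields, key=lambda f: cardinality.get(f, float("inf")))
        )
        return AggregateTopN(agg.input, ordered, agg.agg_calls, node.sort_keys, node.limit)
    return node
```

A Sort without a limit is left alone in this sketch, since only the limited case corresponds to a top-N operator.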


was (Author: julianhyde):
Would it be correct to say that you have a physical operator which is a 
combination of Aggregate and TopN? This physical operator would have a sorted 
list of grouping fields and also a parameter N (which affects the cost 
estimate). Maybe it's a sub-class of Aggregate with some extra fields.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:15 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{0, 1}} represents {{a, b}} because that is the 
physical order of the columns.
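
The following small sketch (Python, not Calcite code) illustrates why the grouping order is lost in the Aggregate and how it can still be read back from the Project on top. The function names are hypothetical, and the model assumes a single `COUNT(*)` aggregate placed after the group columns.

```python
from typing import List, Sequence, Tuple


def plan_group_by(
    table_fields: Sequence[str],
    select_list: Sequence[str],
    group_by: Sequence[str],
) -> Tuple[List[int], List[int]]:
    """Return (group_set, project) for a simple GROUP BY query.

    group_set holds field ordinals in physical column order, because it is
    modeled as a set. project lists, for each select item, its position in
    the Aggregate output (group columns first, then COUNT(*)).
    """
    ordinal = {name: i for i, name in enumerate(table_fields)}
    group_set = sorted({ordinal[name] for name in group_by})
    agg_output = [table_fields[i] for i in group_set] + ["COUNT(*)"]
    project = [agg_output.index(item) for item in select_list]
    return group_set, project


def grouping_order_from_project(
    table_fields: Sequence[str],
    group_set: Sequence[int],
    project: Sequence[int],
) -> List[str]:
    """Recover the grouping fields in select-list order from the Project."""
    return [table_fields[group_set[i]] for i in project if i < len(group_set)]
```

For table fields `a, b, c, d` and the query grouping on `b, a`, this yields the group set `[0, 1]` and the projection `[1, 0, 2]`, matching the plan above, and reading the projection back gives `b, a`.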

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{ \{0, 1\} }} represents {{ \{a, b\} }} because that 
is the physical order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:14 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {{ \{0, 1\} }} represents {{ \{a, b\} }} because that 
is the physical order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805
 ] 

Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:13 PM:
--

I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


was (Author: julianhyde):
I wasn't familiar with faceting, but I quickly read 
https://wiki.apache.org/solr/SolrFacetingOverview.

Suppose table T has fields a, b, c, d, and you want to do a faceted search on 
b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} 
then you will end up with

{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}

and as you correctly say, {0, 1} represents {a, b} because that is the physical 
order of the columns.

Can you explain why the faceting algorithm is interested in the order of the 
columns? Is it because it needs to produce the output ordered or nested on 
those columns? If so, we can rephrase the SQL query so that we are accurately 
expressing in relational algebra what we need.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752570#comment-15752570
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/15/16 9:33 PM:


I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All tests in TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is handled 
in Calcite. From what I can see there isn't an easy way to get the ordering of 
the GROUP BY fields preserved from the query. The Solr faceting implementation 
requires the correct order of the GROUP BY fields to return a correct response. 
So, I'm getting the ordering from the field list of the query instead. This may 
actually be the correct approach from a SQL standpoint but I was wondering what 
Julian thought about this issue.


 


was (Author: joel.bernstein):
I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All tests in TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is dealt 
with in Calcite. From what I can see there isn't an easy way to get the 
ordering of the GROUP BY fields preserved from the query. The Solr faceting 
implementations requires the correct order of the GROUP BY fields to return a 
correct response. So, I'm getting the ordering from the field list of the query 
instead. This may actually be the correct approach from a SQL standpoint but I 
was wondering what Julian thought about this issue.


 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752570#comment-15752570
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/15/16 9:32 PM:


I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All tests in TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is dealt 
with in Calcite. From what I can see there isn't an easy way to get the 
ordering of the GROUP BY fields preserved from the query. The Solr faceting 
implementation requires the correct order of the GROUP BY fields to return a 
correct response. So, I'm getting the ordering from the field list of the query 
instead. This may actually be the correct approach from a SQL standpoint but I 
was wondering what Julian thought about this issue.


 


was (Author: joel.bernstein):
I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All test is TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is dealt 
with in Calcite. From what I can see there isn't an easy way to get the 
ordering of the GROUP BY fields preserved from the query. The Solr faceting 
implementations requires the correct order of the GROUP BY fields to return a 
correct response. So, I'm getting the ordering from the field list of the query 
instead. This may actually be the correct approach from a SQL standpoint but I 
was wondering what Julian thought about this issue.


 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-15 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752570#comment-15752570
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/15/16 9:32 PM:


I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All tests in TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is dealt 
with in Calcite. From what I can see there isn't an easy way to get the 
ordering of the GROUP BY fields preserved from the query. The Solr faceting 
implementation requires the correct order of the GROUP BY fields to return a 
correct response. So, I'm getting the ordering from the field list of the query 
instead. This may actually be the correct approach from a SQL standpoint but I 
was wondering what Julian thought about this issue.
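
To illustrate the workaround described above, here is a minimal Python sketch (not the SolrTable code) that orders an unordered set of GROUP BY fields by their position in the SELECT field list. The function name and the sample field names are hypothetical and only serve the example.

```python
def ordered_group_by(select_fields, group_by_fields):
    """Order the GROUP BY fields by where they appear in the SELECT list.

    select_fields:   field list of the query, in the order it was written.
    group_by_fields: GROUP BY fields whose original order is not available.
    """
    wanted = set(group_by_fields)
    # Keep only the select-list entries that are GROUP BY fields, in list order.
    ordered = [field for field in select_fields if field in wanted]
    missing = wanted - set(ordered)
    if missing:
        # A GROUP BY field that is not selected cannot be ordered this way.
        raise ValueError("GROUP BY fields not in the field list: %s" % sorted(missing))
    return ordered


if __name__ == "__main__":
    print(ordered_group_by(["year_i", "month_i", "count(*)"], {"month_i", "year_i"}))
```

The sketch raises an error when a GROUP BY field is absent from the field list, since in that case the field list gives no ordering information for it.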


 


was (Author: joel.bernstein):
I just pushed out a commit to the solr/jira-8593 branch:

https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f

This is a pretty large refactoring of the SolrTable class which includes 
implementations for aggregationMode=facet for both GROUP BY aggregations and 
SELECT DISTINCT aggregations.

All tests in the TestSQLHandler are passing.

There is only one thing that I'm not quite happy about in this patch which is 
specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the 
issue. The specific issue deals with how the set of GROUP BY fields is dealt 
with in Calcite. From what I can see there isn't an easy way to get the 
ordering of the GROUP BY fields preserved from the query. The Solr faceting 
implementation requires the correct order of the GROUP BY fields to return a 
correct response. So, I'm getting the ordering from the field list of the query 
instead. This may actually be the correct approach from a SQL standpoint but I 
was wondering what Julian thought about this issue.


 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:17 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.
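
As an illustration of descending through a fully nested AND/OR predicate, the following Python sketch walks a small predicate tree recursively and emits a Lucene-style query string. The `Eq` and `BoolOp` node types are hypothetical stand-ins made up for this example; they are not Calcite's or Solr's classes, and the real rewrite handles more cases.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Eq:
    """Leaf predicate: field = value."""
    field: str
    value: str


@dataclass(frozen=True)
class BoolOp:
    """Inner node: AND/OR over any number of child predicates."""
    op: str
    operands: Tuple["Node", ...]


Node = Union[Eq, BoolOp]


def to_lucene(node):
    """Recursively translate a predicate tree into a Lucene-style query string."""
    if isinstance(node, Eq):
        return f"{node.field}:{node.value}"
    if isinstance(node, BoolOp):
        if node.op not in ("AND", "OR"):
            raise ValueError(f"unsupported operator: {node.op}")
        # Recurse into every operand so nested AND/OR groups are not skipped.
        inner = f" {node.op} ".join(to_lucene(child) for child in node.operands)
        return f"({inner})"
    raise TypeError(f"unknown node type: {type(node).__name__}")


if __name__ == "__main__":
    tree = BoolOp("AND", (Eq("a", "1"), BoolOp("OR", (Eq("b", "2"), Eq("c", "3")))))
    print(to_lucene(tree))
```

Parenthesizing every boolean group keeps the nesting of the SQL predicate explicit in the generated query.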

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.
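
For illustration, the Python sketch below shows the kind of result-set naming this implies. It assumes that an unaliased expression is named `EXPR$` followed by its zero-based position in the select list, which is consistent with the EXPR$1 mentioned above for a second column; the helper name and the field names are hypothetical.

```python
import re

# A bare column reference such as "str_s" keeps its own name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def output_names(select_items):
    """Return result-set column names for (expression, alias) pairs.

    Assumed naming rule for this sketch: use the alias if present, the
    column name for a plain column, otherwise EXPR$<ordinal>.
    """
    names = []
    for ordinal, (expression, alias) in enumerate(select_items):
        if alias:
            names.append(alias)
        elif _IDENTIFIER.match(expression):
            names.append(expression)
        else:
            names.append(f"EXPR${ordinal}")
    return names


if __name__ == "__main__":
    # SELECT str_s, count(*), sum(field_i) AS total ...
    print(output_names([("str_s", None), ("count(*)", None), ("sum(field_i)", "total")]))
```

Under that assumption, a client that previously read the `count(*)` key from a result tuple would read `EXPR$1` instead, or use an explicit alias in the query.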

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves with our 
Calcite implementation. As [~julianhyde] mentioned, we should see DISTINCT 
queries as aggregate queries so it's possible we'll have all the code in place 
to push this to Solr already.
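
The equivalence between DISTINCT and aggregate queries can be checked with a small Python sketch. It uses the standard-library sqlite3 module purely as a convenient SQL engine (it is not Solr or Calcite), and the table and column names are made up for the example.

```python
import sqlite3


def distinct_and_group_by_rows():
    """Run the same query as SELECT DISTINCT and as GROUP BY and return both results."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE docs (a TEXT, b INTEGER)")
        conn.executemany(
            "INSERT INTO docs VALUES (?, ?)",
            [("x", 1), ("x", 1), ("x", 2), ("y", 1)],
        )
        distinct_rows = sorted(conn.execute("SELECT DISTINCT a, b FROM docs").fetchall())
        grouped_rows = sorted(conn.execute("SELECT a, b FROM docs GROUP BY a, b").fetchall())
        return distinct_rows, grouped_rows
    finally:
        conn.close()


if __name__ == "__main__":
    distinct_rows, grouped_rows = distinct_and_group_by_rows()
    print(distinct_rows == grouped_rows)
```

Because both forms return the same set of rows, a planner can treat SELECT DISTINCT as a GROUP BY over the selected fields with no aggregate functions.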




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:16 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:15 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-12-05 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648
 ] 

Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:15 PM:
---

I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




was (Author: joel.bernstein):
I wanted to give an update on my work on this ticket.

I've started working my way through the test cases (TestSQLHandler). I'm 
working through each assertion in each method to understand the differences 
between the current release and the work done in this patch, and making 
changes/fixes as I go.

The first change that I made was in how the predicate is being traversed. The 
current patch doesn't descend through a full nested AND/OR predicate. So I made 
a few changes to how the tree is walked. I also changed some of the logic for 
how the query is re-written to a Lucene/Solr query so that it matches the 
current implementation.

I've now moved on to aggregate queries. I've been investigating the use of 
EXPR$1 ... instead of using the *function signature* in the result set. It 
looks like we'll have to use the Calcite expression identifiers going forward, 
which should be OK. I think this is cleaner anyway because looking up fields by 
a function signature can get cumbersome. We'll just need to document this in 
the CHANGES.txt.

The next step for me is to implement the aggregationMode=facet logic for 
aggregate queries. After that I'll push out my changes to this branch. 

Then I'll spend some time investigating how SELECT DISTINCT behaves in our 
implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as 
aggregate queries so it's possible we'll have all the code in place to push 
this to Solr already.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:38 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

*having*(streamExpr, *and*(*eq*(field1, value1), *not*(*eq*(field2, value2))))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.
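
The behavior described above can be sketched in a few lines of Python. This is only a model of the idea, not the planned Java classes: tuples are plain dicts, each boolean operation is a function from a tuple to True/False, and the names `and_`, `or_` and `not_` carry a trailing underscore only because `and`, `or` and `not` are reserved words in Python.

```python
def eq(field, value):
    """True when the tuple's field equals the value."""
    return lambda tup: tup.get(field) == value


def lt(field, value):
    """True when the field is present and less than the value."""
    return lambda tup: field in tup and tup[field] < value


def gt(field, value):
    """True when the field is present and greater than the value."""
    return lambda tup: field in tup and tup[field] > value


def and_(*operations):
    return lambda tup: all(op(tup) for op in operations)


def or_(*operations):
    return lambda tup: any(op(tup) for op in operations)


def not_(operation):
    return lambda tup: not operation(tup)


def having(stream, operation):
    """Read tuples from the underlying stream and emit those that pass the operation."""
    for tup in stream:
        if operation(tup):
            yield tup


if __name__ == "__main__":
    tuples = [
        {"field1": "a", "field2": 1},
        {"field1": "a", "field2": 2},
        {"field1": "b", "field2": 1},
    ]
    # Mirrors having(streamExpr, and(eq(field1, a), not(eq(field2, 2))))
    for tup in having(tuples, and_(eq("field1", "a"), not_(eq("field2", 2)))):
        print(tup)
```

Writing `having` as a generator keeps it streaming: each tuple is tested and emitted as it is read, so the filter can run wherever the underlying stream runs.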





was (Author: joel.bernstein):
One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

*having*(streamExpr, *and*(*eq*(field1, value1), *eq*(field2, value2)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:37 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

*having*(streamExpr, *and*(*eq*(field1, value1), *eq*(field2, value2)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





was (Author: joel.bernstein):
One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

having(streamExpr, and(eq(field1, value1), eq(field2, value2)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:36 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

having(streamExpr, and(eq(field1, value1), eq(field2, value2)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





was (Author: joel.bernstein):
One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

having(streamExpr, and(eq(fieldName, value)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:34 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

having(streamExpr, and(eq(fieldName, value)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





was (Author: joel.bernstein):
One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThanOperation
GreaterThanOperation

Syntax:

having(streamExpr, and(equals(fieldName, value)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:33 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My plan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThenOperation
GreaterThenOperation

Syntax:

having(streamExpr, and(equals(fieldName, value)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





was (Author: joel.bernstein):
One of things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My pan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThenOperation
GreaterThenOperation

Syntax:

having(streamExpr, and(equals(fieldName, value))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:32 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.

My pan is to implement the following classes for the having logic:

HavingStream  
BooleanOperation
AndOperation
OrOperation
NotOperation
EqualsOperation
LessThenOperation
GreaterThenOperation

Syntax:

having(streamExpr, and(equals(fieldName, value)))

The having function will read the Tuples from the streamExpr and apply the 
boolean operation to each Tuple.

If the boolean operation returns true the having stream will emit the Tuple.





was (Author: joel.bernstein):
One of things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:24 PM:


One of the things that is also not specifically handled is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.


was (Author: joel.bernstein):
One of things that is also not specifically handing is the HAVING clause. 

I think we should push down this capability to Solr as well so we can perform 
the HAVING logic on the worker nodes. In high cardinality use cases this will 
be a big performance improvement.

We also need to develop a HavingStream to manage the having logic. I'll start 
the work for the HavingStream in this branch as it directly supports the 
Calcite integration.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-30 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708937#comment-15708937
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/30/16 3:58 PM:


I've started to work on this ticket. As a first step I'm doing some refactoring 
on the SolrTable class to create methods for handling the different types of 
queries. After that I'll get the aggregationModes hooked up.


was (Author: joel.bernstein):
I've started to work on this ticket. As a first step I'm doing some refactoring 
to create methods for handling the different types of queries. After that I'll 
get the aggregationModes hooked up.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-27 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700686#comment-15700686
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/28/16 2:34 AM:


Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket 
and it's looking really good!

A couple pieces of functionality that appear to be missing:

1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we 
can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we 
can also push down the distinct logic to the JSON Facet API.

2) The pushing down of GROUP BY aggregations to the JSON Facet API.

Both of these currently require the aggregationMode parameter to be passed in 
with the query, which I think is fine for the initial Calcite release.

I'd be happy to add these capabilities to this branch. That will also give me 
an opportunity to work with the code and feel comfortable working with Calcite.
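
To illustrate the GROUP BY push-down in point 2, here is a rough Python sketch that builds a JSON Facet API request body for a single-field GROUP BY with aggregates. The helper name `group_by_to_json_facet`, the default bucket limit, and the field names are assumptions made for illustration; the actual SQLHandler translation may differ.

```python
import json
from typing import Dict, List, Tuple


def group_by_to_json_facet(field: str,
                           aggregates: List[Tuple[str, str]],
                           limit: int = 100) -> Dict[str, object]:
    """Build a terms facet roughly equivalent to
    SELECT field, agg(...) ... GROUP BY field.

    aggregates is a list of (function, field) pairs, e.g. ("sum", "field_i").
    count(*) needs no entry: each terms bucket already carries a count.
    """
    sub_facets = {f"{fn}_{agg_field}": f"{fn}({agg_field})"
                  for fn, agg_field in aggregates}
    facet: Dict[str, object] = {"type": "terms", "field": field, "limit": limit}
    if sub_facets:
        facet["facet"] = sub_facets
    return {field: facet}


request = group_by_to_json_facet("str_s", [("sum", "field_i"), ("avg", "field_i")])
print(json.dumps(request, indent=2))
```

The resulting dictionary is the kind of structure that would be serialized and sent as the `json.facet` request parameter when the facet aggregation mode is selected.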


 


was (Author: joel.bernstein):
Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket 
and it's looking really good!

A couple pieces of functionality that appear to be missing are:

1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we 
can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we 
can also push down the distinct logic to the JSON Facet API.

2) The pushing down of GROUP BY aggregations to the JSON Facet API.

Both of these currently require the aggregationMode parameter to be passed in 
with the query, which I think is fine for the initial Calcite release.

I'd be happy to add these capabilities to this branch. That will also give me 
an opportunity to work with the code and feel comfortable working with Calcite.


 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-27 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700686#comment-15700686
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/28/16 2:07 AM:


Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket 
and it's looking really good!

A couple pieces of functionality that appear to be missing are:

1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we 
can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we 
can also push down the distinct logic to the JSON Facet API.

2) The pushing down of GROUP BY aggregations to the JSON Facet API.

Both of these currently require the aggregationMode parameter to be passed in 
with the query, which I think is fine for the initial Calcite release.

I'd be happy to add these capabilities to this branch. That will also give me 
an opportunity to work with the code and feel comfortable working with Calcite.


 


was (Author: joel.bernstein):
Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket 
and it's looking really good!

A couple pieces of functionality that appear to be missing are:

1) Handling of SELECT DISTINCT queries. In the current SQLHandler we can do 
MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we can also 
push down the distinct logic to the JSON Facet API.

2) The pushing down of GROUP BY aggregations to the JSON Facet API.

Both of these currently require the aggregationMode parameter to be passed in 
with the query, which I think is fine for the initial Calcite release.

I'd be happy to add these capabilities to this branch. That will also give me 
an opportunity to work with the code and feel comfortable working with Calcite.


 


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Cao Manh Dat (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664613#comment-15664613
 ] 

Cao Manh Dat edited comment on SOLR-8593 at 11/14/16 6:30 PM:
--

But we want to handle HAVING clauses that do not involve a function, like the 
following (because Solr will run this filter faster than Calcite):
{code}
having field_i = 19
{code}
and leave the other cases for Calcite to handle.

Is there a better way to do this kind of filter?
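
One way to picture the split is a small Python sketch that decides whether a HAVING predicate is a plain field comparison (pushed to Solr) or something else (left to Calcite). It works on predicate strings purely for illustration; a real implementation would inspect Calcite's expression tree instead, and the helper name `pushable_having` is hypothetical.

```python
import re
from typing import Optional, Tuple

# field name, comparison operator, numeric literal; anything else is not pushed down
_SIMPLE = re.compile(r"^\s*([A-Za-z_]\w*)\s*(=|<>|<=|>=|<|>)\s*(-?\d+(?:\.\d+)?)\s*$")


def pushable_having(predicate: str) -> Optional[Tuple[str, str, float]]:
    """Return (field, op, value) if the predicate is a plain field comparison
    that could be evaluated on the Solr side, else None."""
    m = _SIMPLE.match(predicate)
    if m is None:
        return None
    field, op, literal = m.groups()
    return field, op, float(literal)


simple = pushable_having("field_i = 19")     # plain comparison: candidate for Solr
complex_ = pushable_having("count(*) > 3")   # involves a function: left to Calcite
```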


was (Author: caomanhdat):
But we wanna to handle having clause without the function like
{code}
having field_i = 19
{code}
Is there any better ways to do this kind of filter?


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-14 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664576#comment-15664576
 ] 

Julian Hyde edited comment on SOLR-8593 at 11/14/16 6:11 PM:
-

Regarding the alias for "count(*)": I guess one approach is to extend Calcite 
to allow a pluggable alias derivation (it has to be pluggable because you can't 
please everyone). Another approach is to leave the aliases as they are but 
generate field names for the JSON result set. Note that if you call 
SqlNode.getParserPosition() on each item in the select clause it will tell you 
the start and end point of that expression in the original SQL string, so you 
can extract the "count(*)" using that information.

I don't think the following should be valid, but under your proposed change 
it would be:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*)
  FROM emp
  GROUP BY deptno) AS t
WHERE t."count(*)" > 3
{code}

Note that "count(*)" is not an expression; it is a reference to a "column" 
produced by the sub-query. In my opinion, using a textual expression is very 
confusing, and we should not do it. The derived alias of {{count(*)}} should be 
something not easily guessable, which will encourage users to use an alias:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(*) AS c
  FROM emp
  GROUP BY deptno) AS t
WHERE t.c > 3
{code}
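
The position-based approach mentioned earlier in this comment can be sketched in Python. This assumes the parser reports 1-based, inclusive line and column numbers for the start and end of each select item, which is my understanding of the position convention; the helper `extract_span` is hypothetical and not part of Calcite.

```python
def extract_span(sql: str, line: int, col: int, end_line: int, end_col: int) -> str:
    """Return the text of sql between two positions.

    Positions are 1-based and inclusive (assumed to match the parser's convention).
    """
    lines = sql.split("\n")
    if line == end_line:
        return lines[line - 1][col - 1:end_col]
    # Span crosses lines: tail of the first, any whole lines, head of the last.
    parts = [lines[line - 1][col - 1:]]
    parts.extend(lines[line:end_line - 1])
    parts.append(lines[end_line - 1][:end_col])
    return "\n".join(parts)


sql = "SELECT deptno, count(*)\nFROM emp\nGROUP BY deptno"
# "count(*)" occupies columns 16 through 23 of line 1
label = extract_span(sql, 1, 16, 1, 23)
```

The extracted text could then be used as the field name in the JSON result set while the derived alias stays unchanged inside the query plan.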


was (Author: julianhyde):
Regarding the alias for "count(*)". I guess one approach is to extend Calcite 
to allow a pluggable alias derivation (it has to be pluggable because you can't 
please everyone). Another approach is to leave the aliases as they are but 
generate field names for the JSON result set. Note that if you call 
SqlNode.getParserPosition() on each item in the select clause it will tell you 
the start and end point of that expression in the original SQL string, so you 
can extract the "count(*)" using that information.

I don't think the the following should be valid, but under your proposed change 
it would be:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(\*)
  FROM emp
  GROUP BY deptno) AS t
WHERE t."count(*)" > 3
{code}

Note that "count(\*)" is not an expression; it is a reference to a "column" 
produced by the sub-query. In my opinion, using a textual expression is very 
confusing, and we should not do it. Derived alias of {{count(\*)}} should be 
something not easily guessable, which will encourage users to use an alias:

{code}
SELECT deptno
FROM (
  SELECT deptno, count(\*) AS c
  FROM emp
  GROUP BY deptno) AS t
WHERE t.c > 3
{code}


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-12 Thread Cao Manh Dat (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15661016#comment-15661016
 ] 

Cao Manh Dat edited comment on SOLR-8593 at 11/13/16 7:12 AM:
--

Thanks [~risdenk]
[~julianhyde] BTW: how can we get rid of the EXPR$1 returned by 
{code}
select str_s, count(*) from collection1
{code}
The result set returns the name of the second column as {{EXPR$1}}, not 
{{count(*)}} as expected.


was (Author: caomanhdat):
Thanks [~risdenk]
[~julianhyde] BTW: How can we get rid of EXPR$1 return in 
{code}
select str_s, count(*) from collection1
{code}
the result set return the name of second column as {{EXPR$1}}, {{not count( * 
)}} as expected


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-12 Thread Cao Manh Dat (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15661016#comment-15661016
 ] 

Cao Manh Dat edited comment on SOLR-8593 at 11/13/16 7:11 AM:
--

Thanks [~risdenk]
[~julianhyde] BTW: How can we get rid of EXPR$1 return in 
{code}
select str_s, count(*) from collection1
{code}
the result set return the name of second column as {{EXPR$1}}, {{not count( * 
)}} as expected


was (Author: caomanhdat):
Thanks [~risdenk]
[~julianhyde] BTW: How can we get rid of EXPR$1 return in 
{code}
select str_s, count(*) from collection1
{code}
the result set return the name of second column as {{EXPR$1}}, {{not count(*)}} 
as expected


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-12 Thread Cao Manh Dat (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15659361#comment-15659361
 ] 

Cao Manh Dat edited comment on SOLR-8593 at 11/12/16 9:25 AM:
--

Patch based on pull/104. Fixed most of the failing tests (some tests were 
modified slightly to be more accurate).
There are still some things we have to do, but it is pretty close now.


was (Author: caomanhdat):
Patch based on pull/104. Fixed most of the failed tests ( some tests is 
modified a little bit to more accurate ).


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-11 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658025#comment-15658025
 ] 

Julian Hyde edited comment on SOLR-8593 at 11/11/16 8:01 PM:
-

By the way, when you're ready, please add Solr to the [powered by 
Calcite|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 
for details.


was (Author: julianhyde):
By the way, when you're ready, add please Solr to the [powered 
by|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 for 
details.


[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-11-09 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652093#comment-15652093
 ] 

Joel Bernstein edited comment on SOLR-8593 at 11/9/16 9:32 PM:
---

Very excited about this patch. After a day of review I think I understand how 
this comes together. The initial run of the test cases fails, probably due to 
the classpath issue that [~risdenk] mentions. I'll work on getting it running 
as the next step and compare the feature set with the current release.


was (Author: joel.bernstein):
Very excited about this patch. After a day of review I think understand how 
this comes together. Next step is for me to get it running. Initial run of the 
test cases fail, probably due to classpath issue that [~risdenk] mentions. I'll 
work on getting it running as the next step and compare the feature set with 
the current release.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>
>The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-10-04 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513607#comment-15513607
 ] 

Kevin Risden edited comment on SOLR-8593 at 10/4/16 6:31 PM:
-

Adding some resources that may be helpful:
* http://www.slideshare.net/HadoopSummit/costbased-query-optimization
* https://medium.com/@mpathirage/query-planning-with-apache-calcite-part-1-fe957b011c36#.ywd9ouxmv
* http://www.slideshare.net/JordanHalterman/introduction-to-apache-calcite


was (Author: risdenk):
Adding some resources that may be helpful:
* http://www.slideshare.net/HadoopSummit/costbased-query-optimization
* https://medium.com/@mpathirage/query-planning-with-apache-calcite-part-1-fe957b011c36#.ywd9ouxmv

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378775#comment-15378775
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 2:53 AM:
---

Ah, I see. I think we can pull off CoGroup with a merge() and reduce().

{code}
reduce(merge(search(..., sort="a asc"),
             search(..., sort="a asc"),
             on="a asc"),
       by="a",
       group(sort="b asc", n="5"))
{code}

This operation is more generic in that it could be an aggregation or a join 
depending on what comes after the reduce. 
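
To make the merge-then-reduce idea concrete, here is a minimal Python sketch (not Solr code) of a CoGroup built that way. The helper names merge_sorted and reduce_by and the sample tuples are made up for illustration; the only assumption carried over from the expression above is that both inputs arrive sorted on "a".

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted(left, right, key):
    """Interleave two streams that are each already sorted on `key`."""
    return heapq.merge(left, right, key=itemgetter(key))


def reduce_by(stream, by, sort_field, n):
    """Collect adjacent tuples sharing `by`; keep the first n ordered by sort_field."""
    for value, group in groupby(stream, key=itemgetter(by)):
        members = sorted(group, key=itemgetter(sort_field))[:n]
        yield {by: value, "group": members}


# Hypothetical sample tuples, both sides sorted ascending on "a".
left = [{"a": 1, "b": 9, "src": "left"}, {"a": 2, "b": 3, "src": "left"}]
right = [{"a": 1, "b": 4, "src": "right"}, {"a": 3, "b": 7, "src": "right"}]

cogrouped = list(reduce_by(merge_sorted(left, right, "a"), by="a", sort_field="b", n=5))

# One possible follow-on step: treat each group as an aggregation (sum of b per key).
sums = {g["a"]: sum(m["b"] for m in g["group"]) for g in cogrouped}
```

Whatever consumes each group decides the semantics: summing over a group's members, as in the last line, gives an aggregation, while pairing the left-side and right-side members of a group would give a join.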

Fun stuff.


was (Author: joel.bernstein):
Ah, I see. I think we can pull off CoGroup with a merge() and reduce().

{code}
reduce(merge(search(..., sort="a asc"),
 search(..., sort="a asc"),
 on="a asc"), 
   by="a", 
   group(sort="b asc", n="5")) 
{code}

This operation is more generic in that it could be an aggregation or a join 
depending what comes after the reduce. 

Fun stuff.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378775#comment-15378775
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 2:52 AM:
---

Ah, I see. I think we can pull off CoGroup with a merge() and reduce().

{code}
reduce(merge(search(..., sort="a asc"),
 search(..., sort="a asc"),
 on="a asc"), 
   by="a", 
   group(sort="b asc", n="5")) 
{code}

This operation is more generic in that it could be an aggregation or a join 
depending what comes after the reduce. 

Fun stuff.


was (Author: joel.bernstein):
Ah, I see. I think we can pull off CoGroup with a merge() and reduce().

{code}
reduce(merge(search(..., sort="a asc"),
  search(..., sort="a asc"),
  on="a asc"), 
   by="a", 
   group(sort="b asc", n="5")) 
{code}

This operation is more generic in that it could be an aggregation or a join 
depending what comes after the reduce. 

Fun stuff.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:33 AM:


The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
rollup(sum(x),
       over="y",
       innerJoin(on="a=b",
                 search(...),
                 search(...)))
{code}


Our basic strategy is to build up a Streaming Expression that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  
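
As a rough illustration of what the rollup-over-join pseudo-code computes, here is a small Python sketch. It is not Solr code: the helper names inner_join and rollup_sum and the sample tuples are invented, and the join uses an in-memory hash index purely for brevity rather than modeling how Solr would execute it.

```python
from collections import defaultdict


def inner_join(left, right, left_key, right_key):
    """Emit one merged tuple for every left/right pair with equal join keys."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    for l in left:
        for r in index.get(l[left_key], []):
            yield {**l, **r}


def rollup_sum(stream, over, field):
    """Sum `field` for each distinct value of `over`."""
    totals = defaultdict(int)
    for t in stream:
        totals[t[over]] += t[field]
    return dict(totals)


# Hypothetical sample data: left joins to right on a = b.
left = [{"a": 1, "x": 10}, {"a": 1, "x": 5}, {"a": 2, "x": 7}, {"a": 9, "x": 1}]
right = [{"b": 1, "y": "east"}, {"b": 2, "y": "west"}]

totals = rollup_sum(inner_join(left, right, "a", "b"), over="y", field="x")
```

The tuple with a=9 has no match and drops out of the join, so it never reaches the rollup; that is the sense in which CoGroup here behaves like an aggregation layered on a join.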

Looking forward to diving deeper into Calcite.


was (Author: joel.bernstein):
The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
rollup(sum(x),
       over="y",
       innerJoin(on="a=b",
                 search(...),
                 search(...)))
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:31 AM:


The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
rollup(sum(x),
       over="y",
       innerJoin(on="a=b",
                 search(...),
                 search(...)))
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.


was (Author: joel.bernstein):
The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
 rollup(sum(x),
   over="y",
   innerJoin(on="a=b",
   search(...), 
   search(...))) 
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:31 AM:


The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
 rollup(sum(x),
   over="y",
   innerJoin(on="a=b",
   search(...), 
   search(...))) 
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.


was (Author: joel.bernstein):
The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
 rollup(sum(x),
   over="y",
   innerJoin(on="a=b",
   search(...), 
   search(...))) 
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-07-14 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665
 ] 

Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:30 AM:


The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
 rollup(sum(x),
   over="y",
   innerJoin(on="a=b",
   search(...), 
   search(...))) 
{code}


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.


was (Author: joel.bernstein):
The CoGroup operator is interesting. Solr's Streaming Expressions provide some 
similar operations 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). 
CoGroup seems similar to a rollup() function wrapped around a join in Streaming 
Expressions. Pseudo-code:
{code}
 rollup(sum(x),
   over="y",
   innerJoin(on="a=b",
   search(...), 
   search(...))) 
{code)


Our basic strategy is to build up a Streaming Expressions that implements 
Calcite's relational algebra. We've done this already with the Presto parser, 
but without joins.  

Looking forward to diving deeper into Calcite.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-05-09 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276476#comment-15276476
 ] 

Joel Bernstein edited comment on SOLR-8593 at 5/9/16 3:09 PM:
--

I think my main concern is that the join pushdown rules haven't been exercised 
that much in production. Also, do we have access to the SQL catalog and 
statistics from inside the rules engine? We'll need that information to decide 
which type of join to do. I guess we can make the catalog globally accessible 
if the rules API doesn't provide hooks into it.
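
To show why the statistics matter, here is a purely hypothetical Python sketch of the kind of decision a rule would have to make. Nothing in it is Calcite or Solr API: TableStats, choose_join, the strategy names, and the hash_limit threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table statistics a rule would need from the catalog."""
    name: str
    row_count: int
    sorted_on_join_key: bool


def choose_join(left, right, hash_limit=100_000):
    """Pick a join strategy from row counts and sort order (illustrative only)."""
    smaller = min(left, right, key=lambda t: t.row_count)
    if smaller.row_count <= hash_limit:
        return "hash"              # small side fits in memory
    if left.sorted_on_join_key and right.sorted_on_join_key:
        return "merge"             # both sides can be streamed in key order
    return "sort-then-merge"       # neither shortcut applies


small = TableStats("dim", 5_000, sorted_on_join_key=False)
big_sorted = TableStats("fact", 50_000_000, sorted_on_join_key=True)
big_unsorted = TableStats("events", 80_000_000, sorted_on_join_key=False)
```

Without row counts and sort information available inside the rule, none of these branches can be evaluated, which is the gap the question above is pointing at.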


was (Author: joel.bernstein):
I think my main concern is that the join pushdown rules haven't been exercised 
that much in production.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-05-09 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276420#comment-15276420
 ] 

Joel Bernstein edited comment on SOLR-8593 at 5/9/16 2:25 PM:
--

[~risdenk], I've been reviewing the work on 
https://github.com/risdenk/solr-calcite-example.

I think I understand the basics of how this works but there are some mysteries 
as to how exactly the rules get triggered and where to place rules like join 
push downs.

Do you know of an existing adapter that does join push downs that can be used 
as a reference implementation?

And what's the best source of documentation for implementing rules? I haven't 
found anything that I would describe as comprehensive.


was (Author: joel.bernstein):
[~risdenk], I've been reviewing the work on 
https://github.com/risdenk/solr-calcite-example.

I think I understand the basics of how this works but there are some mysteries 
as to how exactly the rules get triggered and where to place rules like join 
push downs.

Do you know of an existing adapter that does join push downs that can be used a 
reference implementation?

And what's the best source of documentation for implementing rules? I haven't 
found anything that I would describe as comprehensive.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-29 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895
 ] 

Kevin Risden edited comment on SOLR-8593 at 4/29/16 2:28 PM:
-

Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* -figure out avg(int) problem in tests.-
** -avg(int) returns int by design. need to figure out if casting is right for 
the tests-
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.
* handle added dependencies properly -and upgrade to latest Calcite/Avatica-
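
For reference, the avg(int) item above comes down to the difference sketched below in Python. This is only an illustration of integer-typed versus floating-point averaging: avg_int and avg_double are hypothetical helpers, and the exact rounding the SQL layer applies is an assumption (integer division is used here, on non-negative sample values).

```python
def avg_int(values):
    """Average that keeps an integer result, as avg over an int column would."""
    return sum(values) // len(values)


def avg_double(values):
    """Average after casting each value to a floating-point type."""
    return sum(float(v) for v in values) / len(values)


sample = [1, 2, 4]
as_int = avg_int(sample)
as_double = avg_double(sample)
```

A test that expects a fractional average would therefore need either a cast in the query or an integer expectation.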


was (Author: risdenk):
Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. need to figure out if casting is right for 
the tests
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.
* handle added dependencies properly -and upgrade to latest Calcite/Avatica-

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-29 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264057#comment-15264057
 ] 

Joel Bernstein edited comment on SOLR-8593 at 4/29/16 1:38 PM:
---

This is awesome.

I like the idea of making this a 6.2 priority. I should have some time over the 
next couple of days to dig into the implementation.


was (Author: joel.bernstein):
This is awesome.

I like the idea of making this a 6.2 priority. I should have some time over the 
next couple of days dig into the implementation.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-28 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895
 ] 

Kevin Risden edited comment on SOLR-8593 at 4/29/16 1:23 AM:
-

Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. need to figure out if casting is right for 
the tests
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.
* handle added dependencies properly -and upgrade to latest Calcite/Avatica-


was (Author: risdenk):
Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. need to figure out if casting is right for 
the tests
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.
* handle added dependencies properly and upgrade to latest Calcite/Avatica?

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.




[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-28 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895
 ] 

Kevin Risden edited comment on SOLR-8593 at 4/28/16 8:20 PM:
-

Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. need to figure out if casting is right for 
the tests
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.
* handle added dependencies properly and upgrade to latest Calcite/Avatica?


was (Author: risdenk):
Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. need to figure out if casting is right for 
the tests
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was sort on _version_.

> Integrate Apache Calcite into the SQLHandler
> 
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
>  Issue Type: Improvement
>Reporter: Joel Bernstein
> Fix For: master
>
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was 
> nicely split off from the larger Presto project and it did everything that 
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where 
> Apache Calcite comes into play. It has a battle tested cost based optimizer 
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans 
> will continue to be translated to Streaming API objects (TupleStreams), so 
> continued work on the JDBC driver should plug in nicely with the Calcite work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-28 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895
 ] 

Kevin Risden edited comment on SOLR-8593 at 4/28/16 8:15 PM:
-

Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for aggregationMode (facets and map_reduce) and their parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating a new 
client per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite
* figure out avg(int) problem in tests.
** avg(int) returns int by design. Need to figure out whether casting is right 
for the tests.
* figure out sort asc by default in tests
** this currently doesn't sort properly even though I thought that the right 
approach was to sort on _version_.
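
To illustrate the avg(int) item: when the average of an integer column is itself typed as an integer, the fractional part is dropped, so a test expecting 1.5 would see 1 unless the value is cast first. The following is a minimal Java sketch of that arithmetic only. The class and method names are made up for illustration and are not part of Solr or Calcite.

```java
import java.util.List;

final class AvgIntSketch {
    // Average computed in the integer domain: the fractional part is truncated.
    // Assumes a non-empty list.
    static long avgAsInt(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Average computed after widening to double, analogous to casting the
    // column before averaging. Assumes a non-empty list.
    static double avgAsDouble(List<Integer> values) {
        double sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}
```

For the values 1 and 2, avgAsInt gives 1 while avgAsDouble gives 1.5, which is the kind of mismatch a test written against floating-point expectations would hit.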


was (Author: risdenk):
Ok made a bunch of progress on the jira/solr-8593 branch:
* cleaned up the tests so that there are only a few remaining items to address 
(outlined below)
* added support for float/double types
* fixed a CloudSolrClient resource leak

Left to do:
* Add support for facets and map_reduce as parameters
* ensure the pushdown to facets/map_reduce works correctly
* figure out the CloudSolrClient cache (currently not caching and creating new 
per stream)
* Push down aggregates to Solr
* add tests to ensure the proper plan is being generated by Calcite





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-04-28 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262361#comment-15262361
 ] 

Kevin Risden edited comment on SOLR-8593 at 4/28/16 3:46 PM:
-

Pushed initial version to branch jira/solr-8593. Currently there are a lot of 
tests in TestSQLHandler that are commented out. Need to go back through and 
make sure they have the expected results. Also some file formatting related to 
headers needs to be addressed.


was (Author: risdenk):
Pushed initial version to branch jira/solr-8593. Currently there are a lot of 
tests in SQLHandler that are commented out. Need to go back through and make 
sure they have the expected results. Also some file formatting related to 
headers needs to be addressed.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-03-23 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208874#comment-15208874
 ] 

Kevin Risden edited comment on SOLR-8593 at 3/23/16 5:57 PM:
-

Another thing I learned from looking closely at the TestSQLHandler tests is 
that the collection name is case-insensitive with the current implementation. This 
seems wrong to me because collections are case-sensitive. This is tested in the 
testMixedCaseFields method.
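
A small illustration of why case-insensitive matching is a problem when the underlying names are case-sensitive: two collections whose names differ only by case collapse into one entry. This is a hedged Java sketch using plain maps. It does not reflect how the SQLHandler actually resolves collection names, and the collection names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class CollectionLookupSketch {
    // Registers two hypothetical collections whose names differ only by case.
    static Map<String, String> catalog(boolean caseSensitive) {
        Map<String, String> names;
        if (caseSensitive) {
            names = new HashMap<String, String>();
        } else {
            // Keys that differ only by case are treated as the same key.
            names = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
        }
        names.put("Collection1", "first");
        names.put("collection1", "second");
        return names;
    }
}
```

With case-sensitive keys the catalog holds two entries. With case-insensitive keys it holds one, and the second registration silently replaces the first.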


was (Author: risdenk):
Another thing I learned from looking closely at the TestSQLHandler tests is 
that collection name is case insensitive with the current implementation. This 
seems wrong to me because collections are case sensitive.





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-03-22 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207685#comment-15207685
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/23/16 1:29 AM:
---

Currently, quoted identifiers do refer to columns. This was originally done 
because Presto didn't support mixed-case columns unless they were quoted. But 
Presto fixed that problem, so the quoted identifiers as they are now don't 
really serve a purpose. I do believe, though, that both Presto and Calcite allow 
columns with quoted identifiers in order to support non-parseable identifiers.


was (Author: joel.bernstein):
Currently quoted identifiers do refer to columns. This was originally done 
because Presto didn't support mixed case columns unless they were quoted. But 
Presto fixed that problem. So the quoted identifiers as they are now don't 
really serve a purpose. But I do believe that both Presto and Calcite allow 
columns for quoted identifiers to support non parseable identifiers. 





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-03-22 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207593#comment-15207593
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/23/16 12:25 AM:


I looked through the code and I'm seeing how CloudSolrStream is being used. But 
it's not clear to me we'll be able to implement the full range of capabilities 
through this approach.

For example:

1) Can we choose between the FacetStream and a parallel RollupStream based on 
the costs of the different approaches?

2) Can we do parallel joins using Solr's shuffling capabilities and Solr 
workers?
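
As a rough picture of what question 1 asks for, a cost-based choice boils down to estimating a cost for each equivalent plan and keeping the cheapest. The Java sketch below assumes the caller supplies the estimates. The Candidate type and the cost numbers are hypothetical, and nothing here says how Calcite or Solr would actually compute such costs.

```java
import java.util.List;

final class CheapestPlanSketch {
    // A candidate plan with a caller-supplied cost estimate (hypothetical type).
    static final class Candidate {
        final String name;
        final double estimatedCost;

        Candidate(String name, double estimatedCost) {
            this.name = name;
            this.estimatedCost = estimatedCost;
        }
    }

    // Returns the candidate with the lowest estimated cost.
    // On ties, the first candidate in the list wins.
    static Candidate cheapest(List<Candidate> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates");
        }
        Candidate best = candidates.get(0);
        for (Candidate c : candidates) {
            if (c.estimatedCost < best.estimatedCost) {
                best = c;
            }
        }
        return best;
    }
}
```

For example, given a "FacetStream" candidate estimated at 10.0 and a "parallel RollupStream" candidate estimated at 25.0 (made-up numbers), cheapest returns the FacetStream candidate.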



was (Author: joel.bernstein):
I looked through the code and I'm seeing how CloudSolrStream is being used. But 
it's not clear to me we'll be able to implement that full range of capabilities 
through this approach.

For example:

1) Can we choose between the FacetStream and a parallel RollupStream based on 
the costs of the different approaches?

2) Can we do parallel joins using Solr's shuffling capabilities and Solr 
workers?






[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-03-22 Thread Kevin Risden (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206919#comment-15206919
 ] 

Kevin Risden edited comment on SOLR-8593 at 3/22/16 6:08 PM:
-

Here is a separate approach that uses all of Calcite and the JDBCStream: 
https://github.com/risdenk/lucene-solr/compare/master...risdenk:calcite-sql-handler

It removes all the custom processing from SQLHandler, wraps Calcite in a 
JDBCStream, and executes the query.

There is something I learned about TestSQLHandler that I'm not sure is correct:
* quoted identifiers - like 'id' and 'text' aren't valid? These shouldn't be 
referring to columns?

Things to be explored with this approach:
* switch from a standard query in SolrEnumerator to a stream
* fix data types
* optimize cases like WHERE clauses
* code cleanup since it was just thrown together as a POC


was (Author: risdenk):
Here is a separate approach that uses all of Calcite and the JDBCStream: 
https://github.com/risdenk/lucene-solr/compare/master...risdenk:calcite-sql-handler

It removes all the custom processing from SQLHandler, wraps Calcite in a 
JDBCStream, and executes the query.

There is something I learned about TestSQLHandler that I'm not sure is correct:
* quoted identifiers - like 'id' and 'text' aren't valid? These shouldn't be 
referring to columns?

Things to be explored with this approach:
* switch from a standard query in SolrEnumerator to a stream
* fix data types
* optimize cases like where





[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler

2016-03-22 Thread Joel Bernstein (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206730#comment-15206730
 ] 

Joel Bernstein edited comment on SOLR-8593 at 3/22/16 4:38 PM:
---

[~risdenk] and I have been looking into different approaches for this ticket. 

One of the approaches is to embed the Calcite SQL parser and optimizer inside 
the SQLHandler. The entry point for this appears to be:

https://calcite.apache.org/apidocs/org/apache/calcite/tools/Planner.html

Using this approach we would need to implement two things:

1) A CatalogReader, which the Calcite validator and optimizer will use to do 
their job. The underlying implementation of this should work for the JDBC driver 
also, so we kill two big birds with one stone when this is implemented.

2) A custom RelVisitor, which will rewrite the relational algebra tree 
(RelNode) created by the optimizer. The RelNode tree will need to be mapped to 
the Streaming API. Since the Streaming API already supports parallel relational 
algebra, this should be fairly straightforward.

This approach would leave the Solr JDBC driver basically as it is, but provide 
all the hooks needed to finish off the remaining Catalog metadata methods.
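
The tree-to-stream mapping in item 2 could look roughly like the Java sketch below. It is a toy under stated assumptions: the node classes are stand-ins and are not Calcite's RelNode or RelVisitor types, and the output is a streaming-expression string, whereas a real implementation would presumably build TupleStream objects. It also assumes at most one filter, one projection and one sort above a single scan.

```java
final class PlanToStreamSketch {
    // Toy stand-ins for relational algebra nodes (NOT Calcite classes).
    interface Node {}

    static final class Scan implements Node {
        final String collection;
        Scan(String collection) { this.collection = collection; }
    }

    static final class Filter implements Node {
        final Node input;
        final String query;
        Filter(Node input, String query) { this.input = input; this.query = query; }
    }

    static final class Project implements Node {
        final Node input;
        final String fields;
        Project(Node input, String fields) { this.input = input; this.fields = fields; }
    }

    static final class Sort implements Node {
        final Node input;
        final String order;
        Sort(Node input, String order) { this.input = input; this.order = order; }
    }

    // Walks from the root down to the scan and folds every node into the
    // parameters of a single search(...) expression.
    static String toExpression(Node root) {
        String collection = null;
        String q = "*:*"; // match-all when the plan has no filter
        String fl = null;
        String sort = null;
        Node n = root;
        while (n != null) {
            if (n instanceof Sort) {
                sort = ((Sort) n).order;
                n = ((Sort) n).input;
            } else if (n instanceof Project) {
                fl = ((Project) n).fields;
                n = ((Project) n).input;
            } else if (n instanceof Filter) {
                q = ((Filter) n).query;
                n = ((Filter) n).input;
            } else if (n instanceof Scan) {
                collection = ((Scan) n).collection;
                n = null;
            } else {
                throw new IllegalArgumentException("unsupported node: " + n);
            }
        }
        if (collection == null) {
            throw new IllegalArgumentException("plan has no scan");
        }
        StringBuilder sb = new StringBuilder("search(").append(collection);
        sb.append(", q=\"").append(q).append('"');
        if (fl != null) {
            sb.append(", fl=\"").append(fl).append('"');
        }
        if (sort != null) {
            sb.append(", sort=\"").append(sort).append('"');
        }
        return sb.append(')').toString();
    }
}
```

A plan of Sort over Project over Filter over Scan("collection1"), with query a_s:hello, fields id,a_s and order id asc, folds into a single search expression on collection1 carrying those q, fl and sort parameters.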





was (Author: joel.bernstein):
[~risdenk] and I have been looking into different approaches for this ticket. 

One of the approaches is to embed the Calcite SQL parser and optimizer inside 
the SQLHandler. The entry point for this appears to be:

https://calcite.apache.org/apidocs/org/apache/calcite/tools/Planner.html

Using this approach we would need to implement two things:

1) A CatalogReader, which the calcite validator and optimizer will use to do 
it's job. The underlying implementation of this should work for the JDBC driver 
also, so we kill two big birds with one stone when this implemented.

2) A custom RelVisitor, which will rewrite the relational algebra tree 
(RelNode), created by the optimizer. The RelNode tree will need to be mapped to 
the Streaming API. Since the Streaming API already supports parallel relational 
algebra this should be fairly straight forward.

This approach would leave the Solr JDBC driver basically as it is, but provide 
all the hooks needed to finish off the remaining Catalog metadata methods.







