[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897978#comment-15897978 ]

Joel Bernstein edited comment on SOLR-8593 at 3/6/17 8:12 PM:

Ok, I added the assumption for the Turkish locale. I'm planning on resolving this ticket. If other issues come up with this the Apache Calcite integration we can open up a new issue.

was (Author: joel.bernstein):
Ok, I added the assumption for the Turkish locale. I'm planning on resolving this ticket. If other issue come up we can up a new issue.

> Integrate Apache Calcite into the SQLHandler
>
> Key: SOLR-8593
> URL: https://issues.apache.org/jira/browse/SOLR-8593
> Project: Solr
> Issue Type: Improvement
> Components: Parallel SQL
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Fix For: 6.5, master (7.0)
> Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch
>
> The Presto SQL Parser was perfect for phase one of the SQLHandler. It was
> nicely split off from the larger Presto project and it did everything that
> was needed for the initial implementation.
> Phase two of the SQL work though will require an optimizer. Here is where
> Apache Calcite comes into play. It has a battle tested cost based optimizer
> and has been integrated into Apache Drill and Hive.
> This work can begin in trunk following the 6.0 release. The final query plans
> will continue to be translated to Streaming API objects (TupleStreams), so
> continued work on the JDBC driver should plug in nicely with the Calcite work.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15897978#comment-15897978 ]

Joel Bernstein edited comment on SOLR-8593 at 3/6/17 8:12 PM:

Ok, I added the assumption for the Turkish locale. I'm planning on resolving this ticket. If other issues come up with the Apache Calcite integration we can open up a new issue.

was (Author: joel.bernstein):
Ok, I added the assumption for the Turkish locale. I'm planning on resolving this ticket. If other issues come up with this the Apache Calcite integration we can open up a new issue.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895795#comment-15895795 ]

Joel Bernstein edited comment on SOLR-8593 at 3/4/17 6:02 PM:

Ok, I left this ticket open until I fixed this. I'll tackle this shortly.

was (Author: joel.bernstein):
Ok, I left this ticket until I fixed this. I'll tackle this shortly.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15894686#comment-15894686 ]

Joel Bernstein edited comment on SOLR-8593 at 3/3/17 4:41 PM:

This has now been backported for release in Solr 6.5. The jira number was not on the first backport commit so I'll link to it here:

https://github.com/apache/lucene-solr/commit/3370dbed2e3e247a40012ab76aca059d640dfc80

This is a squash of the commits from master. Master still has the commit history from the feature branch. This squashed commit will be much easier to revert if need be.

was (Author: joel.bernstein):
This has now been backported for release in Solr 6.5. The jira number was not on the first backport commit so I'll link to it here:

https://github.com/apache/lucene-solr/commit/3370dbed2e3e247a40012ab76aca059d640dfc80

This is a squash of the commits from master. Master still has the commit history from the feature branch.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890784#comment-15890784 ]

Julian Hyde edited comment on SOLR-8593 at 3/1/17 6:49 PM:

[~risdenk], Regarding the Turkish locale issue. We have to explicitly pass user.timezone from maven into surefire (see [pom.xml|https://github.com/apache/calcite/blob/0372d23b847d4d145917dd786d1c9e3570cb8041/pom.xml#L733]), so I suspect we'd have to do the same with the locale. Can you log a Calcite case please? Even if we can't reproduce, I'd rather that we tracked it.

was (Author: julianhyde):
[~risdenk], Regarding the Turkish locale issue. We have to explicitly pass user.timezone from maven into surefire, so I suspect we'd have to do the same with the locale. Can you log a Calcite case please? Even if we can't reproduce, I'd rather that we tracked it.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15890431#comment-15890431 ]

Joel Bernstein edited comment on SOLR-8593 at 3/1/17 3:56 PM:

The backport is turning out to be trickier than I thought. The reason is that jira/solr-8593 has lots of master commits in it that are not in branch_6x. So merging is not going to work.

Also getting an isolated list of the 80+ commits done in jira/solr-8593 for the Calcite integration is tricky because the original pull request is closed. Creating a new pull request against branch_6x includes all the master commits.

So I'm looking at ways to isolate all the commits to cherry-pick.

was (Author: joel.bernstein):
The backport is turning out be trickier then I thought. The reason is that jira/solr-8593 has lost of master commits in it that are not in branch_6x. So merging is not going to work. Also getting an isolated list of the 80+ commits done in jira/solr-8593 for the Calcite integration is tricky because the original pull request is closed. Creating a new pull request against branch_6x includes all the master commits. So I'm looking at ways to isolate all the commits to cherry-pick.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886352#comment-15886352 ]

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 7:19 PM:

Ok great, I will add this assumption to the SQL tests.

was (Author: joel.bernstein):
Ok, great I will add this assumption to the SQL tests.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886352#comment-15886352 ]

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 7:19 PM:

Ok, great I will add this assumption to the SQL tests.

was (Author: joel.bernstein):
Ok, great I will this assumption to the SQL tests.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886263#comment-15886263 ]

Joel Bernstein edited comment on SOLR-8593 at 2/27/17 6:20 PM:

It appears that this issue is due to the Turkish locale.

https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/

I believe this is due to how Calcite is handling locales.

[~risdenk], curious if you see the same thing, which is that this is in the Calcite code.

[~steve_rowe], if it does turn out this is a Calcite issue I'll log an issue on the Calcite site and submit a patch. In the meantime is there a way to suppress the Turkish locale in the tests?

was (Author: joel.bernstein):
It appears that this issue is due to the Turkish locale. https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ I believe this is due to how Calcite is handling locales. [~risdenk], curious is you see the same thing, which is that this in the Calcite code. [~steve_rowe], if it does turn out this is a Calcite issue I'll log an issue on the Calcite site and submit a patch. In the meantime is there a way to suppress the Turkish locale in the tests?
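For reference, the lowercase behaviour described in the article linked in the comment above can be reproduced with a few lines of Java. This is only a sketch: the class name TurkishLocaleDemo and the sample word LIMIT are made up for illustration. The thread does not say which string actually tripped the tests, and none of this is code from Calcite or Solr.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Solr or Calcite.
class TurkishLocaleDemo {

    // Lowercases s using the casing rules of the given locale.
    static String lowerWith(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String word = "LIMIT"; // sample input, chosen only for illustration

        // Under Turkish rules, 'I' lowercases to dotless i (U+0131).
        String tr = lowerWith(word, turkish);
        // Locale.ROOT gives locale-independent results.
        String root = lowerWith(word, Locale.ROOT);

        System.out.println(tr.equals("l\u0131m\u0131t")); // true
        System.out.println(root.equals("limit"));          // true
    }
}
```

The no-argument String.toLowerCase() uses the JVM default locale, so any code that lowercases identifiers that way can produce different strings when a test run happens to use a Turkish default locale.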
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876162#comment-15876162 ]

Joel Bernstein edited comment on SOLR-8593 at 2/21/17 3:38 PM:

I was thinking about merging https://github.com/apache/lucene-solr/tree/jira/solr-8593 into branch_6x rather than cherry-picking from master. There is one commit that will need to be reverted because it's only valid in master, but that should be fairly easy to do I think.

was (Author: joel.bernstein):
I was thinking about merging https://github.com/apache/lucene-solr/tree/jira/solr-8593 into branch_6x rather then cherry picking from master. There is one commit that will need to be reverted because it's only valid in master, but that should fairly easy to do I think.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327 ]

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:16 PM:

It was a bit of an odyssey but I was able to push down the HAVING clause. I pushed the commits out with my latest work to:

https://github.com/apache/lucene-solr/tree/jira/solr-8593

was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I pushed the commits out with my latest work to: https://github.com/apache/lucene-solr/tree/jira/solr-8593
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327 ]

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:10 PM:

It was a bit of an odyssey but the I was able to push down the HAVING clause. I pushed the commits out with my latest work to: https://github.com/apache/lucene-solr/tree/jira/solr-8593

was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I pushed the commits out with my latest work: https://github.com/apache/lucene-solr/tree/jira/solr-8593
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858327#comment-15858327 ]

Joel Bernstein edited comment on SOLR-8593 at 2/8/17 6:08 PM:

It was a bit of an odyssey but the I was able to push down the HAVING clause. I pushed the commits out with my latest work: https://github.com/apache/lucene-solr/tree/jira/solr-8593

was (Author: joel.bernstein):
It was a bit of an odyssey but the I was able to push down the HAVING clause. I pushed the commits out to with my latest work: https://github.com/apache/lucene-solr/tree/jira/solr-8593
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454 ]

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:59 PM:

After adjusting the SolrFilterRule to allow filters on expressions, the having filter now shows up above the SolrAggregate:

SolrFilter
SolrAggregate
SolrFilter
SolrTableScan

The first SolrFilter listed is the HAVING clause and the second SolrFilter is the WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for the having filter. I'll be working on plugging this in.

was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions, the having filter now shows up above the SolrAggregate: SolrFilter SolrAggregate SolrFilter SolrTableScan The first SolrFilter listed is the Having clause and second SolrFilter is the WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for having filter. I'll be working on plugging this in.
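One way the "recognize the different types of filters" step could be approached is to look at what sits directly below each filter in the plan. The following is a rough Java sketch under that assumption. Node and classifyFilters are hypothetical stand-ins; they are not Calcite's RelNode API and not the code on the jira/solr-8593 branch, which the thread does not show.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; models the four-node plan from the comment.
class PlanFilterDemo {

    // Minimal stand-in for a plan node with a single input.
    static final class Node {
        final String type;
        final Node input;

        Node(String type, Node input) {
            this.type = type;
            this.input = input;
        }
    }

    // Walks the plan from the root down and labels each SolrFilter:
    // a filter directly above a SolrAggregate is treated as HAVING,
    // any other filter as WHERE. This positional rule is an assumption.
    static List<String> classifyFilters(Node root) {
        List<String> roles = new ArrayList<>();
        for (Node n = root; n != null; n = n.input) {
            if (n.type.equals("SolrFilter")) {
                boolean aboveAggregate =
                        n.input != null && n.input.type.equals("SolrAggregate");
                roles.add(aboveAggregate ? "HAVING" : "WHERE");
            }
        }
        return roles;
    }

    public static void main(String[] args) {
        Node plan = new Node("SolrFilter",
                new Node("SolrAggregate",
                        new Node("SolrFilter",
                                new Node("SolrTableScan", null))));
        System.out.println(classifyFilters(plan)); // [HAVING, WHERE]
    }
}
```

A real planner may place other nodes between a filter and the aggregate, so a purely positional check like this would likely need refinement before it could drive construction of a HavingStream.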
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454 ]

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:58 PM:

After adjusting the SolrFilterRule to allow filters on expressions, the having filter now shows up above the SolrAggregate: SolrFilter SolrAggregate SolrFilter SolrTableScan The first SolrFilter listed in the Having clause and second SolrFilter is WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for having filter. I'll be working on plugging this in.

was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions the having filter now shows up following the SolrAggregate: SolrFilter SolrAggregate SolrFilter SolrTableScan The first SolrFilter listed in the Having clause and second SolrFilter is WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for having filter. I'll be working on plugging this in.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854454#comment-15854454 ]

Joel Bernstein edited comment on SOLR-8593 at 2/6/17 5:58 PM:

After adjusting the SolrFilterRule to allow filters on expressions, the having filter now shows up above the SolrAggregate: SolrFilter SolrAggregate SolrFilter SolrTableScan The first SolrFilter listed is the Having clause and second SolrFilter is the WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for having filter. I'll be working on plugging this in.

was (Author: joel.bernstein):
After adjusting the SolrFilterRule to allow filters on expressions, the having filter now shows up above the SolrAggregate: SolrFilter SolrAggregate SolrFilter SolrTableScan The first SolrFilter listed in the Having clause and second SolrFilter is WHERE clause. The next step is to recognize the different types of filters and build a HavingStream for having filter. I'll be working on plugging this in.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849160#comment-15849160 ] Joel Bernstein edited comment on SOLR-8593 at 2/2/17 12:16 AM: ---
Hi, [~julianhyde],
The exact problem I'm seeing is that the SolrSort is not included in the query plan unless an ORDER BY is used in the query.
With the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrSort
  org.apache.solr.handler.sql.SolrProject
    org.apache.solr.handler.sql.SolrTableScan
Without the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrProject
  org.apache.solr.handler.sql.SolrTableScan
was (Author: joel.bernstein):
Hi, [~julianhyde],
The exact problem I'm seeing is that the SolrSort is not included in the query plan unless an ORDER BY is used in the query.
With the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrSort
  org.apache.solr.handler.sql.SolrProject
    org.apache.solr.handler.sql.SolrTableScan
Without the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrProject
  org.apache.solr.handler.sql.SolrTableScan
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849160#comment-15849160 ] Joel Bernstein edited comment on SOLR-8593 at 2/2/17 12:17 AM: ---
Hi [~julianhyde],
The exact problem I'm seeing is that the SolrSort is not included in the query plan unless an ORDER BY is used in the query.
With the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrSort
  org.apache.solr.handler.sql.SolrProject
    org.apache.solr.handler.sql.SolrTableScan
Without the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrProject
  org.apache.solr.handler.sql.SolrTableScan
was (Author: joel.bernstein):
Hi, [~julianhyde],
The exact problem I'm seeing is that the SolrSort is not included in the query plan unless an ORDER BY is used in the query.
With the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrSort
  org.apache.solr.handler.sql.SolrProject
    org.apache.solr.handler.sql.SolrTableScan
Without the ORDER BY our tree looks like this:
org.apache.solr.handler.sql.SolrProject
  org.apache.solr.handler.sql.SolrTableScan
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849145#comment-15849145 ] Julian Hyde edited comment on SOLR-8593 at 2/2/17 12:05 AM:
Not sure I understand. The query {{select a from b limit 10}} will have a {{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will be translated to a {{SolrSort}} with similar attributes. The sort is trivial - that is, you don't need to do any work to sort on 0 fields - but you do need to apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but maybe convert into a {{SolrLimit}} if you have such a thing.
You may be wondering why we combine sort and limit into the same operator. But remember that relational data sets are inherently unordered, so we have to do them at the same time. Sort with an empty key has reasonable semantics, just as -- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a reasonable generalization of Aggregate.
was (Author: julianhyde):
Not sure I understand. The query {{select a from b limit 10}} will have a {{Sort}} whose key has zero fields but which has fetch = 10. The {{Sort}} will be translated to a {{SolrSort}} with similar attributes. The sort is trivial - that is, you don't need to do any work to sort on 0 fields - but you do need to apply the limit. If you see a {{SolrSort}} with empty keys, don't drop it, but maybe convert into a {{SolrLimit}} if you have such a thing.
You may be wondering why we combine sort and limit into the same operator. But remember that relational data sets are inherently unordered, so we have to do them at the same time. Sort with an empty key has reasonable semantics, just as -- I hope you agree -- Aggregate with an empty key (e.g. {{select count(*) from emp}}, which is equivalent to {{select count(*) from emp group by ()}}) is a reasonable generalization of Aggregate.
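The point made in this comment, that a sort node with an empty key and a fetch value still carries a limit that must be applied, can be sketched without any Calcite dependency. `SortNode` and `describe()` below are hypothetical names made up for this illustration; they only model the two attributes the comment mentions (the sort keys and the fetch) and are not the actual `Sort` or `SolrSort` classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a sort node: sort keys plus an optional fetch (LIMIT). */
final class SortNode {
    final List<String> keys;  // empty when the query has no ORDER BY
    final Integer fetch;      // null when the query has no LIMIT

    SortNode(List<String> keys, Integer fetch) {
        this.keys = keys;
        this.fetch = fetch;
    }

    /** Describes the work this node still requires from the executor. */
    String describe() {
        boolean sorts = !keys.isEmpty();
        boolean limits = fetch != null;
        if (sorts && limits) {
            return "sort by " + keys + ", then limit " + fetch;
        }
        if (sorts) {
            return "sort by " + keys;
        }
        if (limits) {
            // Empty key: sorting is trivial, but the limit must not be dropped.
            return "limit " + fetch;
        }
        return "no-op";
    }

    public static void main(String[] args) {
        // select a from b order by a limit 10
        System.out.println(new SortNode(Arrays.asList("a"), 10).describe());
        // select a from b limit 10
        System.out.println(new SortNode(Collections.<String>emptyList(), 10).describe());
    }
}
```

The second case is the one discussed in this thread: the node has nothing to sort on, yet it is the only place the limit of 10 is recorded, so discarding it would silently return every row.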
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 8:16 PM: --
We've run into a bug with pushing down the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class, which extends org.apache.calcite.rel.core.Sort. The issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY, our code does not see the LIMIT.
So limit works in this scenario:
*select a from b order by a limit 10*
but not this scenario:
*select a from b limit 10*
[~julianhyde], any ideas on how to resolve this? Here is the code where we create the rule: https://github.com/apache/lucene-solr/blob/jira/solr-8593/solr/core/src/java/org/apache/solr/handler/sql/SolrRules.java#L184
was (Author: joel.bernstein):
We've run into a bug with pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. The issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this?
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:51 PM: -- We've run into a bug with pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. The issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? was (Author: joel.bernstein): We've run into a bug with pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. This issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement > Components: Parallel SQL >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5, master (7.0) > > Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:50 PM: -- We've run into a bug iwith pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. This issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? was (Author: joel.bernstein): We've run into a bug in with pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. This issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement > Components: Parallel SQL >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5, master (7.0) > > Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848856#comment-15848856 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 7:50 PM: -- We've run into a bug with pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. This issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? was (Author: joel.bernstein): We've run into a bug iwith pushing down of the SQL LIMIT clause. Currently we have code that handles the limit in the SolrSort class which extends org.apache.calcite.rel.core.Sort. This issue is that the SolrSort rule is only executed if an ORDER BY clause is included. So if LIMIT is used without an ORDER BY our code does not see the LIMIT. So limit works in this scenario: *select a from b order by a limit 10* but not this scenario: *select a from b limit 10* [~julianhyde], any ideas on how to resolve this? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement > Components: Parallel SQL >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Fix For: 6.5, master (7.0) > > Attachments: SOLR-8593.patch, SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848522#comment-15848522 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 3:36 PM: -- I pulled the most recent work on this branch and precommit is passing. I'll be doing manual testing today and also running the full test suite. It seems like we are getting very close to committing this. I haven't seen the protobuf issue that [~julianhyde] mentioned above. But I'll keep testing and see if anything pops up. [~risdenk], any preference on who does the merge/commit to master on this? was (Author: joel.bernstein): I pulled the most recent work on this branch and precommit is passing. I'll be doing manual testing today and also running the full test suite. It seems like we are getting very close to committing this. I haven't seen the protobuf issue that [~julianhyde] mentioned above. But I'll keep testing and see if anything pops up. [~risdenk], any preference on who does the commit on this? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848522#comment-15848522 ] Joel Bernstein edited comment on SOLR-8593 at 2/1/17 3:35 PM: -- I pulled the most recent work on this branch and precommit is passing. I'll be doing manual testing today and also running the full test suite. It seems like we are getting very close to committing this. I haven't seen the protobuf issue that [~julianhyde] mentioned above. But I'll keep testing and see if anything pops up. [~risdenk], any preference on who does the commit on this? was (Author: joel.bernstein): I pulled the most recent work on this branch and precommit is passing. I'll be doing manual testing today and also running the full test suite. It seems like we getting very close to committing this. I haven't seen the issue that [~julianhyde] mentioned above. But I'll keep testing and see if anything pops up. [~risdenk], any preference on who does the commit on this? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796107#comment-15796107 ] Joel Bernstein edited comment on SOLR-8593 at 1/3/17 8:33 PM: --
It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently planning to take the HavingStream from this branch and commit it with SOLR-8530 for Solr 6.4. This is going to cause merge conflicts between jira/solr-8593 and master. As soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 to master and work out the conflicts. So Calcite will be in master at the very beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code in master and perform the backport to branch_6x when ready.
was (Author: joel.bernstein):
It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently planning to take the HavingStream from this branch and commit it with SOLR-8530 for Solr 6.4 This is going to cause merge conflicts between jira/solr-8593 and master. As soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 to master and work out the conflicts. So Calcite will be in master at very beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code more in master and perform the backport to branch_6x when ready.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15796107#comment-15796107 ] Joel Bernstein edited comment on SOLR-8593 at 1/3/17 8:32 PM: -- It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently planning to take the HavingStream from this branch and commit it with SOLR-8530 for Solr 6.4 This is going to cause merge conflicts between jira/solr-8593 and master. As soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 to master and work out the conflicts. So Calcite will be in master at very beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code more in master and perform the backport to branch_6x when ready. was (Author: joel.bernstein): It doesn't look like we're going to get this in for Solr 6.4. So, I'm currently planning to take the HavingStream from this branch an commit it with SOLR-8530 for Solr 6.4 This is going to cause merge conflicts between jira/solr-8593 and master. As soon as the Calcite release is cut I'll begin the work to merge jira/solr-8593 to master and work out the conflicts. So Calcite will be in master at very beginning of the dev cycle for Solr 6.5. Then we can work with the Calcite code more in master and perform the backport to branch_6x when ready. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773344#comment-15773344 ] Joel Bernstein edited comment on SOLR-8593 at 12/23/16 5:41 PM: I think I have a handle now on how the *string* and *arithmetic* functions work. I was expecting them to work automatically and Calcite would perform the functions if we returned data in the fields. I now believe that is not the case, because we've pushed down the *projection*. Because we've pushed down the projection we'll need to implement the arithmetic and string functions using the *SelectStream*. What Calcite provides in the project rule is access to the parse tree so we have enough information to implement the functions. Since this ticket was mainly about getting parity with the current SQL functionality, I think it makes sense to tackle the string and arithmetic functions in a separate ticket. I will create that ticket. was (Author: joel.bernstein): I think I have a handle now on how the *string* and *arithmetic* functions work. I was expecting them to work automatically and Calcite would perform the functions if we returned data in the fields. I now believe that is not the case, because we've pushed down the projection. Because we've pushed down the *projection* we'll need to implement the arithmetic and string functions using the *SelectStream*. What Calcite provides in the project rule is access to the parse tree so we have enough information to implement the functions. Since this ticket was mainly about getting parity with the current SQL functionality, I think it makes sense to tackle the string and arithmetic functions in a separate ticket. I will create that ticket. 
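As a rough illustration of what implementing string and arithmetic functions on the Solr side could involve once the projection has been pushed down, the sketch below applies named expressions to a tuple represented as a plain map. It does not use the SelectStream API or Calcite's parse tree: `ProjectionSketch`, `project`, and the field names are all hypothetical, and the expressions are hand-written lambdas standing in for whatever would be derived from the project rule.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class ProjectionSketch {
    /** Applies each named expression to the input tuple to build the output tuple. */
    static Map<String, Object> project(
            Map<String, Object> tuple,
            Map<String, Function<Map<String, Object>, Object>> exprs) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Function<Map<String, Object>, Object>> e : exprs.entrySet()) {
            out.put(e.getKey(), e.getValue().apply(tuple));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical tuple as it might come back from the underlying search.
        Map<String, Object> tuple = new HashMap<>();
        tuple.put("name", "solr");
        tuple.put("price", 10L);
        tuple.put("qty", 3L);

        Map<String, Function<Map<String, Object>, Object>> exprs = new LinkedHashMap<>();
        // Stands in for: UPPER(name) AS name_upper
        exprs.put("name_upper", t -> ((String) t.get("name")).toUpperCase(Locale.ROOT));
        // Stands in for: price * qty AS total
        exprs.put("total", t -> (Long) t.get("price") * (Long) t.get("qty"));

        System.out.println(project(tuple, exprs)); // prints {name_upper=SOLR, total=30}
    }
}
```

The design point is the one made in the comment: because the projection runs outside Calcite, each output column needs its own evaluator, and Calcite only supplies the expression tree those evaluators would be built from.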
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15773344#comment-15773344 ] Joel Bernstein edited comment on SOLR-8593 at 12/23/16 5:40 PM: I think I have a handle now on how the *string* and *arithmetic* functions work. I was expecting them to work automatically and Calcite would perform the functions if we returned data in the fields. I now believe that is not the case, because we've pushed down the projection. Because we've pushed down the *projection* we'll need to implement the arithmetic and string functions using the *SelectStream*. What Calcite provides in the project rule is access to the parse tree so we have enough information to implement the functions. Since this ticket was mainly about getting parity with the current SQL functionality, I think it makes sense to tackle the string and arithmetic functions in a separate ticket. I will create that ticket. was (Author: joel.bernstein): I think I have a handle now on how the *string* and *arithmetic* functions work. I was expecting them to work automatically and Calcite would perform the functions if we returned data in the fields. I now believe that is not the case, because we've pushed down the projection. Because we've pushed down the projection we'll need to implement the arithmetic and string functions using the SelectStream. What Calcite provides in the project rule is access to the parse tree so we have enough information to implement the functions. Since this ticket was mainly about getting parity with the current SQL functionality, I think it makes sense to tackle the string and arithmetic functions in a separate ticket. I will create that ticket. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15770997#comment-15770997 ] Joel Bernstein edited comment on SOLR-8593 at 12/22/16 8:18 PM: [~risdenk], I see you've added some special logic to get the *cast* function working. It's not clear to me though what the code is actually doing. Can you explain what's happening in the SolrRules.RexToSolrTranslator class with the cast function? If possible I'd like to hook up the other string and arithmetic functions. was (Author: joel.bernstein): [~risdenk], I see you've added some special logic to get the *cast* function working. It's not clear to though exactly what's doing. Can you explain what's happening in the SolrRules.RexToSolrTranslator class with the cast function? If possible I'd like to hook up the other string and arithmetic functions.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15761872#comment-15761872 ] Joel Bernstein edited comment on SOLR-8593 at 12/19/16 6:13 PM: It wasn't clear to me that we were using Avatica yet. It seemed we were using org.apache.calcite imports. Are there dependencies within Calcite core on Avatica? was (Author: joel.bernstein): It wasn't clear to me that we were using Avatica yet? It seemed we were using org.apache.calcite imports. Are their dependencies within Caclite core on Avatica?
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15761834#comment-15761834 ] Joel Bernstein edited comment on SOLR-8593 at 12/19/16 5:57 PM: Are we blocked until Calcite 0.11.0 releases, or is there precedent for using a snapshot release? With the protobuf upgrade, I think as long as all tests are passing it should be ok to upgrade. was (Author: joel.bernstein): Are blocked until Calcite 0.11.0 releases or is there president for using a snapshot release? With the protobuf upgrade I think as long all tests are passing it should be ok to upgrade.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15753387#comment-15753387 ] Julian Hyde edited comment on SOLR-8593 at 12/16/16 4:21 AM: - Would it be correct to say that you have a physical operator which is a combination of Aggregate and TopN? This physical operator would have a sorted list of grouping fields and also a parameter N (which affects the cost estimate). Maybe it's a sub-class of Aggregate with some extra fields. It could be created by a planner rule that matches a Sort (with limit) on top of an Aggregate and also looks at estimated cardinality of the fields in order to sort them. was (Author: julianhyde): Would it be correct to say that you have a physical operator which is a combination of Aggregate and TopN? This physical operator would have a sorted list of grouping fields and also a parameter N (which affects the cost estimate). Maybe it's a sub-class of Aggregate with some extra fields.
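The rule described above can be sketched with a toy plan model. This is only an illustration under stated assumptions: none of the classes below are Calcite classes, the names (`TopNAggregateSketch`, `Node`, `TopNAggregate`, `apply`) are hypothetical, and ordering the grouping fields by ascending estimated cardinality is an assumption, since the comment does not say which direction is wanted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class TopNAggregateSketch {
    /** Marker for the toy plan nodes below; none of these are Calcite classes. */
    interface Node {}

    static final class Scan implements Node {
        final String table;
        Scan(String table) { this.table = table; }
    }

    static final class Aggregate implements Node {
        final Node input;
        final List<String> groupFields;
        Aggregate(Node input, List<String> groupFields) {
            this.input = input;
            this.groupFields = groupFields;
        }
    }

    static final class Sort implements Node {
        final Node input;
        final Integer limit; // null when the query has no LIMIT
        Sort(Node input, Integer limit) {
            this.input = input;
            this.limit = limit;
        }
    }

    /** Combined operator: an aggregate with an explicit field order and a top-N bound. */
    static final class TopNAggregate implements Node {
        final Node input;
        final List<String> orderedGroupFields;
        final int n;
        TopNAggregate(Node input, List<String> orderedGroupFields, int n) {
            this.input = input;
            this.orderedGroupFields = orderedGroupFields;
            this.n = n;
        }
    }

    /** Matches Sort(with limit) on top of Aggregate; returns empty when the pattern does not apply. */
    static Optional<TopNAggregate> apply(Node node, Map<String, Long> estimatedCardinality) {
        if (!(node instanceof Sort)) return Optional.empty();
        Sort sort = (Sort) node;
        if (sort.limit == null || !(sort.input instanceof Aggregate)) return Optional.empty();
        Aggregate agg = (Aggregate) sort.input;
        // Assumption: lowest estimated cardinality first; fields without an estimate go last.
        List<String> ordered = new ArrayList<>(agg.groupFields);
        ordered.sort(Comparator.comparingLong(
                (String f) -> estimatedCardinality.getOrDefault(f, Long.MAX_VALUE)));
        return Optional.of(new TopNAggregate(agg.input, ordered, sort.limit));
    }
}
```

In a real planner the limit would also feed the cost estimate; the sketch only carries it along as `n`.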
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752805#comment-15752805 ] Julian Hyde edited comment on SOLR-8593 at 12/15/16 11:15 PM: -- I wasn't familiar with faceting, but I quickly read https://wiki.apache.org/solr/SolrFacetingOverview. Suppose table T has fields a, b, c, d, and you want to do a faceted search on b, a. If you issue the query {{select b, a, count(*) from t group by b, a}} then you will end up with
{code}
Project($1, $0, $2)
  Aggregate({0, 1}, COUNT(*))
    Scan(table=T)
{code}
and as you correctly say, {0, 1} represents {a, b} because that is the physical order of the columns. Can you explain why the faceting algorithm is interested in the order of the columns? Is it because it needs to produce the output ordered or nested on those columns? If so, we can rephrase the SQL query so that we are accurately expressing in relational algebra what we need.
was (Author: julianhyde): I wasn't familiar with faceting, but I quickly read https://wiki.apache.org/solr/SolrFacetingOverview. Suppose table T has fields a, b, c, d, and you want to do a faceted search on b, a. If you issue the query {{select b, a, count\(*) from t group by b, a}} then you will end up with {code} Project($1, $0, $2) Aggregate({0, 1}, COUNT(*)) Scan(table=T) {code} and as you correctly say, {{ \{0, 1\} }} represents {{ \{a, b\} }} because that is the physical order of the columns. Can you explain why the faceting algorithm is interested in the order of the columns? Is it because it needs to produce the output ordered or nested on those columns? If so, we can rephrase the SQL query so that we are accurately expressing in relational algebra what we need.
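A small sketch of why the plan above loses the written order of the GROUP BY fields: if the grouping keys are held as a set of column ordinals (Calcite's Aggregate exposes them as a bit set), iterating that set always yields physical column order, and only the Project on top restores the order the user asked for. The class and method names here are hypothetical, and `java.util.BitSet` stands in for Calcite's own bit set type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

final class GroupSetOrderSketch {
    /** Physical column order of the example table T. */
    static final List<String> TABLE_FIELDS = Arrays.asList("a", "b", "c", "d");

    /** Turns the GROUP BY list, in the order it was written, into a set of column ordinals. */
    static BitSet toGroupSet(List<String> groupByFields) {
        BitSet set = new BitSet();
        for (String f : groupByFields) {
            int ordinal = TABLE_FIELDS.indexOf(f);
            if (ordinal < 0) throw new IllegalArgumentException("unknown field: " + f);
            set.set(ordinal);
        }
        return set;
    }

    /** Reads the field names back; a bit set can only be walked in ascending ordinal order. */
    static List<String> fieldsOf(BitSet groupSet) {
        List<String> names = new ArrayList<>();
        for (int i = groupSet.nextSetBit(0); i >= 0; i = groupSet.nextSetBit(i + 1)) {
            names.add(TABLE_FIELDS.get(i));
        }
        return names;
    }

    /** Applies Project($i, $j, ...) to the output columns of the node beneath it. */
    static List<String> project(List<String> inputColumns, int... ordinals) {
        List<String> out = new ArrayList<>();
        for (int ordinal : ordinals) {
            out.add(inputColumns.get(ordinal));
        }
        return out;
    }
}
```

With `group by b, a` the group set comes back as a, b, and `Project($1, $0, $2)` over the aggregate's columns a, b, COUNT(*) is what puts b in front again.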
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752570#comment-15752570 ] Joel Bernstein edited comment on SOLR-8593 at 12/15/16 9:33 PM: I just pushed out a commit to the solr/jira-8593 branch: https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f This is a pretty large refactoring of the SolrTable class which includes implementations for aggregationMode=facet for both GROUP BY aggregations and SELECT DISTINCT aggregations. All tests in TestSQLHandler are passing. There is only one thing that I'm not quite happy about in this patch which is specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the issue. The specific issue deals with how the set of GROUP BY fields is handled in Calcite. From what I can see there isn't an easy way to get the ordering of the GROUP BY fields preserved from the query. The Solr faceting implementation requires the correct order of the GROUP BY fields to return a correct response. So, I'm getting the ordering from the field list of the query instead. This may actually be the correct approach from a SQL standpoint but I was wondering what Julian thought about this issue. was (Author: joel.bernstein): I just pushed out a commit to the solr/jira-8593 branch: https://github.com/apache/lucene-solr/commit/37fdc37fc3d88054634482d39b5774893751f91f This is a pretty large refactoring of the SolrTable class which includes implementations for aggregationMode=facet for both GROUP BY aggregations and SELECT DISTINCT aggregations. All tests in TestSQLHandler are passing. There is only one thing that I'm not quite happy about in this patch which is specific to Calcite. I am wondering if [~julianhyde] has any thoughts on the issue. The specific issue deals with how the set of GROUP BY fields is dealt with in Calcite. From what I can see there isn't an easy way to get the ordering of the GROUP BY fields preserved from the query. The Solr faceting implementations requires the correct order of the GROUP BY fields to return a correct response. So, I'm getting the ordering from the field list of the query instead. This may actually be the correct approach from a SQL standpoint but I was wondering what Julian thought about this issue.
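The workaround of taking the facet order from the field list can be sketched roughly as follows. This is not the code from the commit; `FacetOrderSketch` and `facetOrder` are hypothetical names, and appending grouped fields that are missing from the field list in physical order is an assumption made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class FacetOrderSketch {
    /**
     * Orders the grouping fields by their position in the query's field list.
     *
     * @param selectList  items of the field list in the order written, e.g. b, a, count(*)
     * @param groupFields grouping fields in physical column order, as the planner reports them
     */
    static List<String> facetOrder(List<String> selectList, List<String> groupFields) {
        List<String> ordered = new ArrayList<>();
        // First pass: keep the field-list order for every item that is a grouping field.
        for (String item : selectList) {
            if (groupFields.contains(item) && !ordered.contains(item)) {
                ordered.add(item);
            }
        }
        // Assumption: grouping fields that were not selected follow in physical order.
        for (String field : groupFields) {
            if (!ordered.contains(field)) {
                ordered.add(field);
            }
        }
        return ordered;
    }
}
```

For `select b, a, count(*) from t group by b, a` the planner reports the grouping fields as a, b, and the field list supplies the b, a nesting the facet request needs.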
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15722648#comment-15722648 ] Joel Bernstein edited comment on SOLR-8593 at 12/5/16 4:17 PM: --- I wanted to give an update on my work on this ticket. I've started working my way through the test cases (TestSQLHandler). I'm working through each assertion in each method to understand the differences between the current release and the work done in this patch, and making changes/fixes as I go. The first change that I made was in how the predicate is being traversed. The current patch doesn't descend through a full nested AND/OR predicate. So I made a few changes to how the tree is walked. I also changed some of the logic for how the query is re-written to a Lucene/Solr query so that it matches the current implementation. I've now moved on to aggregate queries. I've been investigating the use of EXPR$1 ... instead of using the *function signature* in the result set. It looks like we'll have to use the Caclite expression identifiers going forward, which should be OK. I think this is cleaner anyway because looking up fields by a function signature can get cumbersome. We'll just need to document this in the CHANGES.txt. The next step for me is to implement the aggregationMode=facet logic for aggregate queries. After that I'll push out my changes to this branch. Then I'll spend some time investigating how SELECT DISTINCT behaves with our Calcite implementation. As [~julianhyde] mentioned, we should see DISTINCT queries as aggregate queries so it's possible we'll have all the code in place to push this to Solr already. was (Author: joel.bernstein): I wanted to give an update on my work on this ticket. I've started working my way through the test cases (TestSQLHandler). I'm working through each assertion in each method to understand the differences between the current release and the work done in this patch, and making changes/fixes as I go. 
The first change that I made was in how the predicate is being traversed. The current patch doesn't descend through a full nested AND/OR predicate. So I made a few changes to how the tree is walked. I also changed some of the logic for how the query is re-written to a Lucene/Solr query so that it matches the current implementation. I've now moved on to aggregate queries. I've been investigating the use of EXPR$1 ... instead of using the *function signature* in the result set. It looks like we'll have to use the Calcite expression identifiers going forward, which should be OK. I think this is cleaner anyway because looking up fields by a function signature can get cumbersome. We'll just need to document this in the CHANGES.txt. The next step for me is to implement the aggregationMode=facet logic for aggregate queries. After that I'll push out my changes to this branch. Then I'll spend some time investigation how SELECT distinct behaves in our implementation. As [~julianhyde] mentioned. we should see DISTINCT queries as aggregate queries so it's possible we'll have all the code in place to push this to Solr already. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch, SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. 
The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
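The predicate change described in the comment above (descending through a fully nested AND/OR predicate and rewriting it to a Lucene/Solr query) can be illustrated with a small recursive walk. This is only a minimal sketch under stated assumptions: `Pred`, `Eq`, `Bool`, and `PredicateWalker` are hypothetical stand-ins invented for illustration, not Calcite or Solr classes, and the output is a plain `field:value` string with no escaping of special query characters.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical predicate tree; NOT Calcite's or Solr's actual types.
interface Pred {}

// Leaf node: a single field = value comparison.
final class Eq implements Pred {
    final String field;
    final String value;
    Eq(String field, String value) { this.field = field; this.value = value; }
}

// Inner node: "AND" or "OR" over any number of child predicates.
final class Bool implements Pred {
    final String op;
    final List<Pred> children;
    Bool(String op, List<Pred> children) { this.op = op; this.children = children; }
}

final class PredicateWalker {
    // Recursively descends through every nested AND/OR level, so deeply
    // nested predicates are not cut off after the first level.
    static String toQuery(Pred p) {
        if (p instanceof Eq) {
            Eq e = (Eq) p;
            return e.field + ":" + e.value; // assumption: values need no escaping
        }
        Bool b = (Bool) p;
        return b.children.stream()
                .map(PredicateWalker::toQuery)
                .collect(Collectors.joining(" " + b.op + " ", "(", ")"));
    }
}
```

Under these assumptions, a predicate such as `a = 1 AND (b = 2 OR c = 3)` would be rendered with parentheses preserved at each nesting level, which is the property the tree walk needs to keep.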
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176 ] Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:38 PM: One of the things that is also not specifically handled is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high-cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. My plan is to implement the following classes for the having logic: HavingStream, BooleanOperation, AndOperation, OrOperation, NotOperation, EqualsOperation, LessThenOperation, GreaterThenOperation. Syntax: *having*(streamExpr, *and*(*eq*(field1, value1), *not*(*eq*(field2, value2)))) The having function will read the Tuples from the streamExpr and apply the boolean operation to each Tuple. If the boolean operation returns true the having stream will emit the Tuple. was (Author: joel.bernstein): One of the things that is also not specifically handled is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high-cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. My plan is to implement the following classes for the having logic: HavingStream, BooleanOperation, AndOperation, OrOperation, NotOperation, EqualsOperation, LessThenOperation, GreaterThenOperation. Syntax: *having*(streamExpr, *and*(*eq*(field1, value1), *eq*(field2, value2))) The having function will read the Tuples from the streamExpr and apply the boolean operation to each Tuple. 
If the boolean operation returns true the having stream will emit the Tuple. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
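The proposed `having(streamExpr, and(eq(...), not(eq(...))))` semantics can be sketched in plain Java. This is a rough illustration, not the HavingStream implementation: the `BooleanOperation` interface and the `and`/`or`/`not`/`eq` helpers below are hypothetical stand-ins that borrow the names from the comment, a tuple is assumed to be a simple `Map<String, Object>`, and the input "stream" is assumed to be an in-memory `Iterable` rather than a Solr TupleStream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the planned BooleanOperation; not Solr's API.
interface BooleanOperation {
    boolean evaluate(Map<String, Object> tuple);
}

final class HavingSketch {
    // eq(field, value): true when the tuple's field equals the value.
    // Assumption: Objects.equals is used, so Long 1 and Integer 1 differ.
    static BooleanOperation eq(String field, Object value) {
        return t -> Objects.equals(t.get(field), value);
    }

    static BooleanOperation and(BooleanOperation l, BooleanOperation r) {
        return t -> l.evaluate(t) && r.evaluate(t);
    }

    static BooleanOperation or(BooleanOperation l, BooleanOperation r) {
        return t -> l.evaluate(t) || r.evaluate(t);
    }

    static BooleanOperation not(BooleanOperation op) {
        return t -> !op.evaluate(t);
    }

    // Reads each tuple from the underlying stream and emits only those
    // for which the boolean operation returns true.
    static List<Map<String, Object>> having(Iterable<Map<String, Object>> stream,
                                            BooleanOperation op) {
        List<Map<String, Object>> emitted = new ArrayList<>();
        for (Map<String, Object> tuple : stream) {
            if (op.evaluate(tuple)) {
                emitted.add(tuple);
            }
        }
        return emitted;
    }
}
```

With these helpers, `having(tuples, and(eq("field1", v1), not(eq("field2", v2))))` mirrors the syntax in the comment. A real streaming version would presumably evaluate tuples one at a time as they are read on the worker nodes rather than collecting them into a list.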
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176 ] Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:32 PM: One of things that is also not specifically handled is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. My pan is to implement the following classes for the having logic: HavingStream BooleanOperation AndOperation OrOperation NotOperation EqualsOperation LessThenOperation GreaterThenOperation Syntax: having(streamExpr, and(equals(fieldName, value)) The having function will read the Tuples from the streamExpr and apply the boolean operation to each Tuple. If the boolean operation returns true the having stream will emit the Tuple. was (Author: joel.bernstein): One of things that is also not specifically handled is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. 
> Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
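The having logic proposed in the comment above (read each Tuple from an inner stream, apply a boolean operation, emit the Tuple only when the operation returns true) can be sketched in plain Java. This is only an illustration under stated assumptions, not the Solr implementation: tuples are modelled as `Map<String, Object>`, and `HavingSketch`, `eq`, `lt`, `gt`, `and`, `or`, `not` and `having` are hypothetical names standing in for the classes listed in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the proposed having logic; these are not Solr's classes. */
final class HavingSketch {

    /** A boolean test applied to one tuple (modelled here as a field-name-to-value map). */
    interface BooleanOperation {
        boolean evaluate(Map<String, Object> tuple);
    }

    /** Reads a numeric field, returning null when the field is absent or not a number. */
    private static Double number(Map<String, Object> tuple, String field) {
        Object v = tuple.get(field);
        return (v instanceof Number) ? Double.valueOf(((Number) v).doubleValue()) : null;
    }

    static BooleanOperation eq(String field, double value) {
        return t -> { Double v = number(t, field); return v != null && v.doubleValue() == value; };
    }

    static BooleanOperation lt(String field, double value) {
        return t -> { Double v = number(t, field); return v != null && v.doubleValue() < value; };
    }

    static BooleanOperation gt(String field, double value) {
        return t -> { Double v = number(t, field); return v != null && v.doubleValue() > value; };
    }

    /** True only when every nested operation is true. */
    static BooleanOperation and(BooleanOperation... ops) {
        return t -> {
            for (BooleanOperation op : ops) {
                if (!op.evaluate(t)) return false;
            }
            return true;
        };
    }

    /** True when at least one nested operation is true. */
    static BooleanOperation or(BooleanOperation... ops) {
        return t -> {
            for (BooleanOperation op : ops) {
                if (op.evaluate(t)) return true;
            }
            return false;
        };
    }

    static BooleanOperation not(BooleanOperation op) {
        return t -> !op.evaluate(t);
    }

    /** Emits only the tuples for which the boolean operation returns true. */
    static List<Map<String, Object>> having(Iterable<Map<String, Object>> stream, BooleanOperation op) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> tuple : stream) {
            if (op.evaluate(tuple)) out.add(tuple);
        }
        return out;
    }
}
```

With this sketch, `HavingSketch.having(tuples, HavingSketch.and(HavingSketch.eq("count", 19)))` mirrors the shape of the proposed `having(streamExpr, and(equals(fieldName, value)))` expression. A real stream would filter lazily rather than collect into a list.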
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709176#comment-15709176 ] Joel Bernstein edited comment on SOLR-8593 at 11/30/16 5:24 PM: One of things that is also not specifically handled is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. was (Author: joel.bernstein): One of things that is also not specifically handing is the HAVING clause. I think we should push down this capability to Solr as well so we can perform the HAVING logic on the worker nodes. In high cardinality use cases this will be a big performance improvement. We also need to develop a HavingStream to manage the having logic. I'll start the work for the HavingStream in this branch as it directly supports the Calcite integration. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. 
The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708937#comment-15708937 ] Joel Bernstein edited comment on SOLR-8593 at 11/30/16 3:58 PM: I've started to work on this ticket. As a first step I'm doing some refactoring on the SolrTable class to create methods for handling the different types of queries. After that I'll get the aggregationModes hooked up. was (Author: joel.bernstein): I've started to work on this ticket. As a first step I'm doing some refactoring to create methods for handling the different types of queries. After that I'll get the aggregationModes hooked up. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700686#comment-15700686 ] Joel Bernstein edited comment on SOLR-8593 at 11/28/16 2:34 AM: Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket and it's looking really good! A couple pieces of functionality that appear to be missing: 1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we can also push down the distinct logic to the JSON Facet API. 2) The pushing down of GROUP BY aggregations to the JSON Facet API. Both of these currently require the aggregationMode parameter to be passed in with the query, which I think is fine for the initial Calcite release. I'd be happy to add these capabilities to this branch. That will also give me an opportunity to work with the code and feel comfortable working with Calcite. was (Author: joel.bernstein): Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket and it's looking really good! A couple pieces of functionality that appear to be missing are: 1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we can also push down the distinct logic to the JSON Facet API. 2) The pushing down of GROUP BY aggregations to the JSON Facet API. Both of these currently require the aggregationMode parameter to be passed in with the query, which I think is fine for the initial Calcite release. I'd be happy to add these capabilities to this branch. That will also give me an opportunity to work with the code and feel comfortable working with Calcite. 
> Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15700686#comment-15700686 ] Joel Bernstein edited comment on SOLR-8593 at 11/28/16 2:07 AM: Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket and it's looking really good! A couple pieces of functionality that appear to be missing are: 1) Specific handling of SELECT DISTINCT queries. In the current SQLHandler we can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we can also push down the distinct logic to the JSON Facet API. 2) The pushing down of GROUP BY aggregations to the JSON Facet API. Both of these currently require the aggregationMode parameter to be passed in with the query, which I think is fine for the initial Calcite release. I'd be happy to add these capabilities to this branch. That will also give me an opportunity to work with the code and feel comfortable working with Calcite. was (Author: joel.bernstein): Hi [~risdenk] and [~caomanhdat]. I've reviewed the latest work on this ticket and it's looking really good! A couple pieces of functionality that appear to be missing are: 1) Handling of SELECT DISTINCT queries. In the current SQLHandler we can do MapReduce SELECT DISTINCT queries in parallel on worker nodes. And we can also push down the distinct logic to the JSON Facet API. 2) The pushing down of GROUP BY aggregations to the JSON Facet API. Both of these currently require the aggregationMode parameter to be passed in with the query, which I think is fine for the initial Calcite release. I'd be happy to add these capabilities to this branch. That will also give me an opportunity to work with the code and feel comfortable working with Calcite. 
> Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664613#comment-15664613 ] Cao Manh Dat edited comment on SOLR-8593 at 11/14/16 6:30 PM: -- But we want to handle HAVING clauses that do not use a function ourselves (because Solr will run this filter faster than Calcite), for example {code} having field_i = 19 {code} and leave the other cases for Calcite to handle. Is there a better way to do this kind of filter? was (Author: caomanhdat): But we wanna to handle having clause without the function like {code} having field_i = 19 {code} Is there any better ways to do this kind of filter? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15664576#comment-15664576 ] Julian Hyde edited comment on SOLR-8593 at 11/14/16 6:11 PM: - Regarding the alias for "count(\*)". I guess one approach is to extend Calcite to allow a pluggable alias derivation (it has to be pluggable because you can't please everyone). Another approach is to leave the aliases as they are but generate field names for the JSON result set. Note that if you call SqlNode.getParserPosition() on each item in the select clause it will tell you the start and end point of that expression in the original SQL string, so you can extract the "count(\*)" using that information. I don't think the following should be valid, but under your proposed change it would be: {code} SELECT deptno FROM ( SELECT deptno, count(*) FROM emp GROUP BY deptno) AS t WHERE t."count(*)" > 3 {code} Note that "count(\*)" is not an expression; it is a reference to a "column" produced by the sub-query. In my opinion, using a textual expression is very confusing, and we should not do it. The derived alias of {{count(\*)}} should be something not easily guessable, which will encourage users to use an alias: {code} SELECT deptno FROM ( SELECT deptno, count(*) AS c FROM emp GROUP BY deptno) AS t WHERE t.c > 3 {code} was (Author: julianhyde): Regarding the alias for "count(*)". I guess one approach is to extend Calcite to allow a pluggable alias derivation (it has to be pluggable because you can't please everyone). Another approach is to leave the aliases as they are but generate field names for the JSON result set. Note that if you call SqlNode.getParserPosition() on each item in the select clause it will tell you the start and end point of that expression in the original SQL string, so you can extract the "count(*)" using that information. 
I don't think the the following should be valid, but under your proposed change it would be: {code} SELECT deptno FROM ( SELECT deptno, count(\*) FROM emp GROUP BY deptno) AS t WHERE t."count(*)" > 3 {code} Note that "count(\*)" is not an expression; it is a reference to a "column" produced by the sub-query. In my opinion, using a textual expression is very confusing, and we should not do it. Derived alias of {{count(\*)}} should be something not easily guessable, which will encourage users to use an alias: {code} SELECT deptno FROM ( SELECT deptno, count(\*) AS c FROM emp GROUP BY deptno) AS t WHERE t.c > 3 {code} > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
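The second approach mentioned in the comment above (keep Calcite's aliases, but recover the original text of each select item from its parser position) can be sketched without depending on Calcite. This is a minimal illustration under assumptions: `SelectItemText`, `Pos`, `offset` and `textAt` are hypothetical names, and the position is assumed to use 1-based line and column numbers with an inclusive end column. Those conventions should be checked against the position object that `SqlNode.getParserPosition()` actually returns.

```java
/** Hypothetical helper that cuts the original text of a select item out of the SQL string. */
final class SelectItemText {

    /** Stand-in for a parser position: 1-based line/column, end column inclusive (assumed). */
    static final class Pos {
        final int line;
        final int column;
        final int endLine;
        final int endColumn;

        Pos(int line, int column, int endLine, int endColumn) {
            this.line = line;
            this.column = column;
            this.endLine = endLine;
            this.endColumn = endColumn;
        }
    }

    /** Converts a 1-based (line, column) pair to a 0-based character offset into sql. */
    static int offset(String sql, int line, int column) {
        int lineStart = 0;
        for (int l = 1; l < line; l++) {
            int newline = sql.indexOf('\n', lineStart);
            if (newline < 0) {
                throw new IllegalArgumentException("line out of range: " + line);
            }
            lineStart = newline + 1;
        }
        return lineStart + column - 1;
    }

    /** Returns the substring of sql covered by pos, including the end column. */
    static String textAt(String sql, Pos pos) {
        int start = offset(sql, pos.line, pos.column);
        int endExclusive = offset(sql, pos.endLine, pos.endColumn) + 1;
        return sql.substring(start, endExclusive);
    }
}
```

The extracted text would only be used as a label in the JSON result set. The alias seen by the SQL validator stays unchanged, which avoids the `t."count(*)"` ambiguity described above.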
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15661016#comment-15661016 ] Cao Manh Dat edited comment on SOLR-8593 at 11/13/16 7:12 AM: -- Thanks [~risdenk] [~julianhyde] BTW: How can we get rid of the EXPR$1 returned for {code} select str_s, count(*) from collection1 {code} The result set returns the name of the second column as {{EXPR$1}}, not {{count( * )}} as expected. was (Author: caomanhdat): Thanks [~risdenk] [~julianhyde] BTW: How can we get rid of EXPR$1 return in {code} select str_s, count(*) from collection1 {code} the result set return the name of second column as {{EXPR$1}}, {{not count( * )}} as expected > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15661016#comment-15661016 ] Cao Manh Dat edited comment on SOLR-8593 at 11/13/16 7:11 AM: -- Thanks [~risdenk] [~julianhyde] BTW: How can we get rid of EXPR$1 return in {code} select str_s, count(*) from collection1 {code} the result set return the name of second column as {{EXPR$1}}, {{not count( * )}} as expected was (Author: caomanhdat): Thanks [~risdenk] [~julianhyde] BTW: How can we get rid of EXPR$1 return in {code} select str_s, count(*) from collection1 {code} the result set return the name of second column as {{EXPR$1}}, {{not count(*)}} as expected > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15659361#comment-15659361 ] Cao Manh Dat edited comment on SOLR-8593 at 11/12/16 9:25 AM: -- Patch based on pull/104. Fixed most of the failing tests (some tests were modified slightly to be more accurate). There are still some things we have to do, but it is pretty close now. was (Author: caomanhdat): Patch based on pull/104. Fixed most of the failed tests ( some tests is modified a little bit to more accurate ). > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > Attachments: SOLR-8593.patch > > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658025#comment-15658025 ] Julian Hyde edited comment on SOLR-8593 at 11/11/16 8:01 PM: - By the way, when you're ready, please add Solr to the [powered by Calcite|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 for details. was (Author: julianhyde): By the way, when you're ready, add please Solr to the [powered by|https://calcite.apache.org/docs/powered_by.html] page; see CALCITE-1112 for details. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652093#comment-15652093 ] Joel Bernstein edited comment on SOLR-8593 at 11/9/16 9:32 PM: --- Very excited about this patch. After a day of review I think I understand how this comes together. The next step is for me to get it running. The initial run of the test cases fails, probably due to the classpath issue that [~risdenk] mentions. I'll work on getting it running and then compare the feature set with the current release. was (Author: joel.bernstein): Very excited about this patch. After a day of review I think understand how this comes together. Next step is for me to get it running. Initial run of the test cases fail, probably due to classpath issue that [~risdenk] mentions. I'll work on getting it running as the next step and compare the feature set with the current release. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Assignee: Joel Bernstein > >The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15513607#comment-15513607 ] Kevin Risden edited comment on SOLR-8593 at 10/4/16 6:31 PM: - Adding some resources that may be helpful: * http://www.slideshare.net/HadoopSummit/costbased-query-optimization * https://medium.com/@mpathirage/query-planning-with-apache-calcite-part-1-fe957b011c36#.ywd9ouxmv * http://www.slideshare.net/JordanHalterman/introduction-to-apache-calcite was (Author: risdenk): Adding some resources that may be helpful: * http://www.slideshare.net/HadoopSummit/costbased-query-optimization * https://medium.com/@mpathirage/query-planning-with-apache-calcite-part-1-fe957b011c36#.ywd9ouxmv > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378775#comment-15378775 ] Joel Bernstein edited comment on SOLR-8593 at 7/15/16 2:53 AM: --- Ah, I see. I think we can pull off CoGroup with a merge() and reduce(). {code} reduce(merge(search(..., sort="a asc"), search(..., sort="a asc"), on="a asc"), by="a", group(sort="b asc", n="5")) {code} This operation is more generic in that it could be an aggregation or a join depending on what comes after the reduce. Fun stuff. was (Author: joel.bernstein): Ah, I see. I think we can pull off CoGroup with a merge() and reduce(). {code} reduce(merge(search(..., sort="a asc"), search(..., sort="a asc"), on="a asc")), by="a", group(sort="b asc", n="5")) {code} This operation is more generic in that it could be an aggregation or a join depending what comes after the reduce. Fun stuff. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
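The merge() plus reduce() idea in the comment above can be illustrated in plain Java. The sketch below is not Solr code and makes several assumptions: tuples are `Map<String, Object>`, both inputs are already sorted ascending on the key field, and `CoGroupSketch`, `merge`, `reduce` and `compare` are hypothetical names. `merge` interleaves the two sorted inputs, and `reduce` groups adjacent tuples that share a key, then keeps the first n of each group ordered by a second field, which roughly mirrors `group(sort="b asc", n="5")`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of co-grouping two sorted inputs; not Solr's stream classes. */
final class CoGroupSketch {

    /** Compares two tuples on one field; field values are assumed to be mutually Comparable. */
    @SuppressWarnings("unchecked")
    static int compare(Map<String, Object> x, Map<String, Object> y, String field) {
        return ((Comparable<Object>) x.get(field)).compareTo(y.get(field));
    }

    /** Merges two lists that are each already sorted ascending on the key field. */
    static List<Map<String, Object>> merge(List<Map<String, Object>> left,
                                           List<Map<String, Object>> right,
                                           String key) {
        List<Map<String, Object>> out = new ArrayList<>(left.size() + right.size());
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            if (compare(left.get(i), right.get(j), key) <= 0) {
                out.add(left.get(i++));
            } else {
                out.add(right.get(j++));
            }
        }
        while (i < left.size()) out.add(left.get(i++));
        while (j < right.size()) out.add(right.get(j++));
        return out;
    }

    /** Groups adjacent tuples with equal keys, then keeps the first n per group sorted by sortField. */
    static List<List<Map<String, Object>>> reduce(List<Map<String, Object>> sorted,
                                                  String by, String sortField, int n) {
        List<List<Map<String, Object>>> groups = new ArrayList<>();
        List<Map<String, Object>> current = null;
        Object currentKey = null;
        for (Map<String, Object> tuple : sorted) {
            Object key = tuple.get(by);
            // A change of key starts a new group; this relies on the input being sorted on the key.
            if (current == null || !key.equals(currentKey)) {
                current = new ArrayList<>();
                groups.add(current);
                currentKey = key;
            }
            current.add(tuple);
        }
        for (List<Map<String, Object>> group : groups) {
            group.sort((x, y) -> compare(x, y, sortField));
            if (group.size() > n) {
                group.subList(n, group.size()).clear();
            }
        }
        return groups;
    }
}
```

Because each group holds tuples from both inputs for one key, a later step could aggregate over the group or pair up the two sides, which is the sense in which the operation could be an aggregation or a join depending on what comes after the reduce.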
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378775#comment-15378775 ] Joel Bernstein edited comment on SOLR-8593 at 7/15/16 2:52 AM: --- Ah, I see. I think we can pull off CoGroup with a merge() and reduce(). {code} reduce(merge(search(..., sort="a asc"), search(..., sort="a asc"), on="a asc")), by="a", group(sort="b asc", n="5")) {code} This operation is more generic in that it could be an aggregation or a join depending what comes after the reduce. Fun stuff. was (Author: joel.bernstein): Ah, I see. I think we can pull off CoGroup with a merge() and reduce(). {code} reduce(merge(search(..., sort="a asc"), search(..., sort="a asc"), on="a asc")), by="a", group(sort="b asc", n="5")) {code} This operation is more generic in that it could be an aggregation or a join depending what comes after the reduce. Fun stuff. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665 ] Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:33 AM: The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Expressions. Pseudo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code} Our basic strategy is to build up a Streaming Expression that implements Calcite's relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. was (Author: joel.bernstein): The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Exrpressions. Psuedo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code} Our basic strategy is to build up a Streaming Expressions that implements Calcites relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
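The rollup-around-a-join pseudo-code in the comment above can be pictured with a small Python sketch. Everything here is illustrative: inner_join and rollup_sum are hypothetical helpers, the sample tuples are invented, and the join is hash-based for brevity, whereas Solr's innerJoin is, as far as I understand it, a merge join over inputs sorted on the join keys.

```python
from collections import defaultdict


def inner_join(left, right, left_key, right_key):
    """Hypothetical stand-in for innerJoin(on="a=b", ...), hash-based for brevity."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    for l in left:
        # Emit one combined tuple per matching pair; unmatched tuples are dropped.
        for r in index.get(l[left_key], []):
            yield {**r, **l}


def rollup_sum(tuples, field, over):
    """Hypothetical stand-in for rollup(sum(x), over="y", ...)."""
    totals = defaultdict(int)
    for t in tuples:
        totals[t[over]] += t[field]
    return dict(totals)


# Made-up sample data: x lives on the left input, y on the right input.
left = [{"a": 1, "x": 10}, {"a": 2, "x": 5}, {"a": 3, "x": 7}]
right = [{"b": 1, "y": "red"}, {"b": 2, "y": "red"}, {"b": 4, "y": "blue"}]

totals = rollup_sum(inner_join(left, right, "a", "b"), field="x", over="y")
print(totals)
```

Only the tuples with a=1 and a=2 find a partner, so the aggregate is computed over the joined rows, which is the CoGroup-like behaviour the comment describes.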
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665 ] Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:31 AM: The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Expressions. Pseudo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code} Our basic strategy is to build up a Streaming Expressions that implements Calcite's relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. was (Author: joel.bernstein): The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Expressions. Pseudo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code} Our basic strategy is to build up a Streaming Expressions that implements Calcite's relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play.
It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15378665#comment-15378665 ] Joel Bernstein edited comment on SOLR-8593 at 7/15/16 12:30 AM: The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Expressions. Pseudo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code} Our basic strategy is to build up a Streaming Expressions that implements Calcite's relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. was (Author: joel.bernstein): The CoGroup operator is interesting. Solr's Streaming Expressions provide some similar operations (https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions). CoGroup seems similar to a rollup() function wrapped around a join in Streaming Expressions. Pseudo-code: {code} rollup(sum(x), over="y", innerJoin(on="a=b", search(...), search(...))) {code) Our basic strategy is to build up a Streaming Expressions that implements Calcite's relational algebra. We've done this already with the Presto parser, but without joins. Looking forward to diving deeper into Calcite. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play.
It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276476#comment-15276476 ] Joel Bernstein edited comment on SOLR-8593 at 5/9/16 3:09 PM: -- I think my main concern is that the join pushdown rules haven't been exercised that much in production. Also, do we have access to the SQL catalog and statistics from inside the rules engine? We'll need that information to decide which type of join to do. I guess we can make the catalog globally accessible if the rules API doesn't provide hooks into it. was (Author: joel.bernstein): I think my main concern is that the join pushdown rules haven't been exercised that much in production. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15276420#comment-15276420 ] Joel Bernstein edited comment on SOLR-8593 at 5/9/16 2:25 PM: -- [~risdenk], I've been reviewing the work on https://github.com/risdenk/solr-calcite-example. I think I understand the basics of how this works but there are some mysteries as to how exactly the rules get triggered and where to place rules like join push downs. Do you know of an existing adapter that does join push downs that can be used as a reference implementation? And what's the best source of documentation for implementing rules? I haven't found anything that I would describe as comprehensive. was (Author: joel.bernstein): [~risdenk], I've been reviewing the work on https://github.com/risdenk/solr-calcite-example. I think I understand the basics of how this works but there are some mysteries as to how exactly the rules get triggered and where to place rules like join push downs. Do you know of an existing adapter that does join push downs that can be used a reference implementation? And what's the best source of documentation for implementing rules? I haven't found anything that I would describe as comprehensive. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release.
The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895 ] Kevin Risden edited comment on SOLR-8593 at 4/29/16 2:28 PM: - Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * -figure out avg(int) problem in tests.- ** -avg(int) returns int by design. need to figure out if casting is right for the tests- * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. * handle added dependencies properly -and upgrade to latest Calcite/Avatica- was (Author: risdenk): Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. 
need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. * handle added dependencies properly -and upgrade to latest Calcite/Avatica- > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15264057#comment-15264057 ] Joel Bernstein edited comment on SOLR-8593 at 4/29/16 1:38 PM: --- This is awesome. I like the idea of making this a 6.2 priority. I should have some time over the next couple of days to dig into the implementation. was (Author: joel.bernstein): This is awesome. I like the idea of making this a 6.2 priority. I should have some time over the next couple of days dig into the implementation. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895 ] Kevin Risden edited comment on SOLR-8593 at 4/29/16 1:23 AM: - Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. * handle added dependencies properly -and upgrade to latest Calcite/Avatica- was (Author: risdenk): Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. 
need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. * handle added dependencies properly and upgrade to latest Calcite/Avatica? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895 ] Kevin Risden edited comment on SOLR-8593 at 4/28/16 8:20 PM: - Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. * handle added dependencies properly and upgrade to latest Calcite/Avatica? was (Author: risdenk): Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. 
need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262895#comment-15262895 ] Kevin Risden edited comment on SOLR-8593 at 4/28/16 8:15 PM: - Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for aggregationMode (facets and map_reduce) and their parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite * figure out avg(int) problem in tests. ** avg(int) returns int by design. need to figure out if casting is right for the tests * figure out sort asc by default in tests ** this currently doesn't sort properly even though I thought that the right approach was sort on _version_. was (Author: risdenk): Ok made a bunch of progress on the jira/solr-8593 branch: * cleaned up the tests so that there are only a few remaining items to address (outlined below) * added support for float/double types * fixed a CloudSolrClient resource leak Left to do: * Add support for facets and map_reduce as parameters * ensure the pushdown to facets/map_reduce works correctly * figure out the CloudSolrClient cache (currently not caching and creating new per stream) * Push down aggregates to Solr * add tests to ensure the proper plan is being generated by Calcite > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. 
It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262361#comment-15262361 ] Kevin Risden edited comment on SOLR-8593 at 4/28/16 3:46 PM: - Pushed initial version to branch jira/solr-8593. Currently there are a lot of tests in TestSQLHandler that are commented out. Need to go back through and make sure they have the expected results. Also some file formatting related to headers needs to be addressed. was (Author: risdenk): Pushed initial version to branch jira/solr-8593. Currently there are a lot of tests in SQLHandler that are commented out. Need to go back through and make sure they have the expected results. Also some file formatting related to headers needs to be addressed. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208874#comment-15208874 ] Kevin Risden edited comment on SOLR-8593 at 3/23/16 5:57 PM: - Another thing I learned from looking closely at the TestSQLHandler tests is that collection name is case insensitive with the current implementation. This seems wrong to me because collections are case sensitive. This is tested in the testMixedCaseFields method. was (Author: risdenk): Another thing I learned from looking closely at the TestSQLHandler tests is that collection name is case insensitive with the current implementation. This seems wrong to me because collections are case sensitive. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207685#comment-15207685 ] Joel Bernstein edited comment on SOLR-8593 at 3/23/16 1:29 AM: --- Currently quoted identifiers do refer to columns. This was originally done because Presto didn't support mixed case columns unless they were quoted. But Presto fixed that problem. So the quoted identifiers as they are now don't really serve a purpose. But I do believe that both Presto and Calcite allow columns with quoted identifiers to support non parseable identifiers. was (Author: joel.bernstein): Currently quoted identifiers do refer to columns. This was originally done because Presto didn't support mixed case columns unless they were quoted. But Presto fixed that problem. So the quoted identifiers as they are now don't really serve a purpose. But I do believe that both Presto and Calcite allow columns for quoted identifiers to support non parseable identifiers. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. 
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207593#comment-15207593 ] Joel Bernstein edited comment on SOLR-8593 at 3/23/16 12:25 AM: I looked through the code and I'm seeing how CloudSolrStream is being used. But it's not clear to me we'll be able to implement the full range of capabilities through this approach. For example: 1) Can we choose between the FacetStream and a parallel RollupStream based on the costs of the different approaches? 2) Can we do parallel joins using Solr's shuffling capabilities and Solr workers? was (Author: joel.bernstein): I looked through the code and I'm seeing how CloudSolrStream is being used. But it's not clear to me we'll be able to implement that full range of capabilities through this approach. For example: 1) Can we choose between the FacetStream and a parallel RollupStream based on the costs of the different approaches? 2) Can we do parallel joins using Solr's shuffling capabilities and Solr workers? > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work. 
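The first question in the comment above (choosing between the FacetStream and a parallel RollupStream by cost) can be illustrated with a toy comparison. This is only a sketch: `GroupByStats`, `facet_cost`, `rollup_cost`, and `choose_aggregation` are hypothetical names, and the cost formulas are invented for illustration. They are not taken from Solr or Calcite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupByStats:
    """Hypothetical statistics an optimizer might have for a GROUP BY query."""
    matching_docs: int    # documents matched by the WHERE clause
    distinct_groups: int  # estimated number of distinct group-by keys
    workers: int          # worker nodes available for a parallel rollup


def facet_cost(stats: GroupByStats) -> float:
    # Assumption: a facet-style aggregation is pushed down into the search
    # engine, so its cost is dominated by the number of groups to merge.
    return float(stats.distinct_groups)


def rollup_cost(stats: GroupByStats) -> float:
    # Assumption: a rollup streams every matching document in sorted order,
    # and the work divides evenly across the available workers.
    return stats.matching_docs / max(stats.workers, 1)


def choose_aggregation(stats: GroupByStats) -> str:
    """Return the name of the cheaper strategy under the toy cost model."""
    if facet_cost(stats) <= rollup_cost(stats):
        return "facet"
    return "parallel_rollup"
```

Under these assumed formulas, a low-cardinality grouping favors the facet approach, while a grouping with nearly as many keys as documents favors the parallel rollup. A real optimizer would need measured costs rather than these placeholders.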
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206919#comment-15206919 ] Kevin Risden edited comment on SOLR-8593 at 3/22/16 6:08 PM: - Here is a separate approach that uses all of Calcite and the JDBCStream: https://github.com/risdenk/lucene-solr/compare/master...risdenk:calcite-sql-handler It removes all the custom processing from SQLHandler, wraps Calcite in a JDBCStream, and executes the query. There is something I learned about TestSQLHandler that I'm not sure is correct: * quoted identifiers - like 'id' and 'text' aren't valid? These shouldn't be referring to columns? Things to be explored with this approach: * switch from a standard query in SolrEnumerator to a stream * fix data types * optimize cases like where * code cleanup since it was just thrown together as a POC was (Author: risdenk): Here is a separate approach that uses all of Calcite and the JDBCStream: https://github.com/risdenk/lucene-solr/compare/master...risdenk:calcite-sql-handler It removes all the custom processing from SQLHandler, wraps Calcite in a JDBCStream, and executes the query. There is something I learned about TestSQLHandler that I'm not sure is correct: * quoted identifiers - like 'id' and 'text' aren't valid? These shouldn't be referring to columns? Things to be explored with this approach: * switch from a standard query in SolrEnumerator to a stream * fix data types * optimize cases like where > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. 
It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work.
[jira] [Comment Edited] (SOLR-8593) Integrate Apache Calcite into the SQLHandler
[ https://issues.apache.org/jira/browse/SOLR-8593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206730#comment-15206730 ] Joel Bernstein edited comment on SOLR-8593 at 3/22/16 4:38 PM: --- [~risdenk] and I have been looking into different approaches for this ticket. One of the approaches is to embed the Calcite SQL parser and optimizer inside the SQLHandler. The entry point for this appears to be: https://calcite.apache.org/apidocs/org/apache/calcite/tools/Planner.html Using this approach we would need to implement two things: 1) A CatalogReader, which the Calcite validator and optimizer will use to do their job. The underlying implementation of this should work for the JDBC driver also, so we kill two big birds with one stone when this is implemented. 2) A custom RelVisitor, which will rewrite the relational algebra tree (RelNode) created by the optimizer. The RelNode tree will need to be mapped to the Streaming API. Since the Streaming API already supports parallel relational algebra this should be fairly straightforward. This approach would leave the Solr JDBC driver basically as it is, but provide all the hooks needed to finish off the remaining Catalog metadata methods. was (Author: joel.bernstein): [~risdenk] and I have been looking into different approaches for this ticket. One of the approaches is to embed the Calcite SQL parser and optimizer inside the SQLHandler. The entry point for this appears to be: https://calcite.apache.org/apidocs/org/apache/calcite/tools/Planner.html Using this approach we would need to implement two things: 1) A CatalogReader, which the Calcite validator and optimizer will use to do their job. The underlying implementation of this should work for the JDBC driver also, so we kill two big birds with one stone when this implemented. 2) A custom RelVisitor, which will rewrite the relational algebra tree (RelNode) created by the optimizer. The RelNode tree will need to be mapped to the Streaming API.
Since the Streaming API already supports parallel relational algebra this should be fairly straightforward. This approach would leave the Solr JDBC driver basically as it is, but provide all the hooks needed to finish off the remaining Catalog metadata methods. > Integrate Apache Calcite into the SQLHandler > > > Key: SOLR-8593 > URL: https://issues.apache.org/jira/browse/SOLR-8593 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein > Fix For: master > > > The Presto SQL Parser was perfect for phase one of the SQLHandler. It was > nicely split off from the larger Presto project and it did everything that > was needed for the initial implementation. > Phase two of the SQL work though will require an optimizer. Here is where > Apache Calcite comes into play. It has a battle tested cost based optimizer > and has been integrated into Apache Drill and Hive. > This work can begin in trunk following the 6.0 release. The final query plans > will continue to be translated to Streaming API objects (TupleStreams), so > continued work on the JDBC driver should plug in nicely with the Calcite work.
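Step 2 of the comment above (walking the optimizer's relational algebra tree and mapping it to the Streaming API) might look roughly like the following language-neutral sketch. It uses no Calcite or Solr classes: `Scan`, `Filter`, `Aggregate`, and `to_expr` are hypothetical stand-ins for RelNode types and a visitor, and the output is a streaming-expression-style string rather than real TupleStream objects.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Scan:
    """Leaf node: read the given fields from a collection."""
    collection: str
    fields: Tuple[str, ...]


@dataclass(frozen=True)
class Filter:
    """Restrict the input rows with a Solr query string."""
    input: "Node"
    query: str


@dataclass(frozen=True)
class Aggregate:
    """Group the input by one field and compute one metric, e.g. sum(a_i)."""
    input: "Node"
    group_by: str
    metric: str


Node = Union[Scan, Filter, Aggregate]


def to_expr(node: Node, query: str = "*:*", sort: Optional[str] = None) -> str:
    """Recursively translate a plan tree into a streaming-expression string."""
    if isinstance(node, Scan):
        fl = ",".join(node.fields)
        sort_field = sort or node.fields[0]
        return (f'search({node.collection}, q="{query}", fl="{fl}", '
                f'sort="{sort_field} asc", qt="/export")')
    if isinstance(node, Filter):
        # Push the predicate down into the underlying search.
        if query == "*:*":
            combined = node.query
        else:
            combined = f"({query}) AND ({node.query})"
        return to_expr(node.input, query=combined, sort=sort)
    if isinstance(node, Aggregate):
        # A rollup needs its input sorted by the grouping field.
        inner = to_expr(node.input, query=query, sort=node.group_by)
        return f'rollup({inner}, over="{node.group_by}", {node.metric})'
    raise TypeError(f"unsupported node type: {type(node).__name__}")
```

For example, `to_expr(Aggregate(Filter(Scan("collection1", ("a_s", "a_i")), "a_i:[1 TO *]"), "a_s", "sum(a_i)"))` yields a `rollup(...)` wrapped around a sorted `search(...)`. The point of the sketch is only that each relational operator maps onto one streaming operator, with filters and sort requirements pushed down toward the scan.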