[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202116#comment-14202116 ]

Laljo John Pullokkaran commented on HIVE-8561:

The Hive Optiq Rel Node is going to be removed shortly as part of refactoring; Optiq/Calcite is also going through refactoring and renaming. With respect to a public API, an API to get the logical plan with or without optimization may be what you want. A public API needs to address:
1. What sort of queries can this API handle (select, insert into, create table as)?
2. What about security (is Hive going to enforce security, or is Drill)?
3. What about views?

Expose Hive optiq operator tree to be able to support other sql on hadoop query engines

Key: HIVE-8561
URL: https://issues.apache.org/jira/browse/HIVE-8561
Project: Hive
Issue Type: Task
Components: CBO
Affects Versions: 0.14.0
Reporter: Na Yang
Assignee: Na Yang
Attachments: HIVE-8561.2.patch, HIVE-8561.3.patch, HIVE-8561.patch

Hive 0.14 added cost-based optimization, and an Optiq operator tree is created for select queries. However, the Optiq operator tree is not visible from outside Hive and is hard for other SQL-on-Hadoop query engines, such as Apache Drill, to use. To allow Drill to access the Hive Optiq operator tree, we need to add a public API that returns it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
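To make the first question in Laljo John Pullokkaran's comment concrete, here is a minimal sketch of what a "logical plan with or without optimization" API could look like if it declared up front which statement kinds it handles. Every name in it (StatementKind, LogicalPlanProvider, SelectOnlyPlanProvider) is hypothetical and is not part of Hive or of any patch on this issue; the plan is represented as a plain string purely for illustration. Security and views, the other two questions, are not addressed here.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical statement kinds, taken from the three examples in the comment.
enum StatementKind { SELECT, INSERT_INTO, CREATE_TABLE_AS }

// Hypothetical public API: callers can ask which statements are supported
// and whether they want the plan before or after optimization.
interface LogicalPlanProvider {
    Set<StatementKind> supportedStatements();

    String getLogicalPlan(String sql, StatementKind kind, boolean optimize);
}

// Illustrative implementation that only accepts SELECT, mirroring the issue
// description's note that the operator tree is created for select queries.
final class SelectOnlyPlanProvider implements LogicalPlanProvider {
    @Override
    public Set<StatementKind> supportedStatements() {
        return EnumSet.of(StatementKind.SELECT);
    }

    @Override
    public String getLogicalPlan(String sql, StatementKind kind, boolean optimize) {
        if (!supportedStatements().contains(kind)) {
            // Fail loudly instead of returning a partial or misleading plan.
            throw new UnsupportedOperationException("Unsupported statement kind: " + kind);
        }
        // Placeholder for a real plan; a string keeps the sketch self-contained.
        return (optimize ? "optimized" : "unoptimized") + " plan for: " + sql;
    }
}
```

Declaring the supported statement kinds in the interface lets a caller such as Drill check capabilities before submitting a query rather than discovering the limits through failures.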
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200562#comment-14200562 ]

Brock Noland commented on HIVE-8561:

Hi [~nyang],

Thank you for the annotations! Since the CBO is such a huge component and is in its infancy, I feel like {{Unstable}} might be more appropriate than {{Evolving}}. However, before making that change I think we should settle with [~jpullokkaran] the correct way to perform this integration.

bq. Why can't Drill be plugged in as another execution engine just like MR, TEZ, Spark?

[~jpullokkaran] It's reasonable for Drill to add APIs in order to use the query plan. The Drill project, like many other projects, is a user of Hive. As mentioned previously, it's important to agree upon some kind of API visibility and stability. Na has agreed to an unstable interface (it is the caller's responsibility to follow the Hive-side changes). As one of the CBO experts, if there is a better alternative implementation, could you please share how this could be improved?
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201620#comment-14201620 ]

Na Yang commented on HIVE-8561:

[~brocknoland], thank you very much for the suggestion. I will change the annotation to Unstable after getting [~jpullokkaran]'s input.

[~jpullokkaran], I would appreciate it if you could give some suggestions on how to let Drill and other projects use the Hive query plan. Thanks.
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187069#comment-14187069 ]

Brock Noland commented on HIVE-8561:

I am not familiar with the CBO code, so it's possible we want to take a different approach. However, in principle I don't see an issue with the idea of this change.

Additionally, any API that we open to a limited group of downstream projects, such as Drill, should be marked with annotations. For example {{LimitedPrivate(Apache Hive, Apache Drill (Incubating))}} and then either {{Unstable}} or {{Evolving}}.

https://github.com/apache/hive/blob/trunk/common/src/java/org/apache/hadoop/hive/common/classification/InterfaceAudience.java#L34
https://github.com/apache/hive/blob/trunk/common/src/java/org/apache/hadoop/hive/common/classification/InterfaceStability.java#L34

We should do that for any change which opens APIs for Drill.

Additionally, I see we changed some members from Java private to Java public. Instead of doing that, we should add getters for those member variables.
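The two suggestions in Brock Noland's comment (audience and stability annotations, and getters instead of public fields) could be combined roughly as follows. This is only a sketch under stated assumptions: the LimitedPrivate and Unstable annotations below are local stand-ins modeled on Hive's InterfaceAudience and InterfaceStability classes so that the example compiles on its own, and PlannerSketch with its operatorTree field is a hypothetical class, not the code touched by the patch.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Stand-in for an audience annotation: names the projects allowed to call the API.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface LimitedPrivate {
    String[] value();
}

// Stand-in for a stability annotation: no compatibility guarantee between releases.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface Unstable {
}

// Hypothetical planner class; the name and the field are illustrative only.
class PlannerSketch {
    // The field stays private; outside callers use the annotated getter.
    private final String operatorTree;

    PlannerSketch(String operatorTree) {
        this.operatorTree = operatorTree;
    }

    // The annotations document who may call this and how stable it is.
    @LimitedPrivate({"Apache Hive", "Apache Drill (Incubating)"})
    @Unstable
    public String getOperatorTree() {
        return operatorTree;
    }
}
```

Keeping the field private means the internal representation can change later without widening the exposed surface beyond the single annotated accessor.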
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187125#comment-14187125 ]

Na Yang commented on HIVE-8561:

[~brocknoland], thank you for the suggestion. If the new public APIs are acceptable, I will make the change as you suggested and upload a new patch. Thanks.
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187582#comment-14187582 ]

Hive QA commented on HIVE-8561:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12677647/HIVE-8561.3.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6579 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_optimize_nullscan
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1508/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1508/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1508/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12677647 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14184103#comment-14184103 ]

Hive QA commented on HIVE-8561:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12677045/HIVE-8561.2.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6578 tests executed
*Failed tests:*
{noformat}
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1457/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1457/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1457/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12677045 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183151#comment-14183151 ]

Laljo John Pullokkaran commented on HIVE-8561:

Na Yang,

If I understand correctly, the goal of this patch is to use Hive for query parsing, resolving, and cost-based optimization, and to use Drill as the execution engine. If my guess is right, this patch makes Hive's Optiq Op tree a public interface.

Hive's Optiq Op tree is not meant to be a public interface, and it will go through many changes as we extend CBO to support more operators.

Why can't Drill be plugged in as another execution engine just like MR, TEZ, Spark?
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183700#comment-14183700 ]

Na Yang commented on HIVE-8561:

Hi [~xuefuz] and [~jpullokkaran], thank you both very much for reviewing this patch. I understand your concern about adding a public API that exposes the Optiq Op tree to the outside world. To address the risk of calling this API from outside, I added a warning to the Javadoc. It is the caller's responsibility to follow the Hive-side changes if they want to use this API to get Hive's Optiq Op tree.

[~jpullokkaran], your understanding is correct. The goal of this patch is to use Hive for query parsing and use Drill as the execution engine. Since Hive has already generated the Optiq Op tree, it is possible for Drill to use it directly.

Thanks
Regards,
Na
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180756#comment-14180756 ]

Hive QA commented on HIVE-8561:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12676411/HIVE-8561.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 6575 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1399/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1399/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1399/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12676411 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180921#comment-14180921 ]

Xuefu Zhang commented on HIVE-8561:

+1
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180924#comment-14180924 ]

Sergey Shelukhin commented on HIVE-8561:

[~jpullokkaran] do you want to also take a look?
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180927#comment-14180927 ]

Xuefu Zhang commented on HIVE-8561:

Sorry, I +1'd the wrong JIRA. I haven't reviewed this patch, but I'd be happy to if needed.
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180978#comment-14180978 ]

Na Yang commented on HIVE-8561:

Hi [~xuefuz], can you please help review this patch? This patch does not fix a Hive bug; rather, it provides support for other SQL-on-Hadoop query engines such as Apache Drill.

Thanks
Regards,
Na
[jira] [Commented] (HIVE-8561) Expose Hive optiq operator tree to be able to support other sql on hadoop query engines
[ https://issues.apache.org/jira/browse/HIVE-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181032#comment-14181032 ]

Xuefu Zhang commented on HIVE-8561:

Hi @Na Yang, I took a brief look. With my limited understanding of the CBO code, the patch looks fine. My only concern is that this seems to expose some of Hive's internals to the outside world. If Hive makes any changes to them, it might inadvertently break other applications. What are your thoughts on this? Is there a different approach that achieves the same purpose, or has Hive exposed anything similar to this?