[jira] [Updated] (SPARK-11012) Canonicalize view definitions
[ https://issues.apache.org/jira/browse/SPARK-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-11012:
------------------------------
    Assignee: Yin Huai

> Canonicalize view definitions
> -----------------------------
>
>                 Key: SPARK-11012
>                 URL: https://issues.apache.org/jira/browse/SPARK-11012
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 2.0.0
>
>
> In SPARK-10337, we added the first step of supporting views natively, which is
> basically wrapping the original view definition SQL text with an extra
> {{SELECT}} and then storing the wrapped SQL text in the metastore. This approach
> suffers from at least two issues:
> # Switching the current database may break view queries
> # HiveQL doesn't allow a CTE as a subquery, so CTEs can't be used in view
> definitions
> To fix these issues, we need to canonicalize the view definition. For
> example, for a SQL string
> {code:sql}
> SELECT a, b FROM table
> {code}
> we will save this text to the Hive metastore as
> {code:sql}
> SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`
> {code}
> The core infrastructure of this work is SQL query string generation
> (SPARK-12593), namely converting resolved logical query plans back to
> canonicalized SQL query strings. [PR
> #10541|https://github.com/apache/spark/pull/10541] set up the basic
> infrastructure of SQL generation, but more language structures need to be
> supported.
> [PR #10541|https://github.com/apache/spark/pull/10541] added round-trip
> testing infrastructure for SQL generation. All queries tested by test suites
> extending {{HiveComparisonTest}} are executed in the following order:
> # Parsing the query string to a logical plan
> # Converting the resolved logical plan back to a canonicalized SQL query string
> # Executing the generated SQL query string
> # Comparing query results with golden answers
> Note that not every resolved logical query plan can be converted back to a SQL
> query string, either because it contains some language structure that is
> not supported yet, or because it inherently has no SQL representation
> (e.g. query plans built on top of local Scala collections).
> If a logical plan is inconvertible, {{HiveComparisonTest}} falls back to its
> original behavior, namely executing the original SQL query string and comparing
> the results with golden answers.
> SQL generation details are logged and can be found in
> {{sql/hive/target/unit-tests.log}} (log level should be at least DEBUG).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
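
The round-trip flow with its fallback, as described in the issue text, can be sketched in a few lines. This is a minimal illustration in Python, not the code of {{HiveComparisonTest}}: the names `round_trip_check`, `toy_generate`, and `toy_execute` are hypothetical, and the parse/resolve/generate steps are collapsed into a single callable that returns `None` for an inconvertible plan.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[object, ...]


def round_trip_check(
    query: str,
    generate_sql: Callable[[str], Optional[str]],
    execute: Callable[[str], List[Row]],
    golden: List[Row],
) -> Tuple[str, bool]:
    """Run one query through the round-trip flow.

    generate_sql stands in for "parse, resolve, convert back to SQL"
    (hypothetical); it returns None when the plan is inconvertible.
    Returns the SQL text that was executed and whether the results
    matched the golden answers.
    """
    generated = generate_sql(query)
    # Fall back to the original query text when no SQL could be generated.
    sql_to_run = generated if generated is not None else query
    return sql_to_run, execute(sql_to_run) == golden


# Toy stand-ins (assumptions for illustration only).
CANONICAL = {
    "SELECT a, b FROM table":
        "SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`",
}
RESULTS = {
    "SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`": [(1, 2)],
    "SELECT x FROM local_collection": [(3,)],
}


def toy_generate(query: str) -> Optional[str]:
    # Only queries listed in CANONICAL are treated as convertible.
    return CANONICAL.get(query)


def toy_execute(sql: str) -> List[Row]:
    # Look up canned results instead of running a real engine.
    return RESULTS[sql]
```

With these stand-ins, a convertible query is executed through its generated SQL text, while an inconvertible one is executed from the original text; both are compared against the same golden answers.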
[jira] [Updated] (SPARK-11012) Canonicalize view definitions
[ https://issues.apache.org/jira/browse/SPARK-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-11012:
-------------------------------
    Description:
In SPARK-10337, we added the first step of supporting views natively, which is basically wrapping the original view definition SQL text with an extra {{SELECT}} and then storing the wrapped SQL text in the metastore. This approach suffers from at least two issues:

# Switching the current database may break view queries
# HiveQL doesn't allow a CTE as a subquery, so CTEs can't be used in view definitions

To fix these issues, we need to canonicalize the view definition. For example, for a SQL string
{code:sql}
SELECT a, b FROM table
{code}
we will save this text to the Hive metastore as
{code:sql}
SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`
{code}

The core infrastructure of this work is SQL query string generation (SPARK-12593), namely converting resolved logical query plans back to canonicalized SQL query strings. [PR #10541|https://github.com/apache/spark/pull/10541] set up the basic infrastructure of SQL generation, but more language structures need to be supported.

[PR #10541|https://github.com/apache/spark/pull/10541] added round-trip testing infrastructure for SQL generation. All queries tested by test suites extending {{HiveComparisonTest}} are executed in the following order:

# Parsing the query string to a logical plan
# Converting the resolved logical plan back to a canonicalized SQL query string
# Executing the generated SQL query string
# Comparing query results with golden answers

Note that not every resolved logical query plan can be converted back to a SQL query string, either because it contains some language structure that is not supported yet, or because it inherently has no SQL representation (e.g. query plans built on top of local Scala collections).

If a logical plan is inconvertible, {{HiveComparisonTest}} falls back to its original behavior, namely executing the original SQL query string and comparing the results with golden answers.

SQL generation details are logged and can be found in {{sql/hive/target/unit-tests.log}} (log level should be at least DEBUG).

  was:
In SPARK-10337, we added the first step of supporting views natively, which is basically wrapping the original view definition SQL text with an extra {{SELECT}} and then storing the wrapped SQL text in the metastore. This approach suffers from at least two issues:

# Switching the current database may break view queries
# HiveQL doesn't allow a CTE as a subquery, so CTEs can't be used in view definitions

To fix these issues, we need to canonicalize the view definition. For example, for a SQL string
{code:sql}
SELECT a, b FROM table
{code}
we will save this text to the Hive metastore as
{code:sql}
SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`
{code}

The core infrastructure of this work is SQL query string generation (SPARK-12593), namely converting resolved logical query plans back to canonicalized SQL query strings. [PR #10541|https://github.com/apache/spark/pull/10541] set up the basic infrastructure of SQL generation, but more language structures need to be supported.

> Canonicalize view definitions
> -----------------------------
>
>                 Key: SPARK-11012
>                 URL: https://issues.apache.org/jira/browse/SPARK-11012
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
[jira] [Updated] (SPARK-11012) Canonicalize view definitions
[ https://issues.apache.org/jira/browse/SPARK-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-11012:
-------------------------------
    Description:
In SPARK-10337, we added the first step of supporting views natively, which is basically wrapping the original view definition SQL text with an extra {{SELECT}} and then storing the wrapped SQL text in the metastore. This approach suffers from at least two issues:

# Switching the current database may break view queries
# HiveQL doesn't allow a CTE as a subquery, so CTEs can't be used in view definitions

To fix these issues, we need to canonicalize the view definition. For example, for a SQL string
{code:sql}
SELECT a, b FROM table
{code}
we will save this text to the Hive metastore as
{code:sql}
SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`
{code}

The core infrastructure of this work is SQL query string generation (SPARK-12593), namely converting resolved logical query plans back to canonicalized SQL query strings. [PR #10541|https://github.com/apache/spark/pull/10541] set up the basic infrastructure of SQL generation, but more language structures need to be supported.

  was:
In SPARK-10337, we added the first step of supporting views natively. Building on top of that work, we need to canonicalize the view definition. So, for a SQL string SELECT a, b FROM table, we will save this text to the Hive metastore as SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`.

> Canonicalize view definitions
> -----------------------------
>
>                 Key: SPARK-11012
>                 URL: https://issues.apache.org/jira/browse/SPARK-11012
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
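
The canonicalization example in the description (qualifying every column with its table and the table with the current database) can be reproduced with a small string-building sketch. This is only an illustration in Python under stated assumptions: `quote_identifier` and `canonical_select` are hypothetical helpers, not Spark APIs, and the real work operates on resolved logical plans rather than on column and table names passed in by hand.

```python
from typing import Sequence


def quote_identifier(name: str) -> str:
    # Wrap a name in backticks; doubling an embedded backtick is an
    # assumption about the quoting rule, made for this sketch.
    return "`" + name.replace("`", "``") + "`"


def canonical_select(columns: Sequence[str], table: str, current_db: str) -> str:
    """Build a fully qualified SELECT for a single-table query.

    current_db is the database that was current when the view was
    defined; recording it in the text is what keeps the view stable
    when the session later switches databases.
    """
    quoted_table = quote_identifier(table)
    qualified_columns = ", ".join(
        f"{quoted_table}.{quote_identifier(column)}" for column in columns
    )
    return (
        f"SELECT {qualified_columns} "
        f"FROM {quote_identifier(current_db)}.{quoted_table}"
    )


canonical_text = canonical_select(["a", "b"], "table", "currentDB")
```

Because the database name is captured in the stored text, resolving the view no longer depends on whichever database happens to be current at query time, which addresses the first issue listed in the description.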
[jira] [Updated] (SPARK-11012) Canonicalize view definitions
[ https://issues.apache.org/jira/browse/SPARK-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-11012:
--------------------------------
    Issue Type: New Feature  (was: Bug)

> Canonicalize view definitions
> -----------------------------
>
>                 Key: SPARK-11012
>                 URL: https://issues.apache.org/jira/browse/SPARK-11012
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
>
> In SPARK-10337, we added the first step of supporting views natively. Building
> on top of that work, we need to canonicalize the view definition. So, for a
> SQL string SELECT a, b FROM table, we will save this text to the Hive metastore
> as SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`.
[jira] [Updated] (SPARK-11012) Canonicalize view definitions
[ https://issues.apache.org/jira/browse/SPARK-11012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-11012:
-------------------------------
    Affects Version/s: 2.0.0
     Target Version/s: 2.0.0

> Canonicalize view definitions
> -----------------------------
>
>                 Key: SPARK-11012
>                 URL: https://issues.apache.org/jira/browse/SPARK-11012
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Yin Huai
>
> In SPARK-10337, we added the first step of supporting views natively. Building
> on top of that work, we need to canonicalize the view definition. So, for a
> SQL string SELECT a, b FROM table, we will save this text to the Hive metastore
> as SELECT `table`.`a`, `table`.`b` FROM `currentDB`.`table`.