This is an automated email from the ASF dual-hosted git repository.
lirui pushed a commit to branch release-1.13
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/release-1.13 by this push:
new 06f233e [FLINK-22119][hive][doc] Update document for hive dialect
06f233e is described below
commit 06f233ea2cfefd3eaab0b886e058076ec51974a4
Author: Rui Li <[email protected]>
AuthorDate: Sun Apr 25 10:32:53 2021 +0800
[FLINK-22119][hive][doc] Update document for hive dialect
---
.../docs/connectors/table/hive/hive_dialect.md | 81 ++++++++++++++++---
.../docs/connectors/table/hive/overview.md | 24 ++++++
.../docs/connectors/table/hive/hive_dialect.md | 93 +++++++++++++++++-----
.../content/docs/connectors/table/hive/overview.md | 24 ++++++
4 files changed, 192 insertions(+), 30 deletions(-)
diff --git a/docs/content.zh/docs/connectors/table/hive/hive_dialect.md b/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
index c705544..9840494 100644
--- a/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
+++ b/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
@@ -335,26 +335,85 @@ CREATE FUNCTION function_name AS class_name;
DROP FUNCTION [IF EXISTS] function_name;
```
-## DML
+## DML & DQL _`Beta`_
-### INSERT
+The Hive dialect supports the commonly used Hive [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+and [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select). The following lists some of the syntax supported by the Hive dialect.
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+For better syntax and semantic compatibility, it is highly recommended to use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+and place it first in the module list, so that Hive built-in functions take precedence during function resolution.
-If `partition_spec` is specified, it can be either a full or partial spec. If it is partial, the dynamic partition column names can be omitted.
+The Hive dialect no longer supports [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}). Switch to the `default` dialect if you want to write in Flink syntax.
+
+The following is an example of using the Hive dialect.
+
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
-## DQL
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
-At the moment, the Hive dialect supports the same syntax as Flink SQL for DQL. Refer to [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}) for more details. It is recommended to switch to the `default` dialect to execute DQL.
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
+
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
+
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+| 1 |
+| 2 |
+| 3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+| 1 | a |
+| 1 | a |
+| 5 | e |
+| 2 | b |
+| 3 | c |
+| 3 | c |
+| 3 | c |
+| 4 | d |
++-----+-------+
+8 rows in set
+```
## Notice
The following are some precautions for using the Hive dialect.
-- Hive dialect should only be used to manipulate Hive tables, not generic tables, and should be used together with a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect should only be used to process Hive meta objects, and requires the current catalog to be a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect only supports 2-part identifiers like `db.table`; identifiers with a catalog name are not supported.
- While all Hive versions support the same syntax, whether a specific feature is available still depends on the [Hive version]({{< ref "docs/connectors/table/hive/overview" >}}#支持的hive版本) you use. For example, updating database location is only supported in Hive-2.4.0 or later.
-- Hive and Calcite have different sets of reserved keywords. For example, `default` is a reserved keyword in Calcite but not in Hive. Even with the Hive dialect, you must quote such keywords with a backtick ( ` ) to use them as identifiers.
-- Due to incompatibility of the expanded query text, views created in Flink cannot be queried in Hive.
+- Use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule) when executing DML and DQL.
diff --git a/docs/content.zh/docs/connectors/table/hive/overview.md b/docs/content.zh/docs/connectors/table/hive/overview.md
index 1f43865..a32a3ff 100644
--- a/docs/content.zh/docs/connectors/table/hive/overview.md
+++ b/docs/content.zh/docs/connectors/table/hive/overview.md
@@ -127,6 +127,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
// Hive dependencies
hive-exec-2.3.4.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.0.0" >}}
@@ -146,6 +149,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.1.0" >}}
@@ -165,6 +171,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.2.1" >}}
@@ -184,6 +193,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.0.0" >}}
@@ -197,6 +209,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
// Hive dependencies
hive-exec-2.0.0.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.1.0" >}}
@@ -210,6 +225,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
// Hive dependencies
hive-exec-2.1.0.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.2.0" >}}
@@ -227,6 +245,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
orc-core-1.4.3.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 3.1.0" >}}
@@ -241,6 +262,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
hive-exec-3.1.0.jar
libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< /tabs >}}
diff --git a/docs/content/docs/connectors/table/hive/hive_dialect.md b/docs/content/docs/connectors/table/hive/hive_dialect.md
index a1c71fd..47384b5 100644
--- a/docs/content/docs/connectors/table/hive/hive_dialect.md
+++ b/docs/content/docs/connectors/table/hive/hive_dialect.md
@@ -300,8 +300,6 @@ CREATE VIEW [IF NOT EXISTS] view_name [(column_name, ...) ]
#### Alter
-**NOTE**: Altering view only works in Table API, but not supported via SQL client.
-
##### Rename
```sql
@@ -346,33 +344,90 @@ CREATE FUNCTION function_name AS class_name;
DROP FUNCTION [IF EXISTS] function_name;
```
-## DML
+## DML & DQL _`Beta`_
-### INSERT
+Hive dialect supports a commonly used subset of Hive's [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+and [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select). The following lists some examples of
+HiveQL supported by the Hive dialect.
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+In order to have better syntax and semantic compatibility, it's highly recommended to use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+and place it first in the module list, so that Hive built-in functions can be picked up during function resolution.
+
+Hive dialect no longer supports [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}). Please switch to the `default`
+dialect if you'd like to write in Flink syntax.
+
+The following is an example of using the Hive dialect to run some queries.
-The `partition_spec`, if present, can be either a full spec or a partial spec. If the `partition_spec` is partial,
-the dynamic partition column names can be omitted.
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
+
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
+
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
-## DQL
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
-At the moment, Hive dialect supports the same syntax as Flink SQL for DQLs. Refer to
-[Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}) for more details. And it's recommended to switch to
-`default` dialect to execute DQLs.
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+| 1 |
+| 2 |
+| 3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+| 1 | a |
+| 1 | a |
+| 5 | e |
+| 2 | b |
+| 3 | c |
+| 3 | c |
+| 3 | c |
+| 4 | d |
++-----+-------+
+8 rows in set
+```
## Notice
The following are some precautions for using the Hive dialect.
-- Hive dialect should only be used to manipulate Hive tables, not generic tables. And Hive dialect should be used together
-with a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect should only be used to process Hive meta objects, and requires the current catalog to be a
+[HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect only supports 2-part identifiers, so you can't specify a catalog for an identifier.
- While all Hive versions support the same syntax, whether a specific feature is available still depends on the
[Hive version]({{< ref "docs/connectors/table/hive/overview" >}}#supported-hive-versions) you use. For example, updating database
location is only supported in Hive-2.4.0 or later.
-- Hive and Calcite have different sets of reserved keywords. For example, `default` is a reserved keyword in Calcite and
-a non-reserved keyword in Hive. Even with Hive dialect, you have to quote such keywords with a backtick ( ` ) in order to
-use them as identifiers.
-- Due to expanded query incompatibility, views created in Flink cannot be queried in Hive.
+- Use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+to run DML and DQL.
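
The catalog and dialect requirements in the notes above can be illustrated with a short SQL-client sketch in the style of the session shown earlier (the catalog, database, and table names here are hypothetical, not taken from this commit):

```bash
Flink SQL> use catalog myhive;          -- the current catalog must be a HiveCatalog
Flink SQL> set table.sql-dialect=hive;  -- switch to the Hive dialect
Flink SQL> select * from mydb.orders;   -- OK: 2-part identifier (db.table)
Flink SQL> select * from myhive.mydb.orders; -- fails: a catalog name in the identifier is not supported
```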
diff --git a/docs/content/docs/connectors/table/hive/overview.md b/docs/content/docs/connectors/table/hive/overview.md
index dd3e21a..e8956ec 100644
--- a/docs/content/docs/connectors/table/hive/overview.md
+++ b/docs/content/docs/connectors/table/hive/overview.md
@@ -131,6 +131,9 @@ Please find the required dependencies for different Hive major versions below.
// Hive dependencies
hive-exec-2.3.4.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.0.0" >}}
@@ -150,6 +153,9 @@ Please find the required dependencies for different Hive major versions below.
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.1.0" >}}
@@ -169,6 +175,9 @@ Please find the required dependencies for different Hive major versions below.
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 1.2.1" >}}
@@ -188,6 +197,9 @@ Please find the required dependencies for different Hive major versions below.
orc-core-1.4.3-nohive.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.0.0" >}}
@@ -201,6 +213,9 @@ Please find the required dependencies for different Hive major versions below.
// Hive dependencies
hive-exec-2.0.0.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.1.0" >}}
@@ -214,6 +229,9 @@ Please find the required dependencies for different Hive major versions below.
// Hive dependencies
hive-exec-2.1.0.jar
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 2.2.0" >}}
@@ -231,6 +249,9 @@ Please find the required dependencies for different Hive major versions below.
orc-core-1.4.3.jar
aircompressor-0.8.jar // transitive dependency of orc-core
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< tab "Hive 3.1.0" >}}
@@ -245,6 +266,9 @@ Please find the required dependencies for different Hive major versions below.
hive-exec-3.1.0.jar
libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
+ // add antlr-runtime if you need to use hive dialect
+ antlr-runtime-3.5.2.jar
+
```
{{< /tab >}}
{{< /tabs >}}
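
Every dependency tab above gets the same addition: one extra jar, antlr-runtime-3.5.2.jar, placed alongside the other Hive dependencies in Flink's `lib/` directory before the Hive dialect is used. A minimal sketch of wiring that up, where `FLINK_HOME` and the local-Maven-repository path are assumptions about the reader's setup rather than part of this commit:

```bash
# Assumption: FLINK_HOME points at your Flink distribution; adjust paths to your setup.
FLINK_HOME=${FLINK_HOME:-/opt/flink}
V=3.5.2
# antlr-runtime is published to Maven Central; here we assume a local Maven repo copy.
cp "$HOME/.m2/repository/org/antlr/antlr-runtime/$V/antlr-runtime-$V.jar" "$FLINK_HOME/lib/"
# Restart the cluster and SQL client so the new jar is picked up.
```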