This is an automated email from the ASF dual-hosted git repository.

leesf pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/incubator-hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
     new 18ce570  [MINOR] Update doc to include inc query on partitions (#1454)
18ce570 is described below

commit 18ce5708e073e80779f6dcc00d388b4cb0cc758a
Author: YanJia-Gary-Li <yanjia.gary...@gmail.com>
AuthorDate: Sat Mar 28 20:28:48 2020 -0700

    [MINOR] Update doc to include inc query on partitions (#1454)
---
 docs/_docs/0.5.2/2_3_querying_data.cn.md | 31 ++++++++++++++++++++++++++++++-
 docs/_docs/0.5.2/2_3_querying_data.md    |  3 ++-
 docs/_docs/2_3_querying_data.cn.md       | 31 ++++++++++++++++++++++++++++++-
 docs/_docs/2_3_querying_data.md          |  3 ++-
 4 files changed, 64 insertions(+), 4 deletions(-)

diff --git a/docs/_docs/0.5.2/2_3_querying_data.cn.md b/docs/_docs/0.5.2/2_3_querying_data.cn.md
index 74afcef..77ad2d7 100644
--- a/docs/_docs/0.5.2/2_3_querying_data.cn.md
+++ b/docs/_docs/0.5.2/2_3_querying_data.cn.md
@@ -25,6 +25,33 @@ language: cn
 and joined with other tables (datasets/dimensions) to [write deltas](/cn/docs/0.5.2-writing_data.html) out to the target Hudi dataset. The incremental view is realized by querying one of the tables above with a special configuration
 that instructs the query planner to fetch only the incremental data from the dataset.
 
+
+## Query engine support matrix
+
+The tables below show whether each query engine supports the Hudi format.
+
+### Read optimized tables
+
+|Query Engine|Real-time View|Incremental Pull|
+|------------|--------|-----------|
+|**Hive**|Y|Y|
+|**Spark SQL**|Y|Y|
+|**Spark Datasource**|Y|Y|
+|**Presto**|Y|N|
+|**Impala**|Y|N|
+
+
+### Real-time tables
+
+|Query Engine|Real-time View|Incremental Pull|Read Optimized Table|
+|------------|--------|-----------|--------------|
+|**Hive**|Y|Y|Y|
+|**Spark SQL**|Y|Y|Y|
+|**Spark Datasource**|N|N|Y|
+|**Presto**|N|N|Y|
+|**Impala**|N|N|Y|
+
+
 Next, we will discuss in detail how to access all three views on each query engine.
 
 ## Hive
@@ -128,7 +155,9 @@ scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'
         DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(),
        <beginInstantTime>)
-     .load(tablePath); // For incremental view, pass in the root/base path of dataset
+     .option(DataSourceReadOptions.INCR_PATH_GLOB_OPT_KEY(),
+       "/year=2020/month=*/day=*") // Optional, pull incrementally from the specified partitions only
+     .load(tablePath); // Pass in the root/base path of the dataset
 ```
 
 See the [settings](/cn/docs/0.5.2-configurations.html#spark-datasource) section to view all datasource options.
diff --git a/docs/_docs/0.5.2/2_3_querying_data.md b/docs/_docs/0.5.2/2_3_querying_data.md
index 0c28b12..9d17e72 100644
--- a/docs/_docs/0.5.2/2_3_querying_data.md
+++ b/docs/_docs/0.5.2/2_3_querying_data.md
@@ -55,7 +55,7 @@ Note that `Read Optimized` queries are not applicable for COPY_ON_WRITE tables.
 |**Spark SQL**|Y|Y|Y|
 |**Spark Datasource**|N|N|Y|
 |**Presto**|N|N|Y|
-|**Impala**|N|N|N|
+|**Impala**|N|N|Y|
 
 In the sections below, we will discuss the specific setup to access different query types from different query engines.
@@ -148,6 +148,7 @@ The following snippet shows how to obtain all records changed after `beginInstan
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.QUERY_TYPE_OPT_KEY(), DataSourceReadOptions.QUERY_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(), <beginInstantTime>)
+     .option(DataSourceReadOptions.INCR_PATH_GLOB_OPT_KEY(), "/year=2020/month=*/day=*") // Optional, use glob pattern if querying certain partitions
      .load(tablePath); // For incremental query, pass in the root/base path of table
 
 hudiIncQueryDF.createOrReplaceTempView("hudi_trips_incremental")
diff --git a/docs/_docs/2_3_querying_data.cn.md b/docs/_docs/2_3_querying_data.cn.md
index b2c4870..1fa91d1 100644
--- a/docs/_docs/2_3_querying_data.cn.md
+++ b/docs/_docs/2_3_querying_data.cn.md
@@ -24,6 +24,33 @@ language: cn
 and joined with other tables (datasets/dimensions) to [write deltas](/cn/docs/writing_data.html) out to the target Hudi dataset. The incremental view is realized by querying one of the tables above with a special configuration
 that instructs the query planner to fetch only the incremental data from the dataset.
 
+
+## Query engine support matrix
+
+The tables below show whether each query engine supports the Hudi format.
+
+### Read optimized tables
+
+|Query Engine|Real-time View|Incremental Pull|
+|------------|--------|-----------|
+|**Hive**|Y|Y|
+|**Spark SQL**|Y|Y|
+|**Spark Datasource**|Y|Y|
+|**Presto**|Y|N|
+|**Impala**|Y|N|
+
+
+### Real-time tables
+
+|Query Engine|Real-time View|Incremental Pull|Read Optimized Table|
+|------------|--------|-----------|--------------|
+|**Hive**|Y|Y|Y|
+|**Spark SQL**|Y|Y|Y|
+|**Spark Datasource**|N|N|Y|
+|**Presto**|N|N|Y|
+|**Impala**|N|N|Y|
+
+
 Next, we will discuss in detail how to access all three views on each query engine.
 
 ## Hive
@@ -127,7 +154,9 @@ scala> sqlContext.sql("select count(*) from hudi_rt where datestr = '2016-10-02'
         DataSourceReadOptions.VIEW_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(),
        <beginInstantTime>)
-     .load(tablePath); // For incremental view, pass in the root/base path of dataset
+     .option(DataSourceReadOptions.INCR_PATH_GLOB_OPT_KEY(),
+       "/year=2020/month=*/day=*") // Optional, pull incrementally from the specified partitions only
+     .load(tablePath); // Pass in the root/base path of the dataset
 ```
 
 See the [settings](/cn/docs/configurations.html#spark-datasource) section to view all datasource options.
diff --git a/docs/_docs/2_3_querying_data.md b/docs/_docs/2_3_querying_data.md
index 875b7f0..3e6a436 100644
--- a/docs/_docs/2_3_querying_data.md
+++ b/docs/_docs/2_3_querying_data.md
@@ -54,7 +54,7 @@ Note that `Read Optimized` queries are not applicable for COPY_ON_WRITE tables.
 |**Spark SQL**|Y|Y|Y|
 |**Spark Datasource**|N|N|Y|
 |**Presto**|N|N|Y|
-|**Impala**|N|N|N|
+|**Impala**|N|N|Y|
 
 In the sections below, we will discuss the specific setup to access different query types from different query engines.
@@ -147,6 +147,7 @@ The following snippet shows how to obtain all records changed after `beginInstan
      .format("org.apache.hudi")
      .option(DataSourceReadOptions.QUERY_TYPE_OPT_KEY(), DataSourceReadOptions.QUERY_TYPE_INCREMENTAL_OPT_VAL())
      .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY(), <beginInstantTime>)
+     .option(DataSourceReadOptions.INCR_PATH_GLOB_OPT_KEY(), "/year=2020/month=*/day=*") // Optional, use glob pattern if querying certain partitions
      .load(tablePath); // For incremental query, pass in the root/base path of table
 
 hudiIncQueryDF.createOrReplaceTempView("hudi_trips_incremental")
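
For readers of this thread, below is a minimal, self-contained sketch of the partition-scoped incremental pull these doc changes describe. It assumes Hudi 0.5.x with the Scala DataSource API, where the option keys are vals on `DataSourceReadOptions` (so they are referenced without parentheses in Scala, unlike the Java-style snippets in the docs); the table path, begin instant, and year/month/day partition layout are hypothetical placeholders.

```scala
// Sketch of an incremental query restricted to certain partitions.
// Assumes hudi-spark 0.5.x on the classpath; tablePath and beginInstantTime
// below are placeholders, not values from the docs.
import org.apache.hudi.DataSourceReadOptions
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("hudi-incremental-pull")
  .getOrCreate()

val tablePath = "file:///tmp/hudi_trips"  // hypothetical root/base path of the table
val beginInstantTime = "20200328000000"   // hypothetical commit instant; reads changes after it

val hudiIncQueryDF = spark.read
  .format("org.apache.hudi")
  // Switch the datasource into incremental query mode
  .option(DataSourceReadOptions.QUERY_TYPE_OPT_KEY, DataSourceReadOptions.QUERY_TYPE_INCREMENTAL_OPT_VAL)
  .option(DataSourceReadOptions.BEGIN_INSTANTTIME_OPT_KEY, beginInstantTime)
  // Optional: only scan partitions matching this glob under the base path
  .option(DataSourceReadOptions.INCR_PATH_GLOB_OPT_KEY, "/year=2020/month=*/day=*")
  .load(tablePath) // for incremental queries, pass the root/base path of the table

hudiIncQueryDF.createOrReplaceTempView("hudi_trips_incremental")
spark.sql("select count(*) from hudi_trips_incremental").show()
```

Without the glob option the incremental query scans every partition for changed records, so scoping it to known-affected partitions is purely a pruning optimization; the begin-instant filter behaves the same either way.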