This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 50fe4a678ab [feat](hive) add hive_parquet_use_column_names description. (#2287)
50fe4a678ab is described below

commit 50fe4a678aba4449d3560e41e0459f5af8341c0d
Author: daidai <[email protected]>
AuthorDate: Tue Apr 15 07:48:51 2025 +0800

    [feat](hive) add hive_parquet_use_column_names description. (#2287)
    
    ## Versions
    
    - [x] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0
    
    ## Languages
    
    - [x] Chinese
    - [ ] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
    
    ---------
    
    Co-authored-by: morningman <[email protected]>
---
 docs/faq/lakehouse-faq.md                          | 26 +++++++++++++---------
 docs/lakehouse/catalogs/hive-catalog.md            | 16 ++++++++++++-
 .../current/faq/lakehouse-faq.md                   | 20 +++++++++++------
 .../current/lakehouse/catalogs/hive-catalog.md     | 22 ++++++++++++++----
 .../version-2.1/faq/lakehouse-faq.md               | 20 +++++++++++------
 .../version-3.0/faq/lakehouse-faq.md               | 20 +++++++++++------
 versioned_docs/version-2.1/faq/lakehouse-faq.md    | 26 +++++++++++++---------
 versioned_docs/version-3.0/faq/lakehouse-faq.md    | 26 +++++++++++++---------
 8 files changed, 120 insertions(+), 56 deletions(-)

diff --git a/docs/faq/lakehouse-faq.md b/docs/faq/lakehouse-faq.md
index b62f2a28213..cf7544c14c1 100644
--- a/docs/faq/lakehouse-faq.md
+++ b/docs/faq/lakehouse-faq.md
@@ -126,17 +126,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. Error accessing Iceberg table via Hive Metastore: `failed to get schema` or `Storage schema reading not supported`
+1. Accessing Iceberg or Hive table through Hive Catalog reports an error: `failed to get schema` or `Storage schema reading not supported`
 
-   Place the relevant `iceberg` runtime jar files in Hive's lib/ directory.
-
-   Configure in `hive-site.xml`:
-
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
-
-   After configuration, restart the Hive Metastore.
+    You can try the following methods:
+    
+    * Put the `iceberg` runtime-related jar package in the lib/ directory of Hive.
+    
+    * Configure in `hive-site.xml`:
+    
+        ```
+        metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+        ```
+        
+        After the configuration is completed, you need to restart the Hive Metastore.
+    
+    * Add `"get_schema_from_table" = "true"` in the Catalog properties
+    
+        This parameter is supported since versions 2.1.10 and 3.0.6.
 
 2. Error connecting to Hive Catalog: `Caused by: java.lang.NullPointerException`
 
diff --git a/docs/lakehouse/catalogs/hive-catalog.md b/docs/lakehouse/catalogs/hive-catalog.md
index 78a000b9280..aaa6ae10cf7 100644
--- a/docs/lakehouse/catalogs/hive-catalog.md
+++ b/docs/lakehouse/catalogs/hive-catalog.md
@@ -48,7 +48,8 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
     'fs.defaultFS' = '<fs_defaultfs>', -- optional
     {MetaStoreProperties},
     {StorageProperties},
-    {CommonProperties}
+    {CommonProperties},
+    {OtherProperties}
 );
 ```
 
@@ -78,6 +79,12 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
 
   The CommonProperties section is for entering common attributes. Please see the "Common Properties" section in the [Catalog Overview](../catalog-overview.md).
 
+* `{OtherProperties}`
+
+  OtherProperties section is for entering properties related to Hive Catalog.
+
+  * `get_schema_from_table`: The default value is false. By default, Doris will obtain the table schema information from the Hive Metastore. However, in some cases, compatibility issues may occur, such as the error `Storage schema reading not supported`. In this case, you can set this parameter to true, and the table schema will be obtained directly from the Table object. But please note that this method will cause the default value information of the column to be ignored. This property i [...]
+
 ### Supported Hive Versions
 
 Supports Hive 1.x, 2.x, 3.x, and 4.x.
@@ -348,6 +355,13 @@ AS SELECT col1, pt1 AS col2, pt2 AS pt1 FROM test_ctas.part_ctas_src WHERE col1
 
 ### Related Parameters
 
+* Session variables
+
+| Parameter name | Default value | Description | Since version |
+| ----------| ---- | ---- | --- |
+| `hive_parquet_use_column_names` | `true` | When Doris reads the Parquet data type of the Hive table, it will find the column with the same name from the Parquet file to read the data according to the column name of the Hive table by default. When this variable is `false`, Doris will read data from the Parquet file according to the column order in the Hive table, regardless of the column name. Similar to the `parquet.column.index.access` variable in Hive. This parameter only applies to  [...]
+| `hive_orc_use_column_names` | `true` | Similar to `hive_parquet_use_column_names`, it is for the Hive table ORC data type. Similar to the `orc.force.positional.evolution` variable in Hive. | 2.1.6+, 3.0.3+ |
+
 * BE
 
   | Parameter Name                                                             
   | Default Value                                                              
                                       | Description |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
index f7c3ea2d82b..d867858a15e 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
@@ -128,17 +128,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. 通过 Hive Metastore 访问 Iceberg 表报错:`failed to get schema` 或 `Storage schema reading not supported`
+1. 通过 Hive Catalog 访问 Iceberg 或 Hive 表报错:`failed to get schema` 或 `Storage schema reading not supported`
 
-   在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
+   可以尝试以下方法:
 
-   在 `hive-site.xml` 配置:
+   * 在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
 
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
+   * 在 `hive-site.xml` 配置:
+
+       ```
+       metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+       ```
+
+       配置完成后需要重启 Hive Metastore。
+
+   * 在 Catalog 属性中添加 `"get_schema_from_table" = "true"`
 
-   配置完成后需要重启 Hive Metastore。
+       该参数自 2.1.10 和 3.0.6 版本支持。
 
 2. 连接 Hive Catalog 报错:`Caused by: java.lang.NullPointerException`
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
index b5be52478b8..723bbdea8c9 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
@@ -48,7 +48,8 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
     'fs.defaultFS' = '<fs_defaultfs>', -- optional
     {MetaStoreProperties},
     {StorageProperties},
-    {CommonProperties}
+    {CommonProperties},
+    {OtherProperties}
 );
 ```
 
@@ -80,6 +81,12 @@ CREATE CATALOG [IF NOT EXISTS] catalog_name PROPERTIES (
 
   CommonProperties 部分用于填写通用属性。请参阅[ 数据目录概述 ](../catalog-overview.md)中【通用属性】部分。
 
+* `{OtherProperties}`
+
+  OtherProperties 部分用于填写和 Hive Catalog 相关的其他参数。
+
+  * `get_schema_from_table`:默认为 false。默认情况下,Doris 会从 Hive Metastore 中获取表的 Schema 信息。但某些情况下可能出现兼容问题,如错误 `Storage schema reading not supported`。此时可以将这个参数设置为 true,则会从 Table 对象中直接获取表 Schema。但注意,该方式会导致列的默认值信息被忽略。该参数自 2.1.10 和 3.0.6 版本支持。
+
 ### 支持的 Hive 版本
 
 支持 Hive 1.x,2.x,3.x,4.x。
@@ -357,10 +364,17 @@ AS SELECT col1,pt1 as col2,pt2 as pt1 FROM test_ctas.part_ctas_src WHERE col1>0;
 
 ### 相关参数
 
-* BE
+* Session 变量
+
+  | 参数名称  | 默认值 | 描述  | 版本 |
+  | ----------| ---- | ---- | --- |
+  | `hive_parquet_use_column_names` | `true` | Doris 在读取 Hive 表 Parquet 数据类型时,默认会根据 Hive 表的列名从 Parquet 文件中找同名的列来读取数据。当该变量为 `false` 时,Doris 会根据 Hive 表中的列顺序从 Parquet 文件中读取数据,与列名无关。类似于 Hive 中的 `parquet.column.index.access` 变量。该参数只适用于顶层列名,对 Struct 内部无效。 | 2.1.6+, 3.0.3+ |
+  | `hive_orc_use_column_names`     | `true` | 与 `hive_parquet_use_column_names` 类似,针对的是 Hive 表 ORC 数据类型。类似于 Hive 中的 `orc.force.positional.evolution` 变量。  | 2.1.6+, 3.0.3+ |
+
+* BE 配置
 
-  | 参数名称                                                                       
   | 默认值                                                                        
                                                                                
                                                                                
                                     | 描述   |
-  | 
----------------------------------------------------------------------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 | ---- |
+  | 参数名称                                                                       
   | 描述                                                                         
                                                                                
                                                                                
                                     | 默认值 |
+  | 
----------------------------------------------------------------------------- | 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 | ----- |
   | `hive_sink_max_file_size`                                                  
   | 最大的数据文件大小。当写入数据量超过该大小后会关闭当前文件,滚动产生一个新文件继续写入。                               
                                                                                
                                                                                
                                     | 1GB  |
   | `table_sink_partition_write_max_partition_nums_per_writer`                 
   | BE 节点上每个 Instance 最大写入的分区数目。                                               
                                                                                
                                                                                
                                     | 128  |
   | `table_sink_non_partition_write_scaling_data_processed_threshold`          
   | 非分区表开始 scaling-write 的数据量阈值。每增加 
`table_sink_non_partition_write_scaling_data_processed_threshold` 数据就会发送给一个新的 
writer(instance) 进行写入。scaling-write 机制主要是为了根据数据量来使用不同数目的 writer(instance) 
来进行写入,会随着数据量的增加而增大写入的 writer(instance) 
数目,从而提高并发写入的吞吐。当数据量比较少的时候也会节省资源,并且尽可能地减少产生的文件数目。 | 25MB |
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/lakehouse-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/lakehouse-faq.md
index be95eed6b8b..d544ef42fd3 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/faq/lakehouse-faq.md
@@ -128,17 +128,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. 通过 Hive Metastore 访问 Iceberg 表报错:`failed to get schema` 或 `Storage schema reading not supported`
+1. 通过 Hive Catalog 访问 Iceberg 或 Hive 表报错:`failed to get schema` 或 `Storage schema reading not supported`
 
-   在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
+   可以尝试以下方法:
 
-   在 `hive-site.xml` 配置:
+   * 在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
 
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
+   * 在 `hive-site.xml` 配置:
+
+       ```
+       metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+       ```
+
+       配置完成后需要重启 Hive Metastore。
+
+   * 在 Catalog 属性中添加 `"get_schema_from_table" = "true"`
 
-   配置完成后需要重启 Hive Metastore。
+       该参数自 2.1.10 和 3.0.6 版本支持。
 
 2. 连接 Hive Catalog 报错:`Caused by: java.lang.NullPointerException`
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/lakehouse-faq.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/lakehouse-faq.md
index be95eed6b8b..d544ef42fd3 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/faq/lakehouse-faq.md
@@ -128,17 +128,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. 通过 Hive Metastore 访问 Iceberg 表报错:`failed to get schema` 或 `Storage schema reading not supported`
+1. 通过 Hive Catalog 访问 Iceberg 或 Hive 表报错:`failed to get schema` 或 `Storage schema reading not supported`
 
-   在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
+   可以尝试以下方法:
 
-   在 `hive-site.xml` 配置:
+   * 在 Hive 的 lib/ 目录放上 `iceberg` 运行时有关的 jar 包。
 
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
+   * 在 `hive-site.xml` 配置:
+
+       ```
+       metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+       ```
+
+       配置完成后需要重启 Hive Metastore。
+
+   * 在 Catalog 属性中添加 `"get_schema_from_table" = "true"`
 
-   配置完成后需要重启 Hive Metastore。
+       该参数自 2.1.10 和 3.0.6 版本支持。
 
 2. 连接 Hive Catalog 报错:`Caused by: java.lang.NullPointerException`
 
diff --git a/versioned_docs/version-2.1/faq/lakehouse-faq.md b/versioned_docs/version-2.1/faq/lakehouse-faq.md
index b62f2a28213..cf7544c14c1 100644
--- a/versioned_docs/version-2.1/faq/lakehouse-faq.md
+++ b/versioned_docs/version-2.1/faq/lakehouse-faq.md
@@ -126,17 +126,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. Error accessing Iceberg table via Hive Metastore: `failed to get schema` or `Storage schema reading not supported`
+1. Accessing Iceberg or Hive table through Hive Catalog reports an error: `failed to get schema` or `Storage schema reading not supported`
 
-   Place the relevant `iceberg` runtime jar files in Hive's lib/ directory.
-
-   Configure in `hive-site.xml`:
-
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
-
-   After configuration, restart the Hive Metastore.
+    You can try the following methods:
+    
+    * Put the `iceberg` runtime-related jar package in the lib/ directory of Hive.
+    
+    * Configure in `hive-site.xml`:
+    
+        ```
+        metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+        ```
+        
+        After the configuration is completed, you need to restart the Hive Metastore.
+    
+    * Add `"get_schema_from_table" = "true"` in the Catalog properties
+    
+        This parameter is supported since versions 2.1.10 and 3.0.6.
 
 2. Error connecting to Hive Catalog: `Caused by: java.lang.NullPointerException`
 
diff --git a/versioned_docs/version-3.0/faq/lakehouse-faq.md b/versioned_docs/version-3.0/faq/lakehouse-faq.md
index b62f2a28213..cf7544c14c1 100644
--- a/versioned_docs/version-3.0/faq/lakehouse-faq.md
+++ b/versioned_docs/version-3.0/faq/lakehouse-faq.md
@@ -126,17 +126,23 @@ ln -s /etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
 ## Hive Catalog
 
-1. Error accessing Iceberg table via Hive Metastore: `failed to get schema` or `Storage schema reading not supported`
+1. Accessing Iceberg or Hive table through Hive Catalog reports an error: `failed to get schema` or `Storage schema reading not supported`
 
-   Place the relevant `iceberg` runtime jar files in Hive's lib/ directory.
-
-   Configure in `hive-site.xml`:
-
-   ```
-   metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
-   ```
-
-   After configuration, restart the Hive Metastore.
+    You can try the following methods:
+    
+    * Put the `iceberg` runtime-related jar package in the lib/ directory of Hive.
+    
+    * Configure in `hive-site.xml`:
+    
+        ```
+        metastore.storage.schema.reader.impl=org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader
+        ```
+        
+        After the configuration is completed, you need to restart the Hive Metastore.
+    
+    * Add `"get_schema_from_table" = "true"` in the Catalog properties
+    
+        This parameter is supported since versions 2.1.10 and 3.0.6.
 
 2. Error connecting to Hive Catalog: `Caused by: java.lang.NullPointerException`
 
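
Note on `hive_parquet_use_column_names` and `hive_orc_use_column_names`: the descriptions in this commit contrast resolving file columns by name with resolving them by position in the Hive table. The Python sketch below illustrates only that distinction for top-level columns. It is not Doris's reader code; `resolve_columns` and the sample column lists are hypothetical names made up for the illustration, and exact, case-sensitive name comparison is a simplifying assumption.

```python
from typing import Dict, List, Optional


def resolve_columns(
    table_columns: List[str],
    file_columns: List[str],
    use_column_names: bool = True,
) -> Dict[str, Optional[int]]:
    """Map each top-level table column to an index in the file, or None if absent."""
    if use_column_names:
        # Name-based: look each table column up among the file's column names.
        index_by_name = {name: i for i, name in enumerate(file_columns)}
        return {col: index_by_name.get(col) for col in table_columns}
    # Position-based: the N-th table column reads the N-th file column.
    return {
        col: (pos if pos < len(file_columns) else None)
        for pos, col in enumerate(table_columns)
    }


# Hypothetical case: the table column was renamed from "name" to "full_name"
# after the data file was written.
table_cols = ["id", "full_name"]
file_cols = ["id", "name"]

by_name = resolve_columns(table_cols, file_cols, use_column_names=True)
by_position = resolve_columns(table_cols, file_cols, use_column_names=False)
print(by_name)      # {'id': 0, 'full_name': None}
print(by_position)  # {'id': 0, 'full_name': 1}
```

In this sketch the renamed column is unresolved under name matching but still found under positional matching. The reverse also holds: if the columns had been reordered rather than renamed, positional matching would read the wrong column while name matching would not.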

