This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 8d20e49921c [opt] add more lakehouse faq (#3484)
8d20e49921c is described below

commit 8d20e49921c1c65b0ed71da0411e8d4179cf7951
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Tue Mar 24 19:37:07 2026 -0700

    [opt] add more lakehouse faq (#3484)
    
    ## Versions
    
    - [x] dev
    - [x] 4.x
    - [x] 3.x
    - [ ] 2.1
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 docs/faq/lakehouse-faq.md                          | 39 ++++++++++++++++++++++
 .../current/faq/lakehouse-faq.md                   | 39 ++++++++++++++++++++++
 .../version-3.x/faq/lakehouse-faq.md               | 39 ++++++++++++++++++++++
 .../version-4.x/faq/lakehouse-faq.md               | 39 ++++++++++++++++++++++
 versioned_docs/version-3.x/faq/lakehouse-faq.md    | 39 ++++++++++++++++++++++
 versioned_docs/version-4.x/faq/lakehouse-faq.md    | 39 ++++++++++++++++++++++
 6 files changed, 234 insertions(+)

diff --git a/docs/faq/lakehouse-faq.md b/docs/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/docs/faq/lakehouse-faq.md
+++ b/docs/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
    If the session timezone is already set to `Asia/Shanghai` but the query 
still fails, it indicates that the ORC file was generated with the timezone 
`+08:00`. During query execution, this timezone is required when parsing the 
ORC footer. In this case, you can try creating a symbolic link under the 
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
 
+14. When querying a Hive table that uses JSON SerDe (e.g., 
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` 
or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is 
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to 
read the table's schema information through the default method, causing the 
following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` in the 
Catalog properties. This parameter instructs Doris to retrieve the schema 
directly from the Hive table metadata instead of relying on the underlying 
storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter is available since Doris 2.1.10 and 3.0.6.
+
 ## HDFS
 
 1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError: xxx`: Doris versions prior to 1.2.1 depend on Hadoop 2.8. Update Hadoop to 2.10.2, or upgrade Doris to a version after 1.2.2.
@@ -322,6 +344,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`. 
(Recommended)
         - In `hdfs-site.xml`, find the corresponding configuration 
`dfs.data.transfer.protection` and set this parameter in the catalog.
 
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a 
length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds 
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to 
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or 
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is 
required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): Used for HDFS client 
communication (this is the correct port for Doris).
+    - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in 
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC 
port (`dfs.namenode.rpc-address`), not the HTTP port 
(`dfs.namenode.http-address`).
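As a quick sanity check (an illustrative snippet, not part of the FAQ itself), the hex/ASCII reading of the reported length value can be verified with a few lines of Python:

```python
# The "length" reported in the error message is actually the first
# 4 bytes of an HTTP response, misread as a big-endian RPC length field.
value = 1213486160

# Interpret the integer as 4 big-endian bytes and decode as ASCII.
raw = value.to_bytes(4, byteorder="big")

print(hex(value))           # 0x48545450
print(raw.decode("ascii"))  # HTTP
```

Any length value that decodes to printable ASCII such as `HTTP` is a strong hint that a non-RPC service answered on the configured port.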
+
 ## DLF Catalog
 
 1. When using the DLF Catalog, if the BE reports `Invalid address` while reading JindoFS data, add a mapping from the domain name that appears in the logs to its IP address in `/etc/hosts`.
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
     如果 `session` 时区已经是 `Asia/Shanghai`,且查询仍然报错,说明生成 ORC 文件时的时区是 `+08:00`, 
导致在读取时解析 `footer` 时需要用到 `+08:00` 时区,可以尝试在 `/usr/share/zoneinfo/` 目录下面软链到相同时区上。
 
+14. 查询使用 JSON SerDe(如 `org.openx.data.jsonserde.JsonSerDe`)的 Hive 
表时,报错:`failed to get schema` 或 `Storage schema reading not supported`
+
+    当 Hive 表使用 JSON 格式存储(ROW FORMAT SERDE 为 
`org.openx.data.jsonserde.JsonSerDe`)时,Hive Metastore 可能无法通过默认方式读取表的 Schema 
信息,导致 Doris 查询时报错:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    可以在 Catalog 属性中添加 `"get_schema_from_table" = "true"` 解决,该参数会让 Doris 直接从 
Hive 表的元数据中获取 Schema,而不依赖底层存储的 Schema Reader。
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    该参数自 2.1.10 和 3.0.6 版本支持。
+
 ## HDFS
 
 1. 访问 HDFS 3.x 时报错:`java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - 拷贝 `hdfs-site.xml` 以及 `core-site.xml` 到 `fe/conf` 和 `be/conf` 目录。(推荐)
         - 在 `hdfs-site.xml` 找到相应的配置 `dfs.data.transfer.protection`,并且在 catalog 
里面设置该参数。
 
+5. 查询 Hive Catalog 表时报错:`RPC response has a length of xxx exceeds maximum data 
length`
+
+    例如:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    其中 `1213486160` 转换为十六进制为 `0x48545450`,对应 ASCII 字符串 `"HTTP"`。这说明 Doris FE 
尝试连接 HDFS NameNode 的 RPC 端口时,实际收到了 HTTP 响应。
+
+    根本原因是 Catalog 中或 `hdfs-site.xml` 中配置的 HDFS NameNode 端口不正确——错误地使用了 HTTP 
端口而非 RPC 端口。HDFS NameNode 通常暴露两种端口:
+
+    - **RPC 端口**(默认:`8020` 或 `9000`):用于 HDFS 客户端通信(Doris 应使用此端口)。
+    - **HTTP 端口**(默认:`9870` 或 `50070`):用于 NameNode Web UI。
+
+    请检查 Catalog 属性或 `fe/conf`、`be/conf` 下 `hdfs-site.xml` 中的 HDFS NameNode 
端口配置,确保使用的是 RPC 端口(`dfs.namenode.rpc-address`),而非 HTTP 
端口(`dfs.namenode.http-address`)。
+
 ## DLF Catalog
 
 1. 使用 DLF Catalog 时,BE 在读取 JindoFS 数据时出现 `Invalid address`,需要在 `/etc/hosts` 中添加日志中出现的域名到 IP 的映射。
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
     如果 `session` 时区已经是 `Asia/Shanghai`,且查询仍然报错,说明生成 ORC 文件时的时区是 `+08:00`, 
导致在读取时解析 `footer` 时需要用到 `+08:00` 时区,可以尝试在 `/usr/share/zoneinfo/` 目录下面软链到相同时区上。
 
+14. 查询使用 JSON SerDe(如 `org.openx.data.jsonserde.JsonSerDe`)的 Hive 
表时,报错:`failed to get schema` 或 `Storage schema reading not supported`
+
+    当 Hive 表使用 JSON 格式存储(ROW FORMAT SERDE 为 
`org.openx.data.jsonserde.JsonSerDe`)时,Hive Metastore 可能无法通过默认方式读取表的 Schema 
信息,导致 Doris 查询时报错:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    可以在 Catalog 属性中添加 `"get_schema_from_table" = "true"` 解决,该参数会让 Doris 直接从 
Hive 表的元数据中获取 Schema,而不依赖底层存储的 Schema Reader。
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    该参数自 2.1.10 和 3.0.6 版本支持。
+
 ## HDFS
 
 1. 访问 HDFS 3.x 时报错:`java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - 拷贝 `hdfs-site.xml` 以及 `core-site.xml` 到 `fe/conf` 和 `be/conf` 目录。(推荐)
         - 在 `hdfs-site.xml` 找到相应的配置 `dfs.data.transfer.protection`,并且在 catalog 
里面设置该参数。
 
+5. 查询 Hive Catalog 表时报错:`RPC response has a length of xxx exceeds maximum data 
length`
+
+    例如:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    其中 `1213486160` 转换为十六进制为 `0x48545450`,对应 ASCII 字符串 `"HTTP"`。这说明 Doris FE 
尝试连接 HDFS NameNode 的 RPC 端口时,实际收到了 HTTP 响应。
+
+    根本原因是 Catalog 中或 `hdfs-site.xml` 中配置的 HDFS NameNode 端口不正确——错误地使用了 HTTP 
端口而非 RPC 端口。HDFS NameNode 通常暴露两种端口:
+
+    - **RPC 端口**(默认:`8020` 或 `9000`):用于 HDFS 客户端通信(Doris 应使用此端口)。
+    - **HTTP 端口**(默认:`9870` 或 `50070`):用于 NameNode Web UI。
+
+    请检查 Catalog 属性或 `fe/conf`、`be/conf` 下 `hdfs-site.xml` 中的 HDFS NameNode 
端口配置,确保使用的是 RPC 端口(`dfs.namenode.rpc-address`),而非 HTTP 
端口(`dfs.namenode.http-address`)。
+
 ## DLF Catalog
 
 1. 使用 DLF Catalog 时,BE 在读取 JindoFS 数据时出现 `Invalid address`,需要在 `/etc/hosts` 中添加日志中出现的域名到 IP 的映射。
diff --git 
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md 
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
     如果 `session` 时区已经是 `Asia/Shanghai`,且查询仍然报错,说明生成 ORC 文件时的时区是 `+08:00`, 
导致在读取时解析 `footer` 时需要用到 `+08:00` 时区,可以尝试在 `/usr/share/zoneinfo/` 目录下面软链到相同时区上。
 
+14. 查询使用 JSON SerDe(如 `org.openx.data.jsonserde.JsonSerDe`)的 Hive 
表时,报错:`failed to get schema` 或 `Storage schema reading not supported`
+
+    当 Hive 表使用 JSON 格式存储(ROW FORMAT SERDE 为 
`org.openx.data.jsonserde.JsonSerDe`)时,Hive Metastore 可能无法通过默认方式读取表的 Schema 
信息,导致 Doris 查询时报错:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    可以在 Catalog 属性中添加 `"get_schema_from_table" = "true"` 解决,该参数会让 Doris 直接从 
Hive 表的元数据中获取 Schema,而不依赖底层存储的 Schema Reader。
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    该参数自 2.1.10 和 3.0.6 版本支持。
+
 ## HDFS
 
 1. 访问 HDFS 3.x 时报错:`java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - 拷贝 `hdfs-site.xml` 以及 `core-site.xml` 到 `fe/conf` 和 `be/conf` 目录。(推荐)
         - 在 `hdfs-site.xml` 找到相应的配置 `dfs.data.transfer.protection`,并且在 catalog 
里面设置该参数。
 
+5. 查询 Hive Catalog 表时报错:`RPC response has a length of xxx exceeds maximum data 
length`
+
+    例如:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    其中 `1213486160` 转换为十六进制为 `0x48545450`,对应 ASCII 字符串 `"HTTP"`。这说明 Doris FE 
尝试连接 HDFS NameNode 的 RPC 端口时,实际收到了 HTTP 响应。
+
+    根本原因是 Catalog 中或 `hdfs-site.xml` 中配置的 HDFS NameNode 端口不正确——错误地使用了 HTTP 
端口而非 RPC 端口。HDFS NameNode 通常暴露两种端口:
+
+    - **RPC 端口**(默认:`8020` 或 `9000`):用于 HDFS 客户端通信(Doris 应使用此端口)。
+    - **HTTP 端口**(默认:`9870` 或 `50070`):用于 NameNode Web UI。
+
+    请检查 Catalog 属性或 `fe/conf`、`be/conf` 下 `hdfs-site.xml` 中的 HDFS NameNode 
端口配置,确保使用的是 RPC 端口(`dfs.namenode.rpc-address`),而非 HTTP 
端口(`dfs.namenode.http-address`)。
+
 ## DLF Catalog
 
 1. 使用 DLF Catalog 时,BE 在读取 JindoFS 数据时出现 `Invalid address`,需要在 `/etc/hosts` 中添加日志中出现的域名到 IP 的映射。
diff --git a/versioned_docs/version-3.x/faq/lakehouse-faq.md 
b/versioned_docs/version-3.x/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/versioned_docs/version-3.x/faq/lakehouse-faq.md
+++ b/versioned_docs/version-3.x/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
    If the session timezone is already set to `Asia/Shanghai` but the query 
still fails, it indicates that the ORC file was generated with the timezone 
`+08:00`. During query execution, this timezone is required when parsing the 
ORC footer. In this case, you can try creating a symbolic link under the 
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
 
+14. When querying a Hive table that uses JSON SerDe (e.g., 
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` 
or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is 
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to 
read the table's schema information through the default method, causing the 
following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` in the 
Catalog properties. This parameter instructs Doris to retrieve the schema 
directly from the Hive table metadata instead of relying on the underlying 
storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter is available since Doris 2.1.10 and 3.0.6.
+
 ## HDFS
 
 1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError: xxx`: Doris versions prior to 1.2.1 depend on Hadoop 2.8. Update Hadoop to 2.10.2, or upgrade Doris to a version after 1.2.2.
@@ -322,6 +344,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`. 
(Recommended)
         - In `hdfs-site.xml`, find the corresponding configuration 
`dfs.data.transfer.protection` and set this parameter in the catalog.
 
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a 
length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds 
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to 
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or 
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is 
required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): Used for HDFS client 
communication (this is the correct port for Doris).
+    - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in 
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC 
port (`dfs.namenode.rpc-address`), not the HTTP port 
(`dfs.namenode.http-address`).
+
 ## DLF Catalog
 
 1. When using the DLF Catalog, if the BE reports `Invalid address` while reading JindoFS data, add a mapping from the domain name that appears in the logs to its IP address in `/etc/hosts`.
diff --git a/versioned_docs/version-4.x/faq/lakehouse-faq.md 
b/versioned_docs/version-4.x/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/versioned_docs/version-4.x/faq/lakehouse-faq.md
+++ b/versioned_docs/version-4.x/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
 
    If the session timezone is already set to `Asia/Shanghai` but the query 
still fails, it indicates that the ORC file was generated with the timezone 
`+08:00`. During query execution, this timezone is required when parsing the 
ORC footer. In this case, you can try creating a symbolic link under the 
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
 
+14. When querying a Hive table that uses JSON SerDe (e.g., 
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` 
or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is 
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to 
read the table's schema information through the default method, causing the 
following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not 
supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` in the 
Catalog properties. This parameter instructs Doris to retrieve the schema 
directly from the Hive table metadata instead of relying on the underlying 
storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter is available since Doris 2.1.10 and 3.0.6.
+
 ## HDFS
 
 1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError: xxx`: Doris versions prior to 1.2.1 depend on Hadoop 2.8. Update Hadoop to 2.10.2, or upgrade Doris to a version after 1.2.2.
@@ -322,6 +344,23 @@ ln -s 
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
         - Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`. 
(Recommended)
         - In `hdfs-site.xml`, find the corresponding configuration 
`dfs.data.transfer.protection` and set this parameter in the catalog.
 
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a 
length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds 
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to 
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or 
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is 
required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): Used for HDFS client 
communication (this is the correct port for Doris).
+    - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in 
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC 
port (`dfs.namenode.rpc-address`), not the HTTP port 
(`dfs.namenode.http-address`).
+
 ## DLF Catalog
 
 1. When using the DLF Catalog, if the BE reports `Invalid address` while reading JindoFS data, add a mapping from the domain name that appears in the logs to its IP address in `/etc/hosts`.

