This is an automated email from the ASF dual-hosted git repository.
morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
new 8d20e49921c [opt] add more lakehouse faq (#3484)
8d20e49921c is described below
commit 8d20e49921c1c65b0ed71da0411e8d4179cf7951
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Tue Mar 24 19:37:07 2026 -0700
[opt] add more lakehouse faq (#3484)
## Versions
- [x] dev
- [x] 4.x
- [x] 3.x
- [ ] 2.1
## Languages
- [x] Chinese
- [x] English
## Docs Checklist
- [ ] Checked by AI
- [ ] Test Cases Built
---
docs/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
.../current/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
.../version-3.x/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
.../version-4.x/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
versioned_docs/version-3.x/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
versioned_docs/version-4.x/faq/lakehouse-faq.md | 39 ++++++++++++++++++++++
6 files changed, 234 insertions(+)
diff --git a/docs/faq/lakehouse-faq.md b/docs/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/docs/faq/lakehouse-faq.md
+++ b/docs/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the session timezone is already set to `Asia/Shanghai` but the query
still fails, it indicates that the ORC file was generated with the timezone
`+08:00`. During query execution, this timezone is required when parsing the
ORC footer. In this case, you can try creating a symbolic link under the
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
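As a sketch of the symlink workaround (illustration only: a scratch directory stands in for `/usr/share/zoneinfo/`, which requires root to modify, and `Asia/Shanghai` is an assumed example of the equivalent timezone):

```python
import os
import tempfile

# Make "+08:00" resolve to an equivalent named zone. The scratch
# directory stands in for /usr/share/zoneinfo/; Asia/Shanghai is an
# assumed example of the equivalent timezone.
zoneinfo = tempfile.mkdtemp()
os.symlink("Asia/Shanghai", os.path.join(zoneinfo, "+08:00"))
print(os.readlink(os.path.join(zoneinfo, "+08:00")))  # Asia/Shanghai
```

On a real node the equivalent would be something like `ln -s /usr/share/zoneinfo/Asia/Shanghai "/usr/share/zoneinfo/+08:00"`, run as root on every FE and BE host.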
+14. When querying a Hive table that uses JSON SerDe (e.g.,
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema`
or `Storage schema reading not supported`
+
+ When a Hive table uses JSON format storage (ROW FORMAT SERDE is
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to
read the table's schema information through the default method, causing the
following error when querying from Doris:
+
+ ```
+ errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+ reason: org.apache.hadoop.hive.metastore.api.MetaException:
+ java.lang.UnsupportedOperationException: Storage schema reading not supported
+ ```
+
+ This can be resolved by adding `"get_schema_from_table" = "true"` in the
Catalog properties. This parameter instructs Doris to retrieve the schema
directly from the Hive table metadata instead of relying on the underlying
storage's Schema Reader.
+
+ ```sql
+ CREATE CATALOG hive PROPERTIES (
+ 'type' = 'hms',
+ 'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+ 'get_schema_from_table' = 'true'
+ );
+ ```
+
+ This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError:
xxx`, in versions prior to 1.2.1, Doris depends on Hadoop version 2.8. You need
to update to 2.10.2 or upgrade Doris to versions after 1.2.2.
@@ -322,6 +344,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`.
(Recommended)
- In `hdfs-site.xml`, find the corresponding configuration
`dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a
length of xxx exceeds maximum data length`
+
+ For example:
+
+ ```
+ RPC response has a length of 1213486160 exceeds maximum data length
+ ```
+
+ The value `1213486160` in hexadecimal is `0x48545450`, which corresponds
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+ The root cause is that the HDFS NameNode port configured in the Catalog or
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is
required. HDFS NameNode typically exposes two types of ports:
+
+ - **RPC port** (default: `8020` or `9000`): Used for HDFS client
communication (this is the correct port for Doris).
+ - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+ Check the HDFS NameNode port configuration in the Catalog properties or in
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC
port (`dfs.namenode.rpc-address`), not the HTTP port
(`dfs.namenode.http-address`).
+
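The hex-to-ASCII decoding described above can be checked directly; `1213486160` is the value from the error message, not an assumption:

```python
# The Hadoop RPC client reads the first 4 bytes of a response as a
# big-endian frame length. When an HTTP server answers on that port,
# those 4 bytes are the literal characters "HTTP" from "HTTP/1.1 ...".
length = 1213486160
print(hex(length))                                 # 0x48545450
print(length.to_bytes(4, "big").decode("ascii"))   # HTTP
```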
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs during BE reading
JindoFS data, add the domain name appearing in the logs to IP mapping in
`/etc/hosts`.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the `session` timezone is already `Asia/Shanghai` and the query still fails, the ORC file was generated with the timezone `+08:00`, which is needed when parsing the ORC `footer` at read time. You can try creating a symbolic link under the `/usr/share/zoneinfo/` directory to the equivalent timezone.
+14. When querying a Hive table that uses JSON SerDe (e.g., `org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is `org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to read the table's schema information through the default method, causing the following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` to the Catalog properties. This parameter makes Doris retrieve the schema directly from the Hive table metadata instead of relying on the underlying storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, an error occurs: `java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to the `fe/conf` and `be/conf` directories. (Recommended)
- In `hdfs-site.xml`, find the corresponding configuration `dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to connect to an HDFS NameNode RPC port but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or in `hdfs-site.xml` is incorrect: an HTTP port was used where an RPC port is required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): used for HDFS client communication (this is the port Doris should use).
+    - **HTTP port** (default: `9870` or `50070`): used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in `hdfs-site.xml` under `fe/conf` and `be/conf`, and make sure it is set to the RPC port (`dfs.namenode.rpc-address`), not the HTTP port (`dfs.namenode.http-address`).
+
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs while the BE reads JindoFS data, add the domain-name-to-IP mappings that appear in the logs to `/etc/hosts`.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.x/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the `session` timezone is already `Asia/Shanghai` and the query still fails, the ORC file was generated with the timezone `+08:00`, which is needed when parsing the ORC `footer` at read time. You can try creating a symbolic link under the `/usr/share/zoneinfo/` directory to the equivalent timezone.
+14. When querying a Hive table that uses JSON SerDe (e.g., `org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is `org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to read the table's schema information through the default method, causing the following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` to the Catalog properties. This parameter makes Doris retrieve the schema directly from the Hive table metadata instead of relying on the underlying storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, an error occurs: `java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to the `fe/conf` and `be/conf` directories. (Recommended)
- In `hdfs-site.xml`, find the corresponding configuration `dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to connect to an HDFS NameNode RPC port but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or in `hdfs-site.xml` is incorrect: an HTTP port was used where an RPC port is required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): used for HDFS client communication (this is the port Doris should use).
+    - **HTTP port** (default: `9870` or `50070`): used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in `hdfs-site.xml` under `fe/conf` and `be/conf`, and make sure it is set to the RPC port (`dfs.namenode.rpc-address`), not the HTTP port (`dfs.namenode.http-address`).
+
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs while the BE reads JindoFS data, add the domain-name-to-IP mappings that appear in the logs to `/etc/hosts`.
diff --git
a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
index 7370f2560d0..c2679e6dcd6 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-4.x/faq/lakehouse-faq.md
@@ -279,6 +279,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the `session` timezone is already `Asia/Shanghai` and the query still fails, the ORC file was generated with the timezone `+08:00`, which is needed when parsing the ORC `footer` at read time. You can try creating a symbolic link under the `/usr/share/zoneinfo/` directory to the equivalent timezone.
+14. When querying a Hive table that uses JSON SerDe (e.g., `org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema` or `Storage schema reading not supported`
+
+    When a Hive table uses JSON format storage (ROW FORMAT SERDE is `org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to read the table's schema information through the default method, causing the following error when querying from Doris:
+
+    ```
+    errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+    reason: org.apache.hadoop.hive.metastore.api.MetaException:
+    java.lang.UnsupportedOperationException: Storage schema reading not supported
+    ```
+
+    This can be resolved by adding `"get_schema_from_table" = "true"` to the Catalog properties. This parameter makes Doris retrieve the schema directly from the Hive table metadata instead of relying on the underlying storage's Schema Reader.
+
+    ```sql
+    CREATE CATALOG hive PROPERTIES (
+        'type' = 'hms',
+        'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+        'get_schema_from_table' = 'true'
+    );
+    ```
+
+    This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, an error occurs: `java.lang.VerifyError: xxx`
@@ -353,6 +375,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to the `fe/conf` and `be/conf` directories. (Recommended)
- In `hdfs-site.xml`, find the corresponding configuration `dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a length of xxx exceeds maximum data length`
+
+    For example:
+
+    ```
+    RPC response has a length of 1213486160 exceeds maximum data length
+    ```
+
+    The value `1213486160` in hexadecimal is `0x48545450`, which corresponds to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to connect to an HDFS NameNode RPC port but received an HTTP response instead.
+
+    The root cause is that the HDFS NameNode port configured in the Catalog or in `hdfs-site.xml` is incorrect: an HTTP port was used where an RPC port is required. HDFS NameNode typically exposes two types of ports:
+
+    - **RPC port** (default: `8020` or `9000`): used for HDFS client communication (this is the port Doris should use).
+    - **HTTP port** (default: `9870` or `50070`): used for the NameNode Web UI.
+
+    Check the HDFS NameNode port configuration in the Catalog properties or in `hdfs-site.xml` under `fe/conf` and `be/conf`, and make sure it is set to the RPC port (`dfs.namenode.rpc-address`), not the HTTP port (`dfs.namenode.http-address`).
+
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs while the BE reads JindoFS data, add the domain-name-to-IP mappings that appear in the logs to `/etc/hosts`.
diff --git a/versioned_docs/version-3.x/faq/lakehouse-faq.md
b/versioned_docs/version-3.x/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/versioned_docs/version-3.x/faq/lakehouse-faq.md
+++ b/versioned_docs/version-3.x/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the session timezone is already set to `Asia/Shanghai` but the query
still fails, it indicates that the ORC file was generated with the timezone
`+08:00`. During query execution, this timezone is required when parsing the
ORC footer. In this case, you can try creating a symbolic link under the
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
+14. When querying a Hive table that uses JSON SerDe (e.g.,
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema`
or `Storage schema reading not supported`
+
+ When a Hive table uses JSON format storage (ROW FORMAT SERDE is
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to
read the table's schema information through the default method, causing the
following error when querying from Doris:
+
+ ```
+ errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+ reason: org.apache.hadoop.hive.metastore.api.MetaException:
+ java.lang.UnsupportedOperationException: Storage schema reading not supported
+ ```
+
+ This can be resolved by adding `"get_schema_from_table" = "true"` in the
Catalog properties. This parameter instructs Doris to retrieve the schema
directly from the Hive table metadata instead of relying on the underlying
storage's Schema Reader.
+
+ ```sql
+ CREATE CATALOG hive PROPERTIES (
+ 'type' = 'hms',
+ 'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+ 'get_schema_from_table' = 'true'
+ );
+ ```
+
+ This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError:
xxx`, in versions prior to 1.2.1, Doris depends on Hadoop version 2.8. You need
to update to 2.10.2 or upgrade Doris to versions after 1.2.2.
@@ -322,6 +344,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`.
(Recommended)
- In `hdfs-site.xml`, find the corresponding configuration
`dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a
length of xxx exceeds maximum data length`
+
+ For example:
+
+ ```
+ RPC response has a length of 1213486160 exceeds maximum data length
+ ```
+
+ The value `1213486160` in hexadecimal is `0x48545450`, which corresponds
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+ The root cause is that the HDFS NameNode port configured in the Catalog or
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is
required. HDFS NameNode typically exposes two types of ports:
+
+ - **RPC port** (default: `8020` or `9000`): Used for HDFS client
communication (this is the correct port for Doris).
+ - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+ Check the HDFS NameNode port configuration in the Catalog properties or in
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC
port (`dfs.namenode.rpc-address`), not the HTTP port
(`dfs.namenode.http-address`).
+
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs during BE reading
JindoFS data, add the domain name appearing in the logs to IP mapping in
`/etc/hosts`.
diff --git a/versioned_docs/version-4.x/faq/lakehouse-faq.md
b/versioned_docs/version-4.x/faq/lakehouse-faq.md
index 6bcfc6b7715..a5103977fde 100644
--- a/versioned_docs/version-4.x/faq/lakehouse-faq.md
+++ b/versioned_docs/version-4.x/faq/lakehouse-faq.md
@@ -253,6 +253,28 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
If the session timezone is already set to `Asia/Shanghai` but the query
still fails, it indicates that the ORC file was generated with the timezone
`+08:00`. During query execution, this timezone is required when parsing the
ORC footer. In this case, you can try creating a symbolic link under the
`/usr/share/zoneinfo/` directory that points `+08:00` to an equivalent timezone.
+14. When querying a Hive table that uses JSON SerDe (e.g.,
`org.openx.data.jsonserde.JsonSerDe`), an error occurs: `failed to get schema`
or `Storage schema reading not supported`
+
+ When a Hive table uses JSON format storage (ROW FORMAT SERDE is
`org.openx.data.jsonserde.JsonSerDe`), the Hive Metastore may not be able to
read the table's schema information through the default method, causing the
following error when querying from Doris:
+
+ ```
+ errCode = 2, detailMessage = failed to get schema for table xxx in db xxx.
+ reason: org.apache.hadoop.hive.metastore.api.MetaException:
+ java.lang.UnsupportedOperationException: Storage schema reading not supported
+ ```
+
+ This can be resolved by adding `"get_schema_from_table" = "true"` in the
Catalog properties. This parameter instructs Doris to retrieve the schema
directly from the Hive table metadata instead of relying on the underlying
storage's Schema Reader.
+
+ ```sql
+ CREATE CATALOG hive PROPERTIES (
+ 'type' = 'hms',
+ 'hive.metastore.uris' = 'thrift://x.x.x.x:9083',
+ 'get_schema_from_table' = 'true'
+ );
+ ```
+
+ This parameter has been available since versions 2.1.10 and 3.0.6.
+
## HDFS
1. When accessing HDFS 3.x, if you encounter the error `java.lang.VerifyError:
xxx`, in versions prior to 1.2.1, Doris depends on Hadoop version 2.8. You need
to update to 2.10.2 or upgrade Doris to versions after 1.2.2.
@@ -322,6 +344,23 @@ ln -s
/etc/pki/ca-trust/extracted/openssl/ca-bundle.trust.crt /etc/ssl/certs/ca-
- Copy `hdfs-site.xml` and `core-site.xml` to `fe/conf` and `be/conf`.
(Recommended)
- In `hdfs-site.xml`, find the corresponding configuration
`dfs.data.transfer.protection` and set this parameter in the catalog.
+5. When querying a Hive Catalog table, an error occurs: `RPC response has a
length of xxx exceeds maximum data length`
+
+ For example:
+
+ ```
+ RPC response has a length of 1213486160 exceeds maximum data length
+ ```
+
+ The value `1213486160` in hexadecimal is `0x48545450`, which corresponds
to the ASCII string `"HTTP"`. This indicates that the Doris FE attempted to
connect to an HDFS NameNode RPC port, but received an HTTP response instead.
+
+ The root cause is that the HDFS NameNode port configured in the Catalog or
in `hdfs-site.xml` is incorrect — an HTTP port was used where an RPC port is
required. HDFS NameNode typically exposes two types of ports:
+
+ - **RPC port** (default: `8020` or `9000`): Used for HDFS client
communication (this is the correct port for Doris).
+ - **HTTP port** (default: `9870` or `50070`): Used for the NameNode Web UI.
+
+ Check the HDFS NameNode port configuration in the Catalog properties or in
`hdfs-site.xml` under `fe/conf` and `be/conf`, and ensure it is set to the RPC
port (`dfs.namenode.rpc-address`), not the HTTP port
(`dfs.namenode.http-address`).
+
## DLF Catalog
1. When using the DLF Catalog, if `Invalid address` occurs during BE reading
JindoFS data, add the domain name appearing in the logs to IP mapping in
`/etc/hosts`.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]