This is an automated email from the ASF dual-hosted git repository.
xuanwo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opendal.git
The following commit(s) were added to refs/heads/main by this push:
new 60dcec96d1 docs: Add WebHDFS version compatibility details (#4024)
60dcec96d1 is described below
commit 60dcec96d11615b02df78e5befc8e65db1ead379
Author: Shubham Raizada <[email protected]>
AuthorDate: Fri Jan 19 13:39:38 2024 +0530
docs: Add WebHDFS version compatibility details (#4024)
add webhdfs version compatibility
---
core/src/services/webhdfs/docs.md | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/core/src/services/webhdfs/docs.md b/core/src/services/webhdfs/docs.md
index 497c46a9dc..c9e1610e07 100644
--- a/core/src/services/webhdfs/docs.md
+++ b/core/src/services/webhdfs/docs.md
@@ -23,12 +23,34 @@ This service can be used to:
[Hdfs][crate::services::Hdfs] is powered by HDFS's native java client. Users
need to set up the HDFS services correctly. But webhdfs can access from HTTP
API and no extra setup needed.
+## WebHDFS Compatibility Guidelines
+
+### File Creation and Write
+
+For [File creation and write](https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Create_and_Write_to_a_File) operations,
+OpenDAL WebHDFS is optimized for Hadoop Distributed File System (HDFS) versions 2.9 and later.
+This involves two API calls in webhdfs, where the initial `put` call to the namenode is redirected to the datanode handling the file data.
+The optional `noredirect` flag can be set to prevent redirection. If used, the API response body contains the datanode URL, which is then utilized for the subsequent `put` call with the actual file data.
+OpenDAL automatically sets the `noredirect` flag with the first `put` call. This feature is supported starting from HDFS version 2.9.
+
+### Multi-Write Support
+
+OpenDAL WebHDFS supports multi-write operations by creating temporary files in the specified `atomic_write_dir`.
+The final concatenation of these temporary files occurs when the writer is closed.
+However, it's essential to be aware of HDFS concat restrictions for earlier versions,
+where the target file must not be empty, and its last block must be full. Due to these constraints, the concat operation might fail for HDFS 2.6.
+This issue, identified as [HDFS-6641](https://issues.apache.org/jira/browse/HDFS-6641), has been addressed in later versions of HDFS.
+
+In summary, OpenDAL WebHDFS is designed for optimal compatibility with HDFS, specifically versions 2.9 and later.
+
+
+
## Configurations
- `root`: The root path of the WebHDFS service.
- `endpoint`: The endpoint of the WebHDFS service.
- `delegation`: The delegation token for WebHDFS.
-- `atomic_write_dir`: The tmp write dir of multi write for WebHDFS.
+- `atomic_write_dir`: The tmp write dir of multi write for WebHDFS.Needs to be configured for multi write support.
Refer to [`Builder`]'s public API docs for more information.
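
The two-step create flow described in the added "File Creation and Write" section can be illustrated with a small sketch. This is not OpenDAL's implementation; it only shows the shape of the WebHDFS REST exchange under stated assumptions. The host names, ports, file path, and the helper names `create_url` and `datanode_location` are hypothetical, and the sample response body is hand-written to mimic the `{"Location": ...}` JSON that the namenode returns when `noredirect=true` is set.

```python
import json
from urllib.parse import quote, urlencode


def create_url(endpoint: str, path: str, overwrite: bool = True) -> str:
    """Build the first-step CREATE URL for the namenode (hypothetical helper).

    Setting noredirect=true asks the namenode to return the datanode URL
    in the response body instead of answering with an HTTP redirect.
    """
    query = urlencode(
        {
            "op": "CREATE",
            "overwrite": "true" if overwrite else "false",
            "noredirect": "true",
        }
    )
    return f"{endpoint.rstrip('/')}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


def datanode_location(response_body: str) -> str:
    """Extract the datanode URL from the namenode's JSON response body."""
    return json.loads(response_body)["Location"]


# Step 1: URL for the empty `put` sent to the namenode (assumed endpoint).
first_put = create_url("http://namenode.example:9870", "/tmp/demo.txt")

# Hand-written sample of the namenode's reply; a real reply carries more
# query parameters chosen by the cluster.
sample_body = (
    '{"Location": "http://datanode.example:9864/webhdfs/v1/tmp/demo.txt'
    '?op=CREATE&overwrite=true"}'
)

# Step 2: the second `put`, carrying the file data, targets this URL.
second_put = datanode_location(sample_body)

print(first_put)
print(second_put)
```

A client would send an empty `PUT` to `first_put`, read the JSON body, and then send the file contents in a second `PUT` to `second_put`. On clusters older than HDFS 2.9 the `noredirect` flag is not supported, which is why the commit documents 2.9 as the baseline.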