This is an automated email from the ASF dual-hosted git repository.

apitrou pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 71293212f2 GH-47560: [C++] Fix host handling for default HDFS URI (#47458)
71293212f2 is described below

commit 71293212f2006d6b224fa7d0658c1f6d51689b83
Author: Diego Sevilla Ruiz <[email protected]>
AuthorDate: Tue Sep 23 18:52:30 2025 +0200

    GH-47560: [C++] Fix host handling for default HDFS URI (#47458)
    
    ### Rationale for this change
    
    In #25324 a fix was introduced for the Python HadoopFileSystem, but it does not work with `from_uri()`, because the URI is passed to the underlying C++ implementation for option parsing. The "default" case is not handled there as it is in Python: the whole "hdfs://default" string is passed to the underlying HDFS library, which expects the bare string "default" in order to look up the default filesystem in `$HADOOP_CONF_DIR/core-site.xml`.
    
    ### What changes are included in this PR?
    
    Handle the special "default" HDFS URIs (e.g. `hdfs://default:xxx`) when they are passed to `HadoopFileSystem.from_uri()` or `FileSystem.from_uri()`.
    
    ### Are these changes tested?
    
    There are no specific tests for this feature, but existing HDFS CI jobs pass.
    
    ### Are there any user-facing changes?
    
    Not exactly, but `from_uri()` now behaves as the documentation describes.
    
    * GitHub Issue: #47560
    
    Lead-authored-by: Diego Sevilla Ruiz <[email protected]>
    Co-authored-by: Antoine Pitrou <[email protected]>
    Signed-off-by: Antoine Pitrou <[email protected]>
---
 cpp/src/arrow/filesystem/hdfs.cc | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/cpp/src/arrow/filesystem/hdfs.cc b/cpp/src/arrow/filesystem/hdfs.cc
index d59b2a342d..adb8b0d50d 100644
--- a/cpp/src/arrow/filesystem/hdfs.cc
+++ b/cpp/src/arrow/filesystem/hdfs.cc
@@ -363,8 +363,14 @@ Result<HdfsOptions> HdfsOptions::FromUri(const Uri& uri) {
     options_map.emplace(kv.first, kv.second);
   }
 
+  // Special case host = "default" or "hdfs://default" as stated by GH-47560.
+  // If given the string "default", libhdfs selects the default filesystem
+  // from `core-site.xml`.
   std::string host;
-  host = uri.scheme() + "://" + uri.host();
+  if (uri.host() == "default")
+    host = uri.host();
+  else
+    host = uri.scheme() + "://" + uri.host();
 
   // configure endpoint
   const auto port = uri.port();
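
The host selection in the hunk above can be sketched in isolation as follows. This is only an illustration under stated assumptions: `SelectHdfsHost` is a hypothetical helper that does not exist in Arrow, and it takes plain strings where the real code reads the scheme and host from the `Uri` object.

```cpp
#include <string>

// Hypothetical helper (not part of Arrow): mirrors the branch added to
// HdfsOptions::FromUri, with plain strings standing in for the Uri accessors.
std::string SelectHdfsHost(const std::string& scheme, const std::string& host) {
  // libhdfs treats the bare string "default" as a request to pick the default
  // filesystem from core-site.xml, so the scheme must not be prepended.
  if (host == "default") {
    return host;
  }
  // Any other host keeps the previous behaviour: "<scheme>://<host>".
  return scheme + "://" + host;
}
```

Under these assumptions, `SelectHdfsHost("hdfs", "default")` yields `default`, while `SelectHdfsHost("hdfs", "namenode")` yields `hdfs://namenode`, which matches the behaviour before this commit for non-default hosts.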
