jordepic commented on code in PR #2441:
URL: https://github.com/apache/iceberg-rust/pull/2441#discussion_r3282278290


##########
dev/docker-compose.yaml:
##########
@@ -147,6 +147,50 @@ services:
       timeout: 5s
       retries: 5
 
+  # =============================================================================
+  # HDFS - single-node NameNode + DataNode for HDFS tests
+  # =============================================================================
+  # Mirrors apache/opendal's fixtures/hdfs/docker-compose-hdfs-cluster.yml:
+  # same bde2020 images, host networking on both services. Host networking
+  # is required because hdfs-native 0.13.5 connects to the DataNode by IP
+  # from `DatanodeIdProto.ip_addr` (not by hostname). On a docker bridge
+  # the DN would register with an unroutable bridge IP; host networking
+  # lets it bind directly on the host network namespace so the registered
+  # address is host-reachable.
+  #
+  # This works on Linux CI runners. On macOS / Windows Docker Desktop
+  # host networking has known issues (e.g. unresolvable VM hostname), so
+  # the HDFS integration tests are `#[ignore]`d; CI explicitly opts them
+  # in via `cargo nextest --run-ignored=only` (see .github/workflows/ci.yml).
+  hdfs-namenode:
+    image: bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8
+    network_mode: "host"
+    environment:
+      CLUSTER_NAME: iceberg-rust-test
+      CORE_CONF_fs_defaultFS: hdfs://localhost:8020
+      CORE_CONF_hadoop_http_staticuser_user: root
+      HDFS_CONF_dfs_permissions_enabled: false
+      HDFS_CONF_dfs_replication: 1
+    healthcheck:
+      test: ["CMD-SHELL", "hdfs dfsadmin -safemode get | grep -q OFF"]
+      interval: 5s
+      timeout: 5s
+      retries: 30
+      start_period: 30s
+
+  hdfs-datanode:
+    image: bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8

Review Comment:
   Fair point. We picked these because apache/opendal uses the exact same
   images and tags for its own services-hdfs-native integration tests
   (fixtures/hdfs/docker-compose-hdfs-cluster.yml in apache/opendal). The
   thinking was to mirror their HDFS fixture exactly, so that iceberg-rust's
   HDFS test infra moves in lockstep with the OpenDAL crate we depend on.
   
   apache/hadoop:3.5.0 would be more current, but it isn't a drop-in
   replacement. bde2020 ships an envtoconf.py helper that translates
   HDFS_CONF_* env vars into hdfs-site.xml properties at startup;
   apache/hadoop doesn't have an equivalent. To switch, we'd need to vendor
   static core-site.xml / hdfs-site.xml under dev/hdfs/, split the entrypoint
   into hdfs namenode / hdfs datanode commands, and add an
   ENSURE_NAMENODE_DIR bootstrap step. Not a problem if you'd prefer that!
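
   To make the gap concrete, here is a rough sketch of the kind of
   env-var-to-XML translation we'd be giving up. It is not the image's actual
   helper: the function names are hypothetical, and the rule that each
   underscore after the prefix becomes a dot is an assumption inferred from
   the variable names in this compose file.

```python
from xml.sax.saxutils import escape


def env_to_properties(env, prefix):
    """Collect variables starting with `prefix` as Hadoop property pairs."""
    props = {}
    for key, value in sorted(env.items()):
        if key.startswith(prefix):
            # Assumed rule: every underscore after the prefix becomes a dot.
            name = key[len(prefix):].replace("_", ".")
            props[name] = str(value)
    return props


def render_site_xml(props):
    """Render property pairs as a minimal Hadoop *-site.xml document."""
    lines = ["<configuration>"]
    for name, value in props.items():
        lines.append(
            f"  <property><name>{escape(name)}</name>"
            f"<value>{escape(value)}</value></property>"
        )
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Values copied from the compose file in this PR.
    example_env = {
        "HDFS_CONF_dfs_permissions_enabled": "false",
        "HDFS_CONF_dfs_replication": "1",
        "CORE_CONF_fs_defaultFS": "hdfs://localhost:8020",
    }
    print(render_site_xml(env_to_properties(example_env, "HDFS_CONF_")))
```

   With static XML files vendored under dev/hdfs/, this step disappears and
   the same properties would be written out by hand instead.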
   
   I was mainly looking to stay in line with OpenDAL, but I can change this
   one. The test is the same idea anyway; we're just changing how the image
   is set up.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

