Hi everyone,

I took the existing XTable Docker demo and moved the examples and configuration to S3 + HMS.

You can see the completed example at https://github.com/alberttwong/incubator-xtable/tree/main/demo-s3 while it goes through PR review at https://github.com/apache/incubator-xtable/pull/459

More details:

What is the purpose of the pull request?

Using the XTable Docker demo as the base, modify it so that it works with S3, as an end-to-end example with a README doc.

Brief change log

1. Added a MinIO container image to provide an object store (a compose fragment is sketched below).
2. Changed the HMS image to the Starburst HMS image, since Starburst builds the S3 libraries into its image.
3. Built a custom Spark 3.4 container image based on JDK 11, with Hadoop 2.10.2 and Hive 2.3.10 installed (Hive 2.3.1 can't be used due to a bug in that release). It's available at https://hub.docker.com/r/atwong/openjdk-11-spark-3.4-hive-2.3.10-hadoop-2.10.2 if you don't want to build it yourself.
4. Cloned Hudi and compiled it with Maven under JDK 8 to get the hudi-hive-sync jars (you can skip this step by pulling hudi-hive-sync-bundle from mvnrepository.com); see the build sketch below.
5. Added the libraries that run_sync_tool.sh is missing (a fetch sketch is below): https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3, https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws, https://mvnrepository.com/artifact/com.esotericsoftware/kryo-shaded/4.0.2, https://mvnrepository.com/artifact/org.apache.parquet/parquet-avro, https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-client
6. Modified the Iceberg, Hudi, and Delta Trino catalog configurations to support S3 bucket lookups (an example catalog file is sketched below).
7. Added a core-site.xml to inject parameters into XTable, and modified /etc/hadoop/core-site.xml to inject parameters into the hudi-hive-sync tool (a fragment is sketched below).
8. Modified the PySpark demo script to include the S3 configs (sketched below).
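Since several of these steps are easier to picture with examples, rough sketches follow. Treat them as illustrations under stated assumptions, not the exact files in the PR. First, the shape of the compose change for items 1 and 2; the service names, image tag, ports, and credentials here are all placeholders:

    # Illustrative only: writes a compose override with a MinIO object store
    # and the Starburst HMS image (the tag is an assumption, check Docker Hub).
    cat > docker-compose.override.yml <<'EOF'
    services:
      minio:
        image: minio/minio
        command: server /data --console-address ":9001"
        environment:
          MINIO_ROOT_USER: admin
          MINIO_ROOT_PASSWORD: password
        ports:
          - "9000:9000"   # S3 API
          - "9001:9001"   # web console
      hive-metastore:
        image: starburstdata/hive:3.1.2-e.18
        ports:
          - "9083:9083"   # Thrift metastore
    EOF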
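Item 4's build step looks roughly like this; the JAVA_HOME path is an assumption for wherever JDK 8 lives on your machine, and output paths vary by Hudi version:

    # Build Hudi under JDK 8 so the hive-sync jars match what HMS expects.
    git clone https://github.com/apache/hudi.git
    cd hudi
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # assumption: point at your JDK 8
    mvn clean package -DskipTests                        # extra profiles may be needed per Hudi version
    # hive-sync jars land under hudi-sync/hudi-hive-sync/target/ and
    # packaging/hudi-hive-sync-bundle/target/ (paths vary by Hudi version)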
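For item 5, one way to pull the missing jars without hand-downloading each one. The versions below are assumptions; match them to the Hadoop/Hive versions in your image:

    # Fetch the jars run_sync_tool.sh needs into the local Maven repo.
    for gav in com.amazonaws:aws-java-sdk-s3:1.12.262 \
               org.apache.hadoop:hadoop-aws:2.10.2 \
               com.esotericsoftware:kryo-shaded:4.0.2 \
               org.apache.parquet:parquet-avro:1.12.2 \
               org.apache.hadoop:hadoop-client:2.10.2; do
      mvn dependency:get -Dartifact="$gav"
    done
    # The jars end up under ~/.m2/repository/; copy them onto the classpath
    # that run_sync_tool.sh assembles.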
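For item 6, the Trino catalog change is a few hive.s3.* lines per catalog (shown here for Iceberg; the Hudi and Delta catalogs get the same S3 settings). The metastore host and MinIO credentials are placeholders:

    # Illustrative etc/catalog/iceberg.properties with placeholder values.
    cat > etc/catalog/iceberg.properties <<'EOF'
    connector.name=iceberg
    hive.metastore.uri=thrift://hive-metastore:9083
    hive.s3.endpoint=http://minio:9000
    hive.s3.aws-access-key=admin
    hive.s3.aws-secret-key=password
    hive.s3.path-style-access=true
    EOF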
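Item 7's injection is just the standard s3a keys in core-site.xml; this fragment is a sketch with placeholder endpoint and credentials:

    # Illustrative core-site.xml so XTable and hudi-hive-sync can reach MinIO.
    cat > core-site.xml <<'EOF'
    <configuration>
      <property><name>fs.s3a.endpoint</name><value>http://minio:9000</value></property>
      <property><name>fs.s3a.access.key</name><value>admin</value></property>
      <property><name>fs.s3a.secret.key</name><value>password</value></property>
      <property><name>fs.s3a.path.style.access</name><value>true</value></property>
      <property><name>fs.s3a.impl</name><value>org.apache.hadoop.fs.s3a.S3AFileSystem</value></property>
    </configuration>
    EOF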
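Finally, item 8: the demo script's Spark session has to carry the same s3a settings; equivalently, they can be passed when launching pyspark. Placeholder values again:

    # Same s3a settings, supplied at launch time (placeholder credentials).
    pyspark \
      --conf spark.hadoop.fs.s3a.endpoint=http://minio:9000 \
      --conf spark.hadoop.fs.s3a.access.key=admin \
      --conf spark.hadoop.fs.s3a.secret.key=password \
      --conf spark.hadoop.fs.s3a.path.style.access=true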
Regards,
http://alberttwong.com - +1-949-870-9664 - GPG: 9D0F 6E75 5363 0F39 F64A 447E 2A2E 6721 C637 845A