Tyler-Rendina commented on issue #10590: URL: https://github.com/apache/hudi/issues/10590#issuecomment-1992114835
Final note, apologies for the number of posts, but this may help EMR users with Glue as their Hive service. Make sure to build Hudi using Java 8; if you are on ARM, use a distribution such as Azul OpenJDK and export `$JAVA_HOME` as the provided path, e.g., `/Library/Java/JavaVirtualMachines/zulu-8.jdk/Contents/Home`. Once you upload your jars to S3, bootstrap like so:

```
sudo chown -R $USER:root /usr/lib/hudi
sudo chmod -R ugo+rw /usr/lib/hudi
aws s3 cp s3://BUCKET/jars/hudi-aws-bundle-0.14.1.jar /usr/lib/hudi
aws s3 cp s3://BUCKET/jars/hudi-spark3.3-bundle_2.12-0.14.1.jar /usr/lib/hudi
sudo ln -sf /usr/lib/hudi/hudi-aws-bundle-0.14.1.jar /usr/lib/hudi/hudi-aws-bundle.jar
sudo ln -sf /usr/lib/hudi/hudi-spark3.3-bundle_2.12-0.14.1.jar /usr/lib/hudi/hudi-spark3.3-bundle.jar
```

To use your custom-built Hudi package, match your bootstrap paths in the following `spark-submit` command elements:

```
"--jars", "/usr/lib/hudi/hudi-aws-bundle-0.14.1.jar,/usr/lib/hudi/hudi-spark3.3-bundle_2.12-0.14.1.jar",
"--conf", "spark.driver.extraClassPath=/usr/lib/hudi/hudi-aws-bundle-0.14.1.jar:/usr/lib/hudi/hudi-spark3.3-bundle_2.12-0.14.1.jar",
"--conf", "spark.executor.extraClassPath=/usr/lib/hudi/hudi-aws-bundle-0.14.1.jar:/usr/lib/hudi/hudi-spark3.3-bundle_2.12-0.14.1.jar",
```

Finally (this was the cause of my `ClassNotFoundException: com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory`): **DO NOT** use `.enableHiveSupport()` when building your Spark session. While this works when Hudi is pulled in with `--packages`, it will try to use the wrong Hive package when you specify `--jars`.
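For context, the `spark-submit` elements above could be assembled into a complete EMR step argument list. This is a minimal sketch: the `job.py` entry-script path and the `--deploy-mode` flag are assumptions (only the jar paths come from the bootstrap above), and the Kryo serializer line reflects Hudi's general Spark recommendation rather than anything specific to this setup:

```python
# Jar locations must match where the bootstrap script placed them.
HUDI_AWS_JAR = "/usr/lib/hudi/hudi-aws-bundle-0.14.1.jar"
HUDI_SPARK_JAR = "/usr/lib/hudi/hudi-spark3.3-bundle_2.12-0.14.1.jar"
classpath = f"{HUDI_AWS_JAR}:{HUDI_SPARK_JAR}"

# Hypothetical step args; the S3 script path is a placeholder.
step_args = [
    "spark-submit",
    "--deploy-mode", "cluster",
    "--jars", f"{HUDI_AWS_JAR},{HUDI_SPARK_JAR}",
    "--conf", f"spark.driver.extraClassPath={classpath}",
    "--conf", f"spark.executor.extraClassPath={classpath}",
    "--conf", "spark.serializer=org.apache.spark.serializer.KryoSerializer",
    "s3://BUCKET/scripts/job.py",
]

# This list would be submitted as the HadoopJarStep Args
# (with Jar set to command-runner.jar) via the EMR console,
# the AWS CLI, or boto3's add_job_flow_steps.
print(step_args)
```

Note that `--jars` uses a comma separator while `extraClassPath` uses a colon; mixing those up is an easy way to end up with jars silently missing from the driver or executor classpath.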
