[
https://issues.apache.org/jira/browse/HUDI-8559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17900500#comment-17900500
]
Lin Liu commented on HUDI-8559:
-------------------------------
After the PR, the error message is gone:
{code:java}
[hadoop@ip-10-0-88-40 ~]$ cd spark-3.5.3-bin-hadoop3/
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ export HADOOP_CONF_DIR=/etc/hadoop/conf
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ export YARN_CONF_DIR=/etc/hadoop/conf
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ export HUDI_CONF_DIR=/etc/hudi/conf
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ hudi_version=1.0.0-SNAPSHOT
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ SPARK_HOME="/home/hadoop/spark-3.5.3-bin-hadoop3"
[hadoop@ip-10-0-88-40 spark-3.5.3-bin-hadoop3]$ ./bin/spark-sql \
> --master yarn \
> --deploy-mode client \
> --driver-memory 10g \
> --executor-memory 10g \
> --num-executors 3 \
> --executor-cores 4 \
> --jars /home/hadoop/hudi-spark3.5-bundle_2.12-1.0.0-SNAPSHOT.jar \
> --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
> --conf spark.driver.extraJavaOptions="-Dlog4j.configuration=file:/home/hadoop/warn.log4j.properties" \
> --conf spark.executor.extraJavaOptions="-Dlog4j.configuration=file:/home/hadoop/warn.log4j.properties" \
> --conf spark.kryoserializer.buffer=256m \
> --conf spark.kryoserializer.buffer.max=1024m \
> --conf spark.rdd.compress=true \
> --conf spark.memory.storageFraction=0.8 \
> --conf "spark.driver.defaultJavaOptions=-XX:+UseG1GC" \
> --conf "spark.executor.defaultJavaOptions=-XX:+UseG1GC" \
> --conf spark.ui.proxyBase="" \
> --conf 'spark.eventLog.enabled=true' --conf 'spark.eventLog.dir=hdfs:///var/log/spark/apps' \
> --conf spark.hadoop.yarn.timeline-service.enabled=false \
> --conf spark.driver.userClassPathFirst=true \
> --conf spark.executor.userClassPathFirst=true \
> --conf "spark.sql.hive.convertMetastoreParquet=false" \
> --conf spark.sql.catalogImplementation=in-memory \
> --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
> --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog'
> ;
spark-sql (default)>
> CREATE TABLE hudi_table (
> ts BIGINT,
> uuid STRING,
> rider STRING,
> driver STRING,
> fare DOUBLE,
> city STRING
> ) USING HUDI
> PARTITIONED BY (city);
24/11/22 21:44:29 WARN TableSchemaResolver: Could not find any data file written for commit, so could not get schema for table file:/home/hadoop/spark-3.5.3-bin-hadoop3/spark-warehouse/hudi_table
Time taken: 1.525 seconds
{code}
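For context, the "Loop detected" guard in the original error works roughly like the sketch below: while resolving include directives in property files, every visited path is recorded, and revisiting any of them aborts the load. This is a hypothetical Java illustration of the mechanism, not Hudi's actual {{DFSPropertiesConfiguration}} code; the class, method, and parameter names are made up.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an include-loop guard like the one behind the
// "Loop detected; file ... already referenced" error. Names are
// illustrative, not Hudi's actual API.
public class ConfigIncludeGuard {
    // Paths already referenced during this load.
    private final Set<String> visited = new HashSet<>();

    // 'includes' stands in for parsed include directives: file -> files it references.
    public void load(String path, Map<String, List<String>> includes) {
        if (!visited.add(path)) {
            // Same file referenced twice on one load path: refuse to recurse.
            throw new IllegalStateException(
                "Loop detected; file " + path + " already referenced");
        }
        for (String child : includes.getOrDefault(path, List.of())) {
            load(child, includes);
        }
    }
}
```

Such a guard also fires when the same file is reached twice without a true cycle (e.g. the file under {{HUDI_CONF_DIR}} being picked up by more than one load path), which is one plausible way the reported error can occur even for a well-formed {{hudi-defaults.conf}}.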
> Spark SQL commands fail on AWS EMR
> ----------------------------------
>
> Key: HUDI-8559
> URL: https://issues.apache.org/jira/browse/HUDI-8559
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Y Ethan Guo
> Assignee: Lin Liu
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> With EMR Spark, the error below is thrown for any SQL command on master.
> {code:java}
> java.lang.IllegalStateException: Loop detected; file file:/etc/hudi/conf/hudi-defaults.conf already referenced
>   at org.apache.hudi.common.config.DFSPropertiesConfiguration.addPropsFromFile(DFSPropertiesConfiguration.java:147)
>   at org.apache.hudi.common.config.DFSPropertiesConfiguration.loadGlobalProps(DFSPropertiesConfiguration.java:127)
>   at org.apache.hudi.common.config.DFSPropertiesConfiguration.<clinit>(DFSPropertiesConfiguration.java:71)
>   at org.apache.spark.sql.hudi.ProvidesHoodieConfig$.isSchemaEvolutionEnabled(ProvidesHoodieConfig.scala:594)
>   at org.apache.spark.sql.hudi.Spark34ResolveHudiAlterTableCommand.apply(Spark34ResolveHudiAlterTableCommand.scala:35)
>   at org.apache.spark.sql.hudi.Spark34ResolveHudiAlterTableCommand.apply(Spark34ResolveHudiAlterTableCommand.scala:32)
>   at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:239)
>   at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
>   at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
>   at scala.collection.immutable.List.foldLeft(List.scala:91)
>   at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:236)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)