dpengpeng opened a new issue, #8278: URL: https://github.com/apache/incubator-gluten/issues/8278
### Backend

VL (Velox)

### Bug description

I use PySpark to execute a SQL query on Iceberg data stored on HDFS, but the exception below occurs. The same SQL runs successfully from Java, and my cluster environment does have the HDFS configuration information.

Error message:

```
py4j.protocol.Py4JJavaError: An error occurred while calling o156.showString.
: java.util.concurrent.ExecutionException: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 12) (executor 3): org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Unable to connect to HDFS: nameservice, got error: InvalidParameter: Cannot parse URI: hdfs://nameservice, missing port or invalid HA configuration
Caused by: HdfsConfigNotFound: Config key: dfs.ha.namenodes.nameservice not found.
Retriable: False
Expression: hdfsClient_ != nullptr
Context: Split [Hive: hdfs://nameservice/spark/tpch_iceberg.db/supplier_ice/data/00000-116-410ef5e1-e2df-44fd-b67a-4a9410655fa1-00001.parquet 4 - 513211] Task Gluten_Stage_3_TID_12_VTID_1
Additional Context: Operator: TableScan[0] 0
Function: Impl
File: Gluten/ep/build-velox/build/velox_ep/velox/connectors/hive/storage_adapters/hdfs/HdfsFileSystem.cpp
Line: 37
```

Is there any standard guidance document for using Gluten with PySpark?

### Spark version

Spark-3.4.x

### Spark configurations

_No response_

### System information

os: centos7
spark: 3.4.1

### Relevant logs

_No response_

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
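For context on the error: the native HDFS client used by the Velox backend resolves an HA nameservice URI like `hdfs://nameservice` from client-side configuration keys, and `HdfsConfigNotFound: dfs.ha.namenodes.nameservice` indicates those keys are not visible to it. A minimal sketch of the standard Hadoop HA client keys is shown below, assuming a nameservice named `nameservice` and two NameNodes `nn1`/`nn2` on hypothetical hosts; the actual names, hosts, and ports must match your cluster, and how the native client picks up this file (e.g. via its own client config file or environment) depends on your Gluten/libhdfs3 setup.

```xml
<!-- Illustrative HA client configuration (hdfs-site.xml / hdfs-client.xml style).
     Nameservice name, NameNode IDs, and hostnames here are placeholders. -->
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>nameservice</value>
  </property>
  <property>
    <!-- This is the key reported missing in the error above. -->
    <name>dfs.ha.namenodes.nameservice</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservice.nn1</name>
    <value>namenode1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.nameservice.nn2</name>
    <value>namenode2.example.com:8020</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.nameservice</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```

Since the Java run succeeds while the PySpark run fails, a plausible difference is that the JVM-side Hadoop client sees this configuration while the native executor-side client does not; checking that the executors' environment exposes the same HDFS client config would be a reasonable first step.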
