[
https://issues.apache.org/jira/browse/HUDI-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-6881:
---------------------------------
Labels: pull-request-available (was: )
> Hudi configured spark.scheduler.allocation.file should include scheme since
> Spark3.2
> ------------------------------------------------------------------------------------
>
> Key: HUDI-6881
> URL: https://issues.apache.org/jira/browse/HUDI-6881
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Wechar
> Priority: Major
> Labels: pull-request-available
>
> SPARK-35083 added support for remote scheduler pool files in Spark 3.2, so the
> configured value should include a scheme, for example:
> * hdfs://a/b/c
> * file://a/b/c
> Hudi's {{SchedulerConfGenerator}} sets the generated local file name without a
> scheme. On Spark 3.2 or newer, the scheme-less path is looked up on HDFS (see
> the NameNode frames in the trace), so the config file cannot be found:
> {code:bash}
> Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /mnt//yarn/nm-local-dir/usercache/user/appcache/application_1695109514894_0504/container_e2463_1695109514894_0504_01_000001/tmp/ad680d18-28ea-412e-84e0-51dd77d924bd8230486245389606996.xml
>     at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:87)
>     at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:77)
>     at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:169)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2260)
> {code}
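
One possible way to qualify the generated local path before it is set as
spark.scheduler.allocation.file is sketched below. This is an illustration
only, not the change in the linked pull request: the class SchedulerFileUri
and the method withFileScheme are hypothetical names and do not exist in Hudi
or Spark. The sketch leaves values that already carry a scheme untouched and
converts plain local paths to a file URI using the JDK's Path.toUri().

```java
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Hudi or Spark.
final class SchedulerFileUri {
    private SchedulerFileUri() {}

    // Matches values that start with a URI scheme of two or more characters
    // followed by ":/" (for example hdfs:/ or file:/).
    private static final String HAS_SCHEME = "^[a-zA-Z][a-zA-Z0-9+.-]+:/.*";

    /**
     * Returns the value unchanged if it already has a scheme; otherwise
     * treats it as a local path and returns the corresponding file URI.
     */
    static String withFileScheme(String pathOrUri) {
        if (pathOrUri.matches(HAS_SCHEME)) {
            return pathOrUri;
        }
        // Path.toUri() produces an absolute URI whose scheme is "file"
        // for the default (local) file system.
        return Paths.get(pathOrUri).toAbsolutePath().toUri().toString();
    }
}
```

A caller would pass the temporary file's absolute path through
SchedulerFileUri.withFileScheme(...) and use the result as the property value,
so that the path is resolved on the local file system instead of the default
Hadoop file system.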