[
https://issues.apache.org/jira/browse/SPARK-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kousuke Saruta updated SPARK-2970:
----------------------------------
Description:
When the spark-sql script runs with spark.eventLog.enabled set to true, it ends with an
IOException because FileLogger cannot create the APPLICATION_COMPLETE file in HDFS.
This is because the shutdown hook of SparkSQLCLIDriver is executed after the
shutdown hook of org.apache.hadoop.fs.FileSystem.
When spark.eventLog.enabled is true, the SparkSQLCLIDriver hook finally tries
to create a file to mark the application as finished, but the FileSystem hook
has already closed the FileSystem.
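The race can be sketched with a minimal JDK-only model (the class and method names below are hypothetical, for illustration only; JVM shutdown hooks start in no guaranteed order, so the FileSystem hook can win):
{code}
public class ShutdownRaceSketch {
    private boolean fileSystemOpen = true;

    // Models the org.apache.hadoop.fs.FileSystem shutdown hook,
    // which closes all cached file systems.
    void fileSystemHook() { fileSystemOpen = false; }

    // Models the SparkSQLCLIDriver hook that tries to write the
    // APPLICATION_COMPLETE marker; returns false if the FS is gone.
    boolean eventLogHook() { return fileSystemOpen; }

    public static void main(String[] args) {
        ShutdownRaceSketch jvm = new ShutdownRaceSketch();
        // If the FileSystem hook runs first, the marker write fails.
        jvm.fileSystemHook();
        System.out.println(jvm.eventLogHook()
            ? "APPLICATION_COMPLETE written"
            : "IOException: filesystem already closed");
    }
}
{code}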
was:
When the spark-sql script runs with spark.eventLog.enabled set to true, it ends with an
IOException because FileLogger cannot create the APPLICATION_COMPLETE file in HDFS.
I think this is because the FileSystem is closed by HiveSessionImplWithUGI,
which contains the following code.
{code}
public void close() throws HiveSQLException {
  try {
    acquire();
    ShimLoader.getHadoopShims().closeAllForUGI(sessionUgi);
    cancelDelegationToken();
  } finally {
    release();
    super.close();
  }
}
{code}
When using Hadoop 2.0+, ShimLoader.getHadoopShims() above returns Hadoop23Shims,
which extends HadoopShimsSecure.
HadoopShimsSecure#closeAllForUGI is implemented as follows.
{code}
@Override
public void closeAllForUGI(UserGroupInformation ugi) {
  try {
    FileSystem.closeAllForUGI(ugi);
  } catch (IOException e) {
    LOG.error("Could not clean up file-system handles for UGI: " + ugi, e);
  }
}
{code}
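The effect of closing every cached handle can be approximated with this standalone sketch (all names here are hypothetical stand-ins; the real cache lives in org.apache.hadoop.fs.FileSystem, where FileSystem.get() returns shared cached instances):
{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class FsCacheSketch {
    static class CachedFs {
        private boolean closed = false;
        void close() { closed = true; }
        void create(String path) throws IOException {
            if (closed) throw new IOException("Filesystem closed");
        }
    }

    private final Map<String, CachedFs> cache = new HashMap<>();

    // Like FileSystem.get(): callers with the same key share one instance.
    CachedFs get(String key) {
        return cache.computeIfAbsent(key, k -> new CachedFs());
    }

    // Like closeAllForUGI: closes every cached handle, including ones
    // other components (e.g. the event logger) still hold.
    void closeAllForUGI() {
        cache.values().forEach(CachedFs::close);
    }

    public static void main(String[] args) {
        FsCacheSketch fsCache = new FsCacheSketch();
        CachedFs loggerFs = fsCache.get("hdfs://nn");  // held by the event logger
        fsCache.closeAllForUGI();                      // session cleanup runs
        try {
            loggerFs.create("APPLICATION_COMPLETE");
        } catch (IOException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
{code}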
> spark-sql script ends with IOException when EventLogging is enabled
> -------------------------------------------------------------------
>
> Key: SPARK-2970
> URL: https://issues.apache.org/jira/browse/SPARK-2970
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.1.0
> Environment: CDH5.1.0 (Hadoop 2.3.0)
> Reporter: Kousuke Saruta
>
> When the spark-sql script runs with spark.eventLog.enabled set to true, it
> ends with an IOException because FileLogger cannot create the
> APPLICATION_COMPLETE file in HDFS.
> This is because the shutdown hook of SparkSQLCLIDriver is executed after the
> shutdown hook of org.apache.hadoop.fs.FileSystem.
> When spark.eventLog.enabled is true, the SparkSQLCLIDriver hook finally tries
> to create a file to mark the application as finished, but the FileSystem
> hook has already closed the FileSystem.
--
This message was sent by Atlassian JIRA
(v6.2#6252)