[ 
https://issues.apache.org/jira/browse/SPARK-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kousuke Saruta updated SPARK-2970:
----------------------------------

    Description: 
When the spark-sql script runs with spark.eventLog.enabled set to true, it ends 
with an IOException because FileLogger cannot create the APPLICATION_COMPLETE 
file in HDFS.

This is because the shutdown hook of SparkSQLCLIDriver is executed after the 
shutdown hook of org.apache.hadoop.fs.FileSystem.

When spark.eventLog.enabled is true, SparkSQLCLIDriver's hook finally tries to 
create a file marking the application as finished, but FileSystem's hook has 
already closed the FileSystem.
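The race above can be sketched without Hadoop. All names below are hypothetical, for illustration only: a stand-in "file system" is closed by one shutdown hook before another hook tries to write the completion marker, which is exactly the ordering the JVM permits, since Runtime.addShutdownHook gives no ordering guarantee between hooks.

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hadoop FileSystem handle.
class FakeFileSystem {
    private boolean closed = false;
    private final List<String> files = new ArrayList<>();

    void create(String name) throws IOException {
        // This is the failure FileLogger hits when the FS hook ran first.
        if (closed) throw new IOException("Filesystem closed");
        files.add(name);
    }

    void close() { closed = true; }

    boolean contains(String name) { return files.contains(name); }
}

public class ShutdownOrderSketch {
    public static void main(String[] args) {
        FakeFileSystem fs = new FakeFileSystem();
        // Simulate the hook ordering described in this report:
        // FileSystem's shutdown hook runs first and closes the handle...
        fs.close();
        try {
            // ...then SparkSQLCLIDriver's hook tries to write the marker file.
            fs.create("APPLICATION_COMPLETE");
        } catch (IOException e) {
            System.out.println("IOException: " + e.getMessage());
        }
    }
}
{code}

Reversing the two calls (create before close) succeeds, which is why controlling shutdown-hook ordering resolves the problem.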

  was:
When the spark-sql script runs with spark.eventLog.enabled set to true, it ends 
with an IOException because FileLogger cannot create the APPLICATION_COMPLETE 
file in HDFS.
I think it's because the FileSystem is closed by HiveSessionImplWithUGI, which 
has the following code.

{code}
  public void close() throws HiveSQLException {
    try {
      acquire();
      ShimLoader.getHadoopShims().closeAllForUGI(sessionUgi);
      cancelDelegationToken();
    } finally {
      release();
      super.close();
    }
  }
{code}

When using Hadoop 2.0+, ShimLoader.getHadoopShims() above returns 
Hadoop23Shims, which extends HadoopShimsSecure.

HadoopShimsSecure#closeAllForUGI is implemented as follows.

{code}
  @Override
  public void closeAllForUGI(UserGroupInformation ugi) {
    try {
      FileSystem.closeAllForUGI(ugi);
    } catch (IOException e) {
      LOG.error("Could not clean up file-system handles for UGI: " + ugi, e);
    }
  }
{code}




> spark-sql script ends with IOException when EventLogging is enabled
> -------------------------------------------------------------------
>
>                 Key: SPARK-2970
>                 URL: https://issues.apache.org/jira/browse/SPARK-2970
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.1.0
>         Environment: CDH5.1.0 (Hadoop 2.3.0)
>            Reporter: Kousuke Saruta
>
> When the spark-sql script runs with spark.eventLog.enabled set to true, it 
> ends with an IOException because FileLogger cannot create the 
> APPLICATION_COMPLETE file in HDFS.
> This is because the shutdown hook of SparkSQLCLIDriver is executed after the 
> shutdown hook of org.apache.hadoop.fs.FileSystem.
> When spark.eventLog.enabled is true, SparkSQLCLIDriver's hook finally tries 
> to create a file marking the application as finished, but FileSystem's hook 
> has already closed the FileSystem.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
