[
https://issues.apache.org/jira/browse/SPARK-16989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416320#comment-15416320
]
Hong Shen commented on SPARK-16989:
-----------------------------------
Thanks for your comment.
In our cluster, the log directory is a unified config. The clusters have been running
for a long time and machines are added and removed frequently, so we don't want to
restart the clusters just to create a log directory. Instead, we would like a config
that lets Spark create the log directory itself, as in the proposal quoted below.
> fileSystem.getFileStatus throw exception in EventLoggingListener
> ----------------------------------------------------------------
>
> Key: SPARK-16989
> URL: https://issues.apache.org/jira/browse/SPARK-16989
> Project: Spark
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Hong Shen
> Priority: Minor
>
> If the log directory does not exist, SparkContext initialization throws the exception shown in the following log.
> {code}
> 16/05/02 22:24:22 ERROR spark.SparkContext: Error initializing SparkContext.
> java.io.FileNotFoundException: File file:/data/tdwadmin/tdwenv/tdwgaia/logs/sparkhistory/intermediate-done-dir does not exist
>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
>   at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:109)
>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:612)
>   at org.apache.spark.deploy.yarn.SQLApplicationMaster.<init>(SQLApplicationMaster.scala:78)
>   at org.apache.spark.deploy.yarn.SQLApplicationMaster.<init>(SQLApplicationMaster.scala:46)
>   at org.apache.spark.deploy.yarn.SQLApplicationMaster$$anonfun$main$1.apply$mcV$sp(SQLApplicationMaster.scala:311)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:70)
>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:69)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>   at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:69)
>   at org.apache.spark.deploy.yarn.SQLApplicationMaster$.main(SQLApplicationMaster.scala:310)
>   at org.apache.spark.deploy.yarn.SQLApplicationMaster.main(SQLApplicationMaster.scala)
> {code}
> {code}
> if (!fileSystem.getFileStatus(new Path(logBaseDir)).isDirectory) {
>   throw new IllegalArgumentException(s"Log directory $logBaseDir does not exist.")
> }
> {code}
> There are two problems:
> 1. The check should first test fileSystem.exists(new Path(logBaseDir)), so that
> getFileStatus does not throw for a missing directory.
> 2. We should add an option that lets the ApplicationMaster create the log
> directory, like this (a combined sketch follows the snippet below):
> {code}
> // lp is new Path(logBaseDir)
> if (!fileSystem.exists(lp) &&
>     sparkConf.getBoolean("spark.eventLog.create.if.baseDir.not.exist", true)) {
>   fileSystem.mkdirs(lp)
> }
> {code}
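> Putting the two changes together, the check in EventLoggingListener.start could
> look roughly like the sketch below. This is only an illustration: the config key
> is the one proposed above (not an existing Spark setting), and lp stands for
> new Path(logBaseDir) as in the snippet above.
> {code}
> // Sketch only: combines the exists() guard with the proposed config.
> val lp = new Path(logBaseDir)
> if (!fileSystem.exists(lp) &&
>     sparkConf.getBoolean("spark.eventLog.create.if.baseDir.not.exist", true)) {
>   // Create the event log directory instead of failing later in getFileStatus.
>   fileSystem.mkdirs(lp)
> }
> if (!fileSystem.exists(lp) || !fileSystem.getFileStatus(lp).isDirectory) {
>   throw new IllegalArgumentException(s"Log directory $logBaseDir does not exist.")
> }
> {code}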