Josh Rosen created SPARK-5311:
---------------------------------

             Summary: EventLoggingListener throws exception if log directory does not exist
                 Key: SPARK-5311
                 URL: https://issues.apache.org/jira/browse/SPARK-5311
             Project: Spark
          Issue Type: Bug
    Affects Versions: 1.3.0
            Reporter: Josh Rosen
            Priority: Blocker


If the log directory does not exist, EventLoggingListener throws an 
IllegalArgumentException.  Here's a simple reproduction on the master 
branch (1.3.0):

{code}
./bin/spark-shell --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=/tmp/nonexistent-dir
{code}

Here /tmp/nonexistent-dir does not exist, but its parent directory /tmp does.  
This results in the following exception:

{code}
15/01/18 17:10:44 INFO HttpServer: Starting HTTP Server
15/01/18 17:10:44 INFO Utils: Successfully started service 'HTTP file server' on port 62729.
15/01/18 17:10:44 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
15/01/18 17:10:44 INFO Utils: Successfully started service 'SparkUI' on port 4041.
15/01/18 17:10:44 INFO SparkUI: Started SparkUI at http://joshs-mbp.att.net:4041
15/01/18 17:10:45 INFO Executor: Using REPL class URI: http://192.168.1.248:62726
15/01/18 17:10:45 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkdri...@joshs-mbp.att.net:62728/user/HeartbeatReceiver
15/01/18 17:10:45 INFO NettyBlockTransferService: Server created on 62730
15/01/18 17:10:45 INFO BlockManagerMaster: Trying to register BlockManager
15/01/18 17:10:45 INFO BlockManagerMasterActor: Registering block manager localhost:62730 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 62730)
15/01/18 17:10:45 INFO BlockManagerMaster: Registered BlockManager
java.lang.IllegalArgumentException: Log directory /tmp/nonexistent-dir does not exist.
        at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:90)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:363)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
        at $iwC$$iwC.<init>(<console>:9)
        at $iwC.<init>(<console>:18)
        at <init>(<console>:20)
        at .<init>(<console>:24)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:852)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1125)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:674)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:705)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:669)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:828)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:873)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:785)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:123)
        at org.apache.spark.repl.SparkILoopInit$$anonfun$initializeSpark$1.apply(SparkILoopInit.scala:122)
        at org.apache.spark.repl.SparkIMain.beQuietDuring(SparkIMain.scala:270)
        at org.apache.spark.repl.SparkILoopInit$class.initializeSpark(SparkILoopInit.scala:122)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:60)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$apply$mcZ$sp$5.apply$mcV$sp(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoopInit$class.runThunks(SparkILoopInit.scala:147)
        at org.apache.spark.repl.SparkILoop.runThunks(SparkILoop.scala:60)
        at org.apache.spark.repl.SparkILoopInit$class.postInitialization(SparkILoopInit.scala:106)
        at org.apache.spark.repl.SparkILoop.postInitialization(SparkILoop.scala:60)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:962)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:916)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:916)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1011)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:365)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{code}

It looks like the directory existence check was introduced in this commit:
https://github.com/apache/spark/commit/456451911d11cc0b6738f31b1e17869b1fb51c87?diff=unified

This is a behavior change and a regression from earlier Spark versions, which 
would create the event log directory if it did not exist.

I think the intent of this check may have been to handle cases where the event 
log directory path points to an existing file, so maybe we can guard the 
`!isDirectory` check with an `exists` check first and change the error message 
to be more specific.
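
A rough sketch of what that guard might look like is below. This is not the actual EventLoggingListener code: `EventLogDirCheck` and `ensureLogDir` are hypothetical names, and the sketch uses `java.nio.file` on the local filesystem instead of whatever filesystem API the listener really uses. It also assumes the desired behavior is the earlier one, where a missing directory is created, and only a path that exists but is not a directory is rejected.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper for illustration only; not part of Spark.
object EventLogDirCheck {

  /** Returns the log directory path, creating it if it is missing.
    * Throws only when the path exists and is not a directory.
    */
  def ensureLogDir(dir: String): Path = {
    val path = Paths.get(dir)
    if (!Files.exists(path)) {
      // Assumed behavior: restore the older semantics and create the directory.
      Files.createDirectories(path)
    } else if (!Files.isDirectory(path)) {
      // More specific message than "does not exist".
      throw new IllegalArgumentException(
        s"Log directory $dir exists but is not a directory.")
    }
    path
  }
}
```

With a check shaped like this, the reproduction above would create /tmp/nonexistent-dir instead of failing, and the error would only be raised when the configured path is an existing regular file.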


