Hi List,

I'm following this example here, submitting the job with the following
command:

$SPARK_HOME/bin/spark-submit \
  --deploy-mode cluster \
  --master spark://host.domain.ex:7077 \
  --class com.oreilly.learningsparkexamples.mini.scala.WordCount \

The jar is submitted fine, and I can see it appear on the driver node (so
connecting to and reading from HDFS works):

-rw-r--r-- 1 nickt nickt  15K Mar 29 22:05
-rw-r--r-- 1 nickt nickt 9.2K Mar 29 22:05 stderr
-rw-r--r-- 1 nickt nickt    0 Mar 29 22:05 stdout

But it's failing due to a java.io.FileNotFoundException saying my input file
is missing:

Caused by: java.io.FileNotFoundException: Added file
does not exist.

I'm using sc.addFile("hdfs://path/to/the_file.txt") to propagate the file to
all the workers and sc.textFile(SparkFiles.get("the_file.txt")) to return the
path to the file on each of the hosts.
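
For reference, here is a minimal sketch of the addFile / SparkFiles pattern
described above. It is not the code from my job. The object name
AddFileSketch and the helper baseName are made up for illustration, the HDFS
path is the same placeholder as above, and the master is assumed to come from
spark-submit.

```scala
import org.apache.spark.{SparkConf, SparkContext, SparkFiles}

object AddFileSketch {
  // Hypothetical helper: the bare file name that SparkFiles.get expects,
  // derived from the URI that was passed to addFile.
  def baseName(uri: String): String =
    new java.io.File(new java.net.URI(uri).getPath).getName

  def main(args: Array[String]): Unit = {
    // Master URL is assumed to be supplied by spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("AddFileSketch"))
    val source = "hdfs://path/to/the_file.txt" // placeholder path, as in the post

    // Ask Spark to ship a copy of the file to every node running this job.
    sc.addFile(source)

    // SparkFiles.get resolves the local copy by bare file name, so it is
    // called inside a task to get the path on whichever host runs that task.
    val name = baseName(source)
    val localPaths = sc.parallelize(1 to 4, 4)
      .map(_ => SparkFiles.get(name))
      .collect()

    localPaths.foreach(println)
    sc.stop()
  }
}
```

Note that SparkFiles.get takes only the file name, not the full URI that was
given to addFile.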

Has anyone come up against this before when reading from HDFS? No doubt I'm
doing something wrong.

Full trace below:

Launch Command: "/usr/java/java8/bin/java" "-cp"
"-Dakka.loglevel=WARNING" "-Dspark.driver.supervise=false"
"-Dspark.master=spark://host.domain.ex:7077" "-Xms512M" "-Xmx512M"

log4j:WARN No appenders could be found for logger
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
more info.
Using Spark's default log4j profile:
15/03/29 22:05:05 INFO SecurityManager: Changing view acls to: nickt
15/03/29 22:05:05 INFO SecurityManager: Changing modify acls to: nickt
15/03/29 22:05:05 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(nickt); users
with modify permissions: Set(nickt)
15/03/29 22:05:05 INFO Slf4jLogger: Slf4jLogger started
15/03/29 22:05:05 INFO Utils: Successfully started service 'Driver' on port
15/03/29 22:05:05 INFO WorkerWatcher: Connecting to worker
15/03/29 22:05:05 INFO SparkContext: Running Spark version 1.3.0
15/03/29 22:05:05 INFO SecurityManager: Changing view acls to: nickt
15/03/29 22:05:05 INFO SecurityManager: Changing modify acls to: nickt
15/03/29 22:05:05 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users with view permissions: Set(nickt); users
with modify permissions: Set(nickt)
15/03/29 22:05:05 INFO Slf4jLogger: Slf4jLogger started
15/03/29 22:05:05 INFO Utils: Successfully started service 'sparkDriver' on
port 33382.
15/03/29 22:05:05 INFO SparkEnv: Registering MapOutputTracker
15/03/29 22:05:05 INFO SparkEnv: Registering BlockManagerMaster
15/03/29 22:05:05 INFO DiskBlockManager: Created local directory at
15/03/29 22:05:05 INFO WorkerWatcher: Successfully connected to
15/03/29 22:05:05 INFO MemoryStore: MemoryStore started with capacity 265.1
15/03/29 22:05:05 INFO HttpFileServer: HTTP File server directory is
15/03/29 22:05:05 INFO HttpServer: Starting HTTP Server
15/03/29 22:05:05 INFO Server: jetty-8.y.z-SNAPSHOT
15/03/29 22:05:05 INFO AbstractConnector: Started
15/03/29 22:05:05 INFO Utils: Successfully started service 'HTTP file
server' on port 42484.
15/03/29 22:05:05 INFO SparkEnv: Registering OutputCommitCoordinator
15/03/29 22:05:06 INFO Server: jetty-8.y.z-SNAPSHOT
15/03/29 22:05:06 INFO AbstractConnector: Started
15/03/29 22:05:06 INFO Utils: Successfully started service 'SparkUI' on port
15/03/29 22:05:06 INFO SparkUI: Started SparkUI at
15/03/29 22:05:06 ERROR SparkContext: Jar not found at
15/03/29 22:05:06 INFO AppClient$ClientActor: Connecting to master
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Connected to Spark
cluster with app ID app-20150329220506-0027
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/0 on worker-20150329112422-host3.domain.ex-33765
(host3.domain.ex:33765) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/0 on hostPort host3.domain.ex:33765 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/1 on worker-20150329112422-host6.domain.ex-35464
(host6.domain.ex:35464) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/1 on hostPort host6.domain.ex:35464 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/2 on worker-20150329112422-host2.domain.ex-40914
(host2.domain.ex:40914) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/2 on hostPort host2.domain.ex:40914 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/3 on worker-20150329112421-host4.domain.ex-35927
(host4.domain.ex:35927) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/3 on hostPort host4.domain.ex:35927 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/4 on worker-20150329112422-host1.domain.ex-60546
(host1.domain.ex:60546) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/4 on hostPort host1.domain.ex:60546 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/5 on worker-20150329112421-host.domain.ex-59485
(host.domain.ex:59485) with 64 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/5 on hostPort host.domain.ex:59485 with 64 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor added:
app-20150329220506-0027/6 on worker-20150329112421-host5.domain.ex-40830
(host5.domain.ex:40830) with 63 cores
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: Granted executor ID
app-20150329220506-0027/6 on hostPort host5.domain.ex:40830 with 63 cores,
512.0 MB RAM
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/2 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/0 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/1 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/4 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/3 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/5 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/0 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/1 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/2 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/6 is now LOADING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/3 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/4 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/5 is now RUNNING
15/03/29 22:05:06 INFO AppClient$ClientActor: Executor updated:
app-20150329220506-0027/6 is now RUNNING
15/03/29 22:05:06 INFO NettyBlockTransferService: Server created on 39447
15/03/29 22:05:06 INFO BlockManagerMaster: Trying to register BlockManager
15/03/29 22:05:06 INFO BlockManagerMasterActor: Registering block manager
host5.domain.ex:39447 with 265.1 MB RAM, BlockManagerId(<driver>,
host5.domain.ex, 39447)
15/03/29 22:05:06 INFO BlockManagerMaster: Registered BlockManager
15/03/29 22:05:06 INFO SparkDeploySchedulerBackend: SchedulerBackend is
ready for scheduling beginning after reached minRegisteredResourcesRatio:
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.lang.reflect.Method.invoke(Method.java:483)
Caused by: java.io.FileNotFoundException: Added file
does not exist.
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1089)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:1065)
    ... 6 more
