Hi,

I am running this simple example with the following setup:
machineA: 1 master + 1 worker
machineB: 1 worker
«
    val ssc = new StreamingContext(sparkConf, Duration(1000))

    val rawStreams = (1 to numStreams)
      .map(_ => ssc.rawSocketStream[String](host, port, StorageLevel.MEMORY_ONLY_SER))
      .toArray
    val union = ssc.union(rawStreams)

    union.filter(line => Random.nextInt(1) == 0).map { line =>
      var sum = BigInt(0)
      line.toCharArray.foreach(chr => sum += chr.toInt)
      fib2(sum)
      sum
    }.reduceByWindow(_ + _, Seconds(1), Seconds(1))
      .map(s => s"### result: $s")
      .print()
»
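
For reference, one variant worth comparing against (a sketch only; `sparkConf`, `numStreams`, `host`, `port` are the same names as in the snippet above, and whether this avoids the error is untested here): with `MEMORY_ONLY_SER` each received block lives only on the executor that ran the receiver and can be dropped under memory pressure, while a replicated, disk-backed storage level keeps a second copy that tasks on the other machine can read.

«
    import org.apache.spark.storage.StorageLevel
    import org.apache.spark.streaming.{Duration, StreamingContext}

    val ssc = new StreamingContext(sparkConf, Duration(1000))

    // Store each received block replicated (x2) with a disk fallback, so a
    // task scheduled on the other machine, or one that runs after the block
    // was evicted from memory, can still locate it.
    val rawStreams = (1 to numStreams)
      .map(_ => ssc.rawSocketStream[String](host, port, StorageLevel.MEMORY_AND_DISK_SER_2))
      .toArray
    val union = ssc.union(rawStreams)
»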

And I'm getting the following exceptions:

Log from machineB
«
15/03/27 16:21:35 INFO CoarseGrainedExecutorBackend: Got assigned task 132
15/03/27 16:21:35 INFO Executor: Running task 0.0 in stage 27.0 (TID 132)
15/03/27 16:21:35 INFO CoarseGrainedExecutorBackend: Got assigned task 134
15/03/27 16:21:35 INFO Executor: Running task 2.0 in stage 27.0 (TID 134)
15/03/27 16:21:35 INFO TorrentBroadcast: Started reading broadcast variable 24
15/03/27 16:21:35 INFO CoarseGrainedExecutorBackend: Got assigned task 136
15/03/27 16:21:35 INFO Executor: Running task 4.0 in stage 27.0 (TID 136)
15/03/27 16:21:35 INFO CoarseGrainedExecutorBackend: Got assigned task 138
15/03/27 16:21:35 INFO Executor: Running task 6.0 in stage 27.0 (TID 138)
15/03/27 16:21:35 INFO CoarseGrainedExecutorBackend: Got assigned task 140
15/03/27 16:21:35 INFO Executor: Running task 8.0 in stage 27.0 (TID 140)
15/03/27 16:21:35 INFO MemoryStore: ensureFreeSpace(1886) called with curMem=47117, maxMem=280248975
15/03/27 16:21:35 INFO MemoryStore: Block broadcast_24_piece0 stored as bytes in memory (estimated size 1886.0 B, free 267.2 MB)
15/03/27 16:21:35 INFO BlockManagerMaster: Updated info of block broadcast_24_piece0
15/03/27 16:21:35 INFO TorrentBroadcast: Reading broadcast variable 24 took 19 ms
15/03/27 16:21:35 INFO MemoryStore: ensureFreeSpace(3104) called with curMem=49003, maxMem=280248975
15/03/27 16:21:35 INFO MemoryStore: Block broadcast_24 stored as values in memory (estimated size 3.0 KB, free 267.2 MB)
15/03/27 16:21:35 ERROR Executor: Exception in task 8.0 in stage 27.0 (TID 140)
java.lang.Exception: Could not compute split, block input-0-1427473262420 not found
        at org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
15/03/27 16:21:35 ERROR Executor: Exception in task 6.0 in stage 27.0 (TID 138)
java.lang.Exception: Could not compute split, block input-0-1427473262418 not found
        at org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
»

Log from machineA
«
15/03/27 16:21:35 INFO TorrentBroadcast: Reading broadcast variable 24 took 15 ms
15/03/27 16:21:35 INFO MemoryStore: ensureFreeSpace(3104) called with curMem=269989249, maxMem=280248975
15/03/27 16:21:35 INFO MemoryStore: Block broadcast_24 stored as values in memory (estimated size 3.0 KB, free 9.8 MB)
15/03/27 16:21:35 ERROR Executor: Exception in task 3.0 in stage 27.0 (TID 135)
java.lang.Exception: Could not compute split, block input-0-1427473262415 not found
        at org.apache.spark.rdd.BlockRDD.compute(BlockRDD.scala:51)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:87)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:280)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:247)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:56)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
»

It seems that the received input blocks are not available to executors on the other machine. Am I doing something wrong?

Thanks,
Sergio
