Ognen, can you comment on whether you were actually able to run two jobs concurrently just by restricting spark.cores.max? I run Shark on the same cluster and was not able to get a standalone job in (since Shark is a "long-running" job) until I restricted both spark.cores.max _and_ spark.executor.memory. Just curious whether I did something wrong.
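In case it helps compare notes, this is roughly how I'm capping each application so the standalone scheduler has cores and memory left over for the other one. The master URL and the numbers are just placeholders, not the values from your cluster:

import org.apache.spark.{SparkConf, SparkContext}

// Cap both total cores and per-executor memory so this application
// does not grab the whole cluster under the standalone scheduler.
val conf = new SparkConf()
  .setMaster("spark://master-host:7077")   // placeholder master URL
  .setAppName("capped-app")
  .set("spark.cores.max", "8")             // total cores this app may use
  .set("spark.executor.memory", "4g")      // memory per executor

val sc = new SparkContext(conf)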
On Mon, Mar 24, 2014 at 7:48 PM, Ognen Duzlevski <og...@plainvanillagames.com> wrote:
> Just so I can close this thread (in case anyone else runs into this stuff) -
> I did sleep through the basics of Spark ;). The answer on why my job is in
> waiting state (hanging) is here:
> http://spark.incubator.apache.org/docs/latest/spark-standalone.html#resource-scheduling
>
> Ognen
>
> On 3/24/14, 5:01 PM, Diana Carroll wrote:
>
> Ognen:
>
> I don't know why your process is hanging, sorry. But I do know that the way
> saveAsTextFile works is that you give it a path to a directory, not a file.
> The "file" is saved in multiple parts, corresponding to the partitions
> (part-00000, part-00001, etc.).
>
> (Presumably it does this because it allows each partition to be saved on the
> local disk, to minimize network traffic. It's how Hadoop works, too.)
>
> On Mon, Mar 24, 2014 at 5:00 PM, Ognen Duzlevski <og...@nengoiksvelzud.com>
> wrote:
>>
>> Is someRDD.saveAsTextFile("hdfs://ip:port/path/final_filename.txt")
>> supposed to work? Meaning, can I save files to the HDFS fs this way?
>>
>> I tried:
>>
>> val r = sc.parallelize(List(1,2,3,4,5,6,7,8))
>> r.saveAsTextFile("hdfs://ip:port/path/file.txt")
>>
>> and it is just hanging. At the same time, on my HDFS it created file.txt,
>> but as a directory which has subdirectories (the final one is empty).
>>
>> Thanks!
>> Ognen
>
> --
> "A distributed system is one in which the failure of a computer you didn't
> even know existed can render your own computer unusable"
> -- Leslie Lamport
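For anyone who finds this thread later, here is a minimal sketch of the directory-style save Diana describes. The hdfs://ip:port/path pieces are placeholders, the same as in Ognen's original example:

val r = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8))
// Pass a directory path; Spark writes one part file per partition into it.
r.saveAsTextFile("hdfs://ip:port/path/output_dir")

// On HDFS you then end up with something like:
//   /path/output_dir/_SUCCESS
//   /path/output_dir/part-00000
//   /path/output_dir/part-00001
//   ...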