I am loading a CSV text file from S3 into Spark, filtering and mapping the records, and writing the result back to S3.
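For reference, the job is essentially the following minimal sketch; the bucket name, paths, and the filter/map logic below are placeholders, not my actual code:

  import org.apache.spark.{SparkConf, SparkContext}

  object S3Job {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("s3-filter-map"))

      // Read the CSV as plain text from S3 via the s3n:// protocol.
      val lines = sc.textFile("s3n://my-bucket/input/data.csv")

      // Filter and map the records; the predicate and projection
      // here stand in for the real logic.
      val result = lines
        .filter(line => line.nonEmpty)
        .map(line => line.split(",")(0))

      // Write the result back to S3; this is the call that hangs.
      result.saveAsTextFile("s3n://my-bucket/output/")

      sc.stop()
    }
  }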
I have tried several input sizes: 100k rows, 1M rows, and 3.5M rows. The first two finish successfully, but the 3.5M-row job hangs in a strange state: the job stages monitoring web UI (the one on port 4040) stops responding, and the command-line console gets stuck and does not even respond to Ctrl-C. The master's web monitoring UI still responds and shows the application state as FINISHED. In S3, I see an empty directory containing a single zero-sized entry, _temporary_$folder$. The S3 URL is given using the s3n:// protocol. I did not see any errors in the logs in the web console.

I also tried several cluster sizes (1 master + 1 worker, 1 master + 5 workers) and ended up in the same state each time.

Has anyone encountered such an issue? Any idea what's going on?

I also posted this question to Stack Overflow: http://stackoverflow.com/questions/25226419/saveastextfile-to-s3-on-spark-does-not-work-just-hangs