Hi, I'm think I may have encountered some kind of bug that at the moment prevents the correct running of my application on a EC2 Cluster. I'm saying that because the same exact code works wonderfully locally but has a really strange behaviour on the cluster. val uri = ssc.textFileStream(args(1) + "/inputData/newData/") uri.print() // prints perfectly the data uri.saveAsTextFiles((args(1) + "/uri/textFiles/"), "") // saves the file as intended val downloaded = uri.map(s => download(s)) .flatMap(t => t).map(t => createKey(t)) val downloadedAndFiltered = downloaded.filter(t => filterEcho(t)) downloadedAndFilteredAndEchoFiltered(t => t._2).saveAsTextFiles((args(1) + "/dissected/textFiles/"), "") // saves an empty file(why???) downloadedAndFilteredAndEchoFiltered.print() // prints perfectly // from now all data gets lost, any further call on downloadedAndFilteredAndEchoFiltered DStream receive empty input
I have no idea of what it could be that breaks down my application, someone knows if there are some known bug? Gianluca