You can try playing with spark.streaming.blockInterval so that it won't consume a lot of data; the default value is 200ms.
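A minimal sketch of how this could be set, assuming Spark Streaming with a socket receiver; the app name, host/port, batch interval, and the maxRate value are illustrative assumptions, not taken from the thread:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch only: tune spark.streaming.blockInterval (default 200 ms) so each
// receiver block stays small. spark.streaming.receiver.maxRate additionally
// caps the records/sec each receiver ingests, which can help a receiver
// that would otherwise run out of memory.
val conf = new SparkConf()
  .setAppName("SocketStreamCleaner")                 // hypothetical app name
  .set("spark.streaming.blockInterval", "500")       // milliseconds
  .set("spark.streaming.receiver.maxRate", "1000")   // records/sec per receiver

val ssc = new StreamingContext(conf, Seconds(10))    // assumed batch interval
val lines = ssc.socketTextStream("localhost", 9999)  // assumed host/port
```

Note this only throttles ingestion on the receiver side; it does not push back on the process writing to the socket.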
Thanks
Best Regards

On Fri, Mar 20, 2015 at 8:49 PM, jamborta <jambo...@gmail.com> wrote:

> Hi all,
>
> We are designing a workflow where we try to stream local files to a Socket
> streamer that would clean and process the files and write them to HDFS. We
> have an issue with bigger files when the streamer cannot keep up with the
> data and runs out of memory.
>
> What would be the best way to implement an approach where the Socket stream
> receiver would notify the stream not to send more data (stop reading from
> disk too?), just before it might run out of memory?
>
> thanks,
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Buffering-for-Socket-streams-tp22164.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.