You could also look at how Spark Streaming's DStream does what you described.
Take a look at the implementation of Spark's StreamingContext.textFileStream.

On Feb 18, 2014 8:02 PM, "David Thomas" <dt5434...@gmail.com> wrote:

> Perfect.
>
> On Tue, Feb 18, 2014 at 7:58 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>
>> An RDD is immutable, so modifying an RDD is not possible. You can generate
>> a new RDD by unioning the RDD created from the new files with the old in-memory RDD.
>> Regards
>> Mayur
>>
>> Mayur Rustagi
>> Ph: +919632149971
>> http://www.sigmoidanalytics.com
>> https://twitter.com/mayur_rustagi
>>
>> On Tue, Feb 18, 2014 at 6:33 PM, David Thomas <dt5434...@gmail.com> wrote:
>>
>>> Let's say I have an RDD of text files from HDFS. During the runtime, is
>>> it possible to check for new files in a particular directory and if
>>> present, add them to the existing RDD?
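
The union approach suggested in this thread could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not how Spark itself monitors directories: find_new_files and extend_rdd are hypothetical helper names, sc is assumed to be a SparkContext, and the directory is assumed to be listable through the local filesystem (for HDFS the listing would have to go through a Hadoop filesystem client instead). The only Spark calls used are sc.textFile and RDD.union.

```python
import os


def find_new_files(directory, seen):
    """Return the sorted paths of files in `directory` that are not in `seen`.

    `seen` is a set of paths already loaded; it is updated in place.
    (Hypothetical helper; assumes a locally listable directory.)
    """
    current = {
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    }
    new_paths = sorted(current - seen)
    seen.update(new_paths)
    return new_paths


def extend_rdd(sc, rdd, new_paths):
    """Return a NEW RDD holding `rdd` plus the lines of the files in `new_paths`.

    The input RDD is never modified, because RDDs are immutable.
    With no new files, the original RDD is returned unchanged.
    (Hypothetical helper; `sc` is assumed to be a SparkContext.)
    """
    if not new_paths:
        return rdd
    # textFile accepts a comma-separated list of paths.
    fresh = sc.textFile(",".join(new_paths))
    return rdd.union(fresh)
```

Calling extend_rdd(sc, rdd, find_new_files(directory, seen)) periodically and rebinding the result to rdd gives the effect of "adding" files, although each call really produces a new RDD. StreamingContext.textFileStream is the built-in alternative when Spark itself should watch the directory for new files.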