You could also look at how a Spark Streaming DStream does what you
described.

Take a look at the implementation of StreamingContext.textFileStream,
which monitors a directory for new files and reads them as text.
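
A minimal sketch of how that could look, combined with the union approach
suggested earlier in this thread. The HDFS paths, the 30-second batch
interval, the local master setting, and the object name are placeholders I
made up for illustration, not anything from the original question:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WatchDirectorySketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name and master; adjust for your cluster.
    val conf = new SparkConf().setAppName("WatchDirectorySketch").setMaster("local[2]")
    // Hypothetical batch interval: check for new files every 30 seconds.
    val ssc = new StreamingContext(conf, Seconds(30))

    // RDD built from the files that already exist (hypothetical path).
    var all: RDD[String] = ssc.sparkContext.textFile("hdfs:///data/existing")

    // DStream whose batches contain the lines of files that newly appear
    // in the monitored directory (hypothetical path).
    val newLines = ssc.textFileStream("hdfs:///data/incoming")

    // RDDs are immutable, so each batch produces a new RDD via union and
    // the driver-side reference is rebound to it.
    newLines.foreachRDD { rdd =>
      all = all.union(rdd)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that every union extends the lineage of the resulting RDD, so for a
long-running job you would probably want to checkpoint or periodically
write the data out rather than union indefinitely.
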
On Feb 18, 2014 8:02 PM, "David Thomas" <dt5434...@gmail.com> wrote:

> Perfect.
>
>
> On Tue, Feb 18, 2014 at 7:58 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>
>> An RDD is immutable, so it cannot be modified in place. You can instead
>> generate a new RDD by unioning the RDD created from the new files with
>> the old in-memory RDD.
>> Regards
>> Mayur
>>
>> Mayur Rustagi
>> Ph: +919632149971
>> http://www.sigmoidanalytics.com
>> https://twitter.com/mayur_rustagi
>>
>>
>>
>> On Tue, Feb 18, 2014 at 6:33 PM, David Thomas <dt5434...@gmail.com> wrote:
>>
>>> Let's say I have an RDD of text files from HDFS. During the runtime, is
>>> it possible to check for new files in a particular directory and if
>>> present, add them to the existing RDD?
>>>
>>
>>
>
