Thanks Patrick, but I think I didn't put my question clearly.

The question is this. Say that in the native file system or HDFS I have data describing students who passed, who failed, and whose results are withheld for some reason.

*Time T1:*
x - Pass
y - Fail
z - With-held

*Time T2:* I create RDD1 from this data and run a query to find how many candidates have passed. RESULT = 1. RDD1 is either cached or stored in the file system, depending on the space available.

*Time T3:* In the native file system, the result for z is now out and z is declared passed, so the data in HDFS has to be updated:
x - Pass
y - Fail
z - Pass

*Time T4:* Say I now take RDD1, either the cached copy or the one stored in the file system, and run the same query. I get RESULT = 1, but ideally RESULT should be 2.

So what I was asking is: is there a way for Spark to signal that RDD1 is no longer consistent with the file system, or is it up to the programmer to recreate RDD1 if the block from which it was created changes at a later point in time? [T1 < T2 < T3 < T4]

Thanks in advance.

On Fri, Jan 17, 2014 at 1:42 AM, Patrick Wendell <[email protected]> wrote:
> RDD's are immutable, so there isn't really such a thing as modifying a
> block in-place inside of an RDD. As a result, this particular
> consistency issue doesn't come up in Spark.
>
> - Patrick
>
> On Thu, Jan 16, 2014 at 1:42 AM, SaiPrasanna <[email protected]> wrote:
> > Hello, i am a novice to SPARK
> >
> > Say that we have created an RDD1 from native file system/HDFS and done some
> > transformations and actions and that resulted in an RDD2. Lets assume RDD1
> > and RDD2 are persisted, cached in-memory. If the block from where RDD1 was
> > created was modified at time T1 and RDD1/RDD2 is accessed later at T2 > T1,
> > is there a way either SPARK ensures consistency or it is upto the programmer
> > to make it explicit?
> >
> > --
> > View this message in context:
> > http://apache-spark-user-list.1001560.n3.nabble.com/Consistency-between-RDD-s-and-Native-File-System-tp583.html
> > Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
*Sai Prasanna. AN*
*II M.Tech (CS), SSSIHL*
*Entire water in the ocean can never sink a ship, Unless it gets inside. All the pressures of life can never hurt you, Unless you let them in.*
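
To make the scenario above concrete, here is a minimal sketch in plain Python. It does not use Spark at all: `CachedResults`, `fingerprint`, `write_results`, and `run_scenario` are hypothetical names invented for illustration. The content-hash check is only an assumption about how a programmer might detect the change by hand; it is not something Spark is claimed to do.

```python
import hashlib
import os
import tempfile


def fingerprint(path):
    """Return a SHA-256 digest of the file's bytes (hypothetical staleness check)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


class CachedResults:
    """Plain-Python stand-in for a cached dataset. This is NOT a Spark API."""

    def __init__(self, path):
        self.path = path
        self.reload()

    def reload(self):
        # Re-read the source file and remember what it looked like at load time.
        with open(self.path) as f:
            self.rows = [
                tuple(line.strip().split(" - ", 1))
                for line in f
                if line.strip()
            ]
        self.source_fingerprint = fingerprint(self.path)

    def is_stale(self):
        # True when the file no longer matches the snapshot held in memory.
        return fingerprint(self.path) != self.source_fingerprint

    def count_passed(self):
        # Answers from the in-memory snapshot only; never touches the file.
        return sum(1 for _, status in self.rows if status == "Pass")


def write_results(path, results):
    """Write one 'student - status' line per entry, overwriting the file."""
    with open(path, "w") as f:
        for student, status in results.items():
            f.write(f"{student} - {status}\n")


def run_scenario():
    """Replay T1..T4 and return (count at T2, stale count, stale?, fresh count)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "results.txt")

        # T1: initial data on the file system.
        write_results(path, {"x": "Pass", "y": "Fail", "z": "With-held"})

        # T2: build the cached snapshot and run the query.
        cached = CachedResults(path)
        count_at_t2 = cached.count_passed()

        # T3: the source file is rewritten; z has now passed.
        write_results(path, {"x": "Pass", "y": "Fail", "z": "Pass"})

        # T4: the same query against the old snapshot.
        stale_count = cached.count_passed()
        stale = cached.is_stale()

        # The programmer rebuilds the snapshot explicitly.
        if stale:
            cached.reload()
        fresh_count = cached.count_passed()

        return count_at_t2, stale_count, stale, fresh_count
```

Calling `run_scenario()` walks through T1 to T4: the snapshot keeps answering from the data it was built with until it is rebuilt explicitly. Hashing the whole file is only practical for small inputs; for large data one would probably compare cheaper metadata instead.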
