Let's say I'm given two RDDs and told to store them in a sequence file, and
they have the following dependency:

val rdd1 = sparkContext.sequenceFile().....cache()
val rdd2 = rdd1.map(....)....


How can I tell programmatically, without being the one who built rdd1 and
rdd2, whether rdd2 depends on rdd1?

I'm working on a concurrency model for my application and I won't
necessarily know how the two RDDs are constructed. What I will know is
whether or not rdd1 is cached, but I want to maximize concurrency and run
rdd1 and rdd2 together if rdd2 does not depend on rdd1.
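
To make the question concrete, here is a rough sketch of the kind of check I
have in mind. It assumes that walking the parents exposed by RDD.dependencies
is a reasonable way to inspect lineage; the helper name dependsOn is my own
and not part of Spark, and I have not considered how checkpointing (which
truncates lineage) would affect the result.

```scala
import org.apache.spark.rdd.RDD
import scala.collection.mutable

// Hypothetical helper (not a Spark API): returns true if `ancestor`
// appears anywhere in the lineage of `rdd`, compared by RDD id.
def dependsOn(rdd: RDD[_], ancestor: RDD[_]): Boolean = {
  val seen = mutable.Set[Int]()            // ids of RDDs already expanded
  var pending: List[RDD[_]] = List(rdd)    // RDDs whose parents are still to be checked
  while (pending.nonEmpty) {
    val current = pending.head
    pending = pending.tail
    if (seen.add(current.id)) {
      // Each Dependency points at one parent RDD in the lineage graph.
      for (dep <- current.dependencies) {
        val parent: RDD[_] = dep.rdd
        if (parent.id == ancestor.id) return true
        pending = parent :: pending
      }
    }
  }
  false
}
```

With something like this, I would run the two jobs concurrently only when
dependsOn(rdd2, rdd1) and dependsOn(rdd1, rdd2) are both false.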
