Hi there,

I have a large number of objects that I have to partition into chunks with
the help of a binary tree: after each object has been run through the tree,
the leaves of the tree contain the chunks. Next I have to process each of
those chunks in the same way with a function f(chunk). So I thought that if
I could turn the list of chunks into an RDD listOfChunks, I could use Spark
by calling listOfChunks.map(f) and do the processing in parallel.
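Here is a minimal sketch of what I mean. The tree representation and the
`route` function are just stand-ins for my actual tree; the Spark calls at
the end are what I imagine the parallel step would look like:

```python
# Sketch: route each object down a binary tree to a leaf, collect the
# objects per leaf into chunks, then (hypothetically) parallelize them.

def route(tree, obj):
    """Walk obj down a binary tree of (predicate, left, right) nodes.
    A leaf is represented by its chunk id (an int)."""
    node = tree
    while isinstance(node, tuple):
        predicate, left, right = node
        node = left if predicate(obj) else right
    return node

def build_chunks(tree, objects):
    """Group objects by the leaf chunk they land in."""
    chunks = {}
    for obj in objects:
        chunks.setdefault(route(tree, obj), []).append(obj)
    return list(chunks.values())

# Toy example: split numbers at 10, then split the low branch at 5.
tree = (lambda x: x < 10,
        (lambda x: x < 5, 0, 1),  # leaves (chunk ids) 0 and 1
        2)                        # leaf (chunk id) 2
chunks = build_chunks(tree, [1, 7, 12, 3, 15])
# chunks == [[1, 3], [7], [12, 15]]

# With a SparkContext sc, I imagine the parallel step would be:
# listOfChunks = sc.parallelize(chunks)
# results = listOfChunks.map(f).collect()
```

The chunking itself happens on the driver here; only the per-chunk
processing with f would run in parallel.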

How would you recommend I create the RDD? Is it possible to start with an
RDD that is a list of empty chunks and then add my objects one by one to
the chunks they belong to? Or would you recommend something else?

Thanks!

--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/creation-of-RDD-from-a-Tree-tp23310.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org