In a similar vein, it would be helpful to have an Iterable way to access the data inside an RDD. The collect method takes everything in the RDD and puts it in a list, which blows up memory on the driver for a large RDD. Since everything I want is already inside the RDD, it should be possible to iterate over the contents without copying all of them into a second in-memory collection.
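
To make the idea concrete, here is a minimal plain-Java sketch of the kind of iterator I have in mind. It is not Spark code: PartitionedIterator and PartitionedIteratorDemo are hypothetical names, and each Supplier simply stands in for one partition that is only loaded when iteration reaches it. As far as I know, later Spark releases offer toLocalIterator() on RDDs, which works along these lines by bringing one partition at a time to the driver, so that is worth checking against the version you run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical demo: three fake "partitions", one of them empty.
final class PartitionedIteratorDemo {
    static int sumAll() {
        List<Supplier<List<Integer>>> parts = new ArrayList<>();
        parts.add(() -> Arrays.asList(1, 2, 3));
        parts.add(() -> Collections.<Integer>emptyList());
        parts.add(() -> Arrays.asList(4, 5));

        int sum = 0;
        Iterator<Integer> it = new PartitionedIterator<>(parts);
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumAll());
    }
}

// Hypothetical stand-in for iterating an RDD: each Supplier plays the role of
// one partition and is evaluated only when iteration reaches it, so at most
// one partition's list is referenced by the iterator at any time.
final class PartitionedIterator<T> implements Iterator<T> {
    private final Iterator<Supplier<List<T>>> partitions;
    private Iterator<T> current = Collections.emptyIterator();

    PartitionedIterator(List<Supplier<List<T>>> partitions) {
        this.partitions = partitions.iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip exhausted or empty partitions, loading the next one on demand.
        while (!current.hasNext() && partitions.hasNext()) {
            current = partitions.next().get().iterator();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

The point of the design is that the caller sees an ordinary java.util.Iterator, while memory use is bounded by the largest single partition rather than by the whole data set, which is the difference from collect.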
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-create-RDD-from-Java-in-memory-data-tp2486p2568.html Sent from the Apache Spark User List mailing list archive at Nabble.com.