Hi, I'm reading and filtering a large number of files using Spark, but the work is only being parallelized at the Spark driver level. How do I parallelize it to the executor (worker) level? See the following sample. Is there any way to iterate the local iterator in parallel?
Note: I'm using Java 1.7.

    JavaRDD<String> files = javaSparkContext.parallelize(fileList);
    Iterator<String> localIterator = files.toLocalIterator();

Regards,
Vinoth Sankar