Does mapPartitions keep a complete partition in executor memory as an iterable?
    JavaRDD<String> rdd = jsc.textFile("path");
    JavaRDD<Integer> output = rdd.mapPartitions(new FlatMapFunction<Iterator<String>, Integer>() {
        public Iterable<Integer> call(Iterator<String> input) throws Exception {
            // Collect the length of every line in this partition
            List<Integer> output = new ArrayList<Integer>();
            while (input.hasNext()) {
                output.add(input.next().length());
            }
            return output;
        }
    });

Here, is input held entirely in memory, and can it contain a complete partition of several GB?

Will call(Iterator<String> input) be invoked only once per partition (10 times, if I have 10 partitions in this example), and not once per line (say 10,000,000 times)?

And what is the use of mapPartitionsWithIndex?

Thanks
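To make the second question concrete, here is a plain-Java sketch (no Spark involved) of the behaviour I am assuming. The names MapPartitionsSketch, lengths, simulate and calls are hypothetical and exist only for this illustration; each inner list stands in for one partition. It shows only the call count I expect, and it says nothing about how Spark actually manages memory for the input iterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class MapPartitionsSketch {
    // Counts how many times the per-partition function is invoked.
    static int calls = 0;

    // Same logic as call(Iterator<String>) in the question:
    // it buffers one Integer per input line before returning.
    static List<Integer> lengths(Iterator<String> input) {
        calls++;
        List<Integer> output = new ArrayList<>();
        while (input.hasNext()) {
            output.add(input.next().length());
        }
        return output;
    }

    // Hypothetical stand-in for mapPartitions (an assumption, not Spark code):
    // the function receives one iterator per partition, not one call per line.
    static List<Integer> simulate(List<List<String>> partitions) {
        List<Integer> all = new ArrayList<>();
        for (List<String> partition : partitions) {
            all.addAll(lengths(partition.iterator()));
        }
        return all;
    }

    public static void main(String[] args) {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "bb", "ccc"),   // partition 0: three lines
                Arrays.asList("dddd", "eeeee"));   // partition 1: two lines
        calls = 0;
        List<Integer> result = simulate(partitions);
        // Five lines in two partitions: the function runs twice, not five times.
        System.out.println(result + " calls=" + calls); // [1, 2, 3, 4, 5] calls=2
    }
}
```

Note that in this sketch, as in my Spark code, the returned list holds one element per line of the partition, so the output is fully buffered regardless of how the input iterator is fed.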