Hello list, I am sorry for sending this message here, but I could not get any response on "users". For a specific purpose I would like to isolate one partition of an RDD and perform computations only on that partition.
For instance, suppose a user asks Spark to create 500 partitions for an RDD. I would like Spark to create all the partitions but perform computations on only one of those 500, ignoring the other 499. At first I tried to modify the executor so that it runs only one partition (task), but I did not manage to make it work. Then I tried the DAGScheduler, but I think I should modify the code at a higher level: let Spark do the partitioning, but in the end keep only one partition and throw away all the others.

My questions are: which file should I modify in order to isolate one partition of the RDD, and where is the actual partitioning done?

I hope this is clear! Thank you very much,
Thodoris

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
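[Editor's note: the behaviour asked about above can be approximated with the public RDD API, without modifying Spark internals. The sketch below is illustrative, not the author's approach; the partition index `0`, the 500-partition toy RDD, and the app/object names are arbitrary assumptions. `mapPartitionsWithIndex` still materializes tasks for every partition (emitting empty iterators for the unwanted ones), while `SparkContext.runJob` with an explicit partition list schedules a task for the chosen partition only.]

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: operate on exactly one partition of a 500-partition RDD
// using public APIs only. Names and the target index are assumptions.
object SinglePartitionSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("single-partition-sketch").setMaster("local[*]"))

    val rdd    = sc.parallelize(1 to 1000, 500) // 500 partitions, as in the example
    val target = 0                              // the one partition we care about

    // Option 1: keep only the chosen partition's data; tasks for the
    // other 499 partitions still run, but each returns an empty iterator.
    val onlyOne = rdd.mapPartitionsWithIndex { (idx, iter) =>
      if (idx == target) iter else Iterator.empty
    }
    println(s"elements in partition $target: ${onlyOne.count()}")

    // Option 2: ask the scheduler to launch a task for the chosen
    // partition only; the other 499 partitions are never computed.
    val sums = sc.runJob(rdd, (iter: Iterator[Int]) => iter.sum, Seq(target))
    println(s"sum of partition $target: ${sums.head}")

    sc.stop()
  }
}
```

Spark also ships `org.apache.spark.rdd.PartitionPruningRDD`, which wraps an RDD and prunes partitions by a predicate on the partition index; that may be closer to "let Spark make the partitioning but see only one partition" than patching the executor or the DAGScheduler.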