Hello, I have a Spark cluster running in standalone mode, with a master and 6 executors.
My application reads data from a database via spark.read and then filters some rows. After that I repartition the data, and I wonder why the Executors page of the driver UI still shows all RDD blocks allocated on a single executor machine, as highlighted in the screenshot below:

[screenshot: Executors page showing all RDD blocks on one executor]

I expected that after the repartition the data would be shuffled across the cluster, but that is clearly not happening here. I can understand that the database read happens in a non-parallel fashion, but the repartition should fix that, as far as I understand. Could someone experienced clarify this?
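For reference, here is a minimal sketch of what the application does (the JDBC URL, credentials, table name, and filter column below are placeholders, not my real ones):

import org.apache.spark.sql.SparkSession

object RepartitionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("repartition-sketch")
      .getOrCreate()
    import spark.implicits._

    // JDBC read without partitionColumn/lowerBound/upperBound/numPartitions:
    // Spark pulls the whole table through a single task/partition.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/appdb") // placeholder
      .option("dbtable", "events")                           // placeholder
      .option("user", "reader")                              // placeholder
      .option("password", "secret")                          // placeholder
      .load()

    val filtered = df.filter($"status" === "active")         // placeholder filter

    // Repartition, which I expected to shuffle rows across all 6 executors.
    // Note it is lazy and only takes effect when an action runs.
    val repartitioned = filtered.repartition(48)
    repartitioned.cache()
    println(repartitioned.count()) // action that materializes the cached RDD blocks

    spark.stop()
  }
}

Thanks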