Hello, I currently have a task that always fails with "java.io.FileNotFoundException: [...]/shuffle_0_257_2155 (Too many open files)" when I run shuffle operations such as distinct, sortByKey, or reduceByKey on a large number of partitions.
I'm working with 365 GB of data, which is split into 5959 partitions. The cluster I'm using has over 1000 GB of memory in total, with 20 GB per node. I have tried adding .set("spark.shuffle.consolidateFiles", "true") (note the camel-case property name) when creating my SparkContext, but it doesn't seem to make a difference. Has anyone else had similar problems? Best regards, Matt
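For context, the shuffle can open roughly one file per map task per reduce partition, so with ~5959 partitions an executor process can easily exceed the operating system's per-process file-descriptor limit. A common first check, independent of any Spark setting, is the ulimit on each worker node. A minimal sketch, assuming a bash shell and that the hard limit permits raising the soft limit (the value 16384 is just an illustrative choice):

```shell
# Show the current soft limit on open file descriptors for this process
ulimit -Sn

# Show the hard limit (the ceiling the soft limit may be raised to)
ulimit -Hn

# Raise the soft limit for this shell session and its children
# (must not exceed the hard limit printed above)
ulimit -Sn 16384

# Verify the new soft limit took effect
ulimit -Sn
```

For the change to apply to the Spark worker JVMs, it would need to be set in the environment that launches them (e.g. the worker startup scripts or /etc/security/limits.conf), not just an interactive shell.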