Hi, sorry for the duplicates. First-time user :) I keep getting a FetchFailedException saying port 7337 is closed, which is the external shuffle service port. I was trying to tune these parameters. I have around 1000 executors and 5000 cores. I tried setting spark.shuffle.io.serverThreads to 2000. Should I also set spark.shuffle.io.clientThreads to 2000? Do the shuffle client threads allow one executor to fetch from the shuffle services of multiple nodes?
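For reference, here is a minimal sketch of how the two settings in question could be turned into spark-submit flags. The helper name to_submit_args is made up for illustration, and the clientThreads value is only the one being asked about, not a recommendation. Whether the external shuffle service itself picks up values passed this way depends on how that service is deployed, which this sketch does not cover.

```python
# Hypothetical helper (not part of Spark): turns a dict of settings
# into "--conf key=value" arguments for spark-submit.
def to_submit_args(conf):
    args = []
    for key in sorted(conf):  # sorted so the output order is stable
        args += ["--conf", f"{key}={conf[key]}"]
    return args


shuffle_conf = {
    # Value already tried, as described in the message above.
    "spark.shuffle.io.serverThreads": 2000,
    # Value being asked about; an assumption, not a tested setting.
    "spark.shuffle.io.clientThreads": 2000,
}

submit_args = to_submit_args(shuffle_conf)
print(" ".join(submit_args))
```

The resulting list can be appended to a spark-submit command line, or the same key/value pairs can go into spark-defaults.conf instead.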
Thanks

On Fri, Aug 18, 2023 at 17:42 Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Hi,
>
> These two threads that you sent seem to be duplicates of each other?
>
> Anyhow I trust that you are familiar with the concept of shuffle in Spark.
> Spark Shuffle is an expensive operation since it involves the following
>
> - Disk I/O
> - Involves data serialization and deserialization
> - Network I/O
>
> Basically these are based on the concept of map/reduce in Spark and these
> parameters you posted relate to various aspects of threading and
> concurrency.
>
> HTH
>
> Mich Talebzadeh,
> Solutions Architect/Engineering Lead
> London
> United Kingdom
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Fri, 18 Aug 2023 at 20:39, Nebi Aydin <nayd...@binghamton.edu.invalid>
> wrote:
>
>> I want to learn differences among below thread configurations.
>>
>> spark.shuffle.io.serverThreads
>> spark.shuffle.io.clientThreads
>> spark.shuffle.io.threads
>> spark.rpc.io.serverThreads
>> spark.rpc.io.clientThreads
>> spark.rpc.io.threads
>>
>> Thanks.