Hi,

I was able to resolve the issue by increasing the timeout, reducing the number of executors, and increasing the number of cores per executor.
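
To make the trade-off in that change concrete, here is a rough sketch of the arithmetic, not the actual job configuration. The numbers (64 total cores, 16 GB executors) and the helper name `executor_layout` are hypothetical, and it assumes the executor heap is shared evenly by the tasks running concurrently in it, which is a simplification of how Spark divides memory.

```python
def executor_layout(total_cores, cores_per_executor, executor_memory_gb):
    """Return (num_executors, memory_gb_per_task_slot) for a fixed core budget.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if total_cores % cores_per_executor != 0:
        raise ValueError("total_cores must be a multiple of cores_per_executor")
    # Fewer, larger executors: the same core budget split into bigger units.
    num_executors = total_cores // cores_per_executor
    # Simplifying assumption: the heap is divided evenly among concurrent tasks.
    memory_per_slot = executor_memory_gb / cores_per_executor
    return num_executors, memory_per_slot


# Hypothetical before/after layouts with the same total parallelism.
before = executor_layout(total_cores=64, cores_per_executor=2, executor_memory_gb=16)
after = executor_layout(total_cores=64, cores_per_executor=8, executor_memory_gb=16)
print("before:", before)
print("after: ", after)
```

In Spark these two knobs correspond to `spark.executor.instances` and `spark.executor.cores`. Under the assumptions above, moving from 2 to 8 cores per executor cuts the executor count to a quarter while each task's even share of the heap shrinks by the same factor.
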
The issue is resolved, but I am still not sure why reducing the number of executors and increasing the number of cores per executor fixed the shuffle read problems. I understand that with more cores per executor each executor thread now gets less memory, but there were no GC issues earlier either.

Can someone help me understand how this setting could help, and what we should consider when selecting the right ratio of executors to cores?

Thanks
Ankur

On Tue, Oct 11, 2016 at 11:16 PM, Ankur Srivastava <ankur.srivast...@gmail.com> wrote:

> Hi,
>
> I am upgrading my jobs to Spark 1.6 and am running into shuffle issues. I
> have tried all options and am now falling back to the legacy memory model,
> but I am still running into the same issue.
>
> I have set spark.shuffle.blockTransferService to nio.
>
> 16/10/12 06:00:10 INFO MapOutputTrackerMaster: Size of output statuses for
> shuffle 6 is 57910504 bytes
>
> 16/10/12 06:00:10 INFO MapOutputTrackerMasterEndpoint: Asked to send map
> output locations for shuffle 6 to dedwfprshd023.de.neustar.com:52510
>
> 16/10/12 06:00:10 ERROR TransportRequestHandler: Error sending result
> RpcResponse{requestId=7357168459686146036,
> body=NioManagedBuffer{buf=java.nio.HeapByteBuffer[pos=0
> lim=57910531 cap=57910531]}} to <hostname>; closing connection
>
> Are there any other settings that I should tune?
>
> Thanks
>
> Ankur