I have a Spark job that reads Hive data (stored as ORC) from S3 and uses that data to generate HFiles.
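To give a bit of context, the read side looks roughly like this (the bucket, table, and app names are placeholders, not my actual code):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("importer")   // placeholder name
  .enableHiveSupport()
  .getOrCreate()

// Reading a single ~190 MB ORC file works fine:
val single = spark.read.orc("s3://my-bucket/warehouse/my_table/part-00000.orc")

// Reading the whole directory (~400 files, ~76 GB) is what fails:
val all = spark.read.orc("s3://my-bucket/warehouse/my_table/")

// Downstream, the rows are turned into HFiles (details omitted here).
```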
When I read a single ORC file (about 190 MB), the job runs perfectly fine. However, when I try to read the entire directory, about 400 ORC files totaling roughly 76 GB, it keeps throwing:

```
17/06/12 04:46:36 ERROR server.TransportRequestHandler: Error sending result StreamResponse{streamId=/jars/importer-all.jar, byteCount=194729048, body=FileSegmentManagedBuffer{file=/tmp/importer-all.jar, offset=0, length=194729048}} to /10.211.127.118:44667; closing connection
java.nio.channels.ClosedChannelException
    at io.netty.channel.AbstractChannel$AbstractUnsafe.close(...)(Unknown Source)
17/06/12 04:46:36 WARN scheduler.TaskSetManager: Lost task 316.1 in stage 0.0 (TID 521, ip-10-211-127-67.ap-northeast-2.compute.internal, executor 136): java.nio.channels.ClosedChannelException
    at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:60)
    at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:179)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:230)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1289)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
    at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:893)
    at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:691)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:408)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:455)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
    at java.lang.Thread.run(Thread.java:745)
```

The closest match I could find on Stack Overflow is this post: <https://stackoverflow.com/questions/29781489/apache-spark-network-errors-between-executors?noredirect=1&lq=1>. I applied the suggestions there, but still no luck; the kind of settings I tried are sketched below.
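These are the usual Spark network/timeout knobs; the exact values below are illustrative rather than a record of my real configuration:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("importer")                          // placeholder name
  .config("spark.network.timeout", "600s")      // default 120s
  .config("spark.shuffle.io.maxRetries", "10")  // default 3
  .config("spark.shuffle.io.retryWait", "30s")  // default 5s
  .enableHiveSupport()
  .getOrCreate()
```

Any help is greatly appreciated!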