[ https://issues.apache.org/jira/browse/SPARK-41163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-41163:
---------------------------------
Fix Version/s: (was: 3.2.3)
> Spark 3.2.2 storage.ShuffleBlockFetcherIterator and TransportResponseHandler
> issue
> ----------------------------------------------------------------------------------
>
> Key: SPARK-41163
> URL: https://issues.apache.org/jira/browse/SPARK-41163
> Project: Spark
> Issue Type: Bug
> Components: Build, Deploy
> Affects Versions: 3.2.2
> Environment: * spark 3.2.2
> * hadoop 3.1.2
> * hive 3.1.1
> * scala 2.12
> Reporter: Dmitry Kravchuk
> Priority: Major
> Attachments: container_1668606650061_0087_01_000057.txt, pom.xml
>
>
> Hello there.
> I've built Spark 3.2.2 for my cluster, which has Hadoop 3.1.2 and Scala 2.12
> (pom.xml is attached).
> Build script:
>
> {code:java}
> cd spark && \
> ./build/mvn -Pyarn -Dhadoop.version=3.1.2 -Pscala-2.12 -Phive \
>   -Phive-thriftserver -DskipTests clean package
> {code}
>
> It was working fine, but a few applications hit a strange error and warning
> from time to time.
> It always looks like a lost datanode connection and shuffle read issues.
> {code:java}
> 2022-11-16 22:18:25,423 ERROR server.TransportChannelHandler: Connection to
> s00abd02node9.company.com/10.x.y.163:35143 has been quiet for 120000 ms while
> there are outstanding requests. Assuming connection is dead; please adjust
> spark.shuffle.io.connectionTimeout if this is wrong.
> 2022-11-16 22:18:25,423 ERROR client.TransportResponseHandler: Still have 5
> requests outstanding when connection from
> s00abd02node9.company.com/10.x.y.163:35143 is closed
> 2022-11-16 22:18:25,423 WARN netty.NettyBlockTransferService: Error while
> trying to get the host local dirs for [16]
> 2022-11-16 22:18:25,425 ERROR storage.ShuffleBlockFetcherIterator: Error
> occurred while fetching host local blocks
> {code}
> When this happens, the application goes into a retry and fails after the
> second start.
> Can anybody help?
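
The first log line above suggests adjusting spark.shuffle.io.connectionTimeout.
As a rough sketch of what that could look like, assuming the job is launched
with spark-submit: the values below are illustrative guesses, not tuned
recommendations, and your_app.jar is a placeholder. The retry settings and
spark.shuffle.readHostLocalDisk are not mentioned in the report; they are
existing Spark options that may or may not be relevant to the host-local
fetch error. The command is printed rather than executed.

```shell
#!/bin/sh
# Illustrative values only (assumptions); tune them for your own cluster.
SHUFFLE_CONF="--conf spark.shuffle.io.connectionTimeout=300s \
--conf spark.shuffle.io.maxRetries=6 \
--conf spark.shuffle.io.retryWait=10s \
--conf spark.shuffle.readHostLocalDisk=false"

# your_app.jar is a placeholder; the command is only echoed, not run.
SUBMIT_CMD="spark-submit $SHUFFLE_CONF your_app.jar"
echo "$SUBMIT_CMD"
```

If the errors stop with a longer timeout, that would point at slow or
overloaded shuffle peers rather than a build problem; if they persist, the
attached container log is probably the better place to look.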
--
This message was sent by Atlassian Jira
(v8.20.10#820010)