jerqi commented on code in PR #74:
URL: https://github.com/apache/incubator-uniffle/pull/74#discussion_r932403800
##########
client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java:
##########
@@ -106,8 +107,17 @@ public boolean hasNext() {
shuffleReadMetrics.incFetchWaitTime(fetchDuration);
if (compressedData != null) {
shuffleReadMetrics.incRemoteBytesRead(compressedData.limit() -
compressedData.position());
+      // DirectByteBuffers that are not collected in time may cause the executor
+      // to be killed by cluster managers (such as YARN) for using too much off-heap memory
+      if (uncompressedData != null && uncompressedData.isDirect()) {
+        try {
+          RssShuffleUtils.destroyDirectByteBuffer(uncompressedData);
+        } catch (Exception e) {
+          throw new RuntimeException("Destroy DirectByteBuffer failed!", e);
Review Comment:
RuntimeException -> RssException
For the Spark client we usually throw an RssException, so we can count how many
failed tasks are caused by RSS by the type of the exception.
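
As background for the snippet under review: eagerly freeing a direct buffer's
off-heap memory (rather than waiting for GC to collect the buffer object) is
commonly done via `sun.misc.Unsafe.invokeCleaner` on JDK 9+. The actual body of
`RssShuffleUtils.destroyDirectByteBuffer` is not shown in this diff, so the
helper below is only an illustrative sketch of that well-known pattern, not the
project's real implementation:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

public class DestroyDirectBuffer {

  // Illustrative sketch (NOT the real RssShuffleUtils code): eagerly release
  // a DirectByteBuffer's off-heap memory via sun.misc.Unsafe.invokeCleaner.
  // The buffer must not be used after this call.
  static void destroyDirectByteBuffer(ByteBuffer buffer) throws Exception {
    if (buffer == null || !buffer.isDirect()) {
      return; // nothing to do for heap buffers
    }
    // Obtain the Unsafe singleton reflectively; sun.misc is opened by the
    // jdk.unsupported module, so setAccessible works on modern JDKs.
    Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
    Field theUnsafe = unsafeClass.getDeclaredField("theUnsafe");
    theUnsafe.setAccessible(true);
    Object unsafe = theUnsafe.get(null);
    // invokeCleaner(ByteBuffer) frees the native memory immediately (JDK 9+).
    Method invokeCleaner = unsafeClass.getMethod("invokeCleaner", ByteBuffer.class);
    invokeCleaner.invoke(unsafe, buffer);
  }

  public static void main(String[] args) throws Exception {
    ByteBuffer buf = ByteBuffer.allocateDirect(1024);
    destroyDirectByteBuffer(buf);
    System.out.println("freed");
  }
}
```

This is why the surrounding try/catch exists: the reflective calls can fail on
unexpected JVMs, and per the review above such failures should surface as an
RssException rather than a plain RuntimeException.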
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]