advancedxy commented on code in PR #495:
URL: https://github.com/apache/incubator-uniffle/pull/495#discussion_r1071949746
##########
client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java:
##########
@@ -66,7 +67,8 @@ public RssShuffleDataIterator(
this.serializerInstance = serializer.newInstance();
this.shuffleReadClient = shuffleReadClient;
this.shuffleReadMetrics = shuffleReadMetrics;
- this.codec = Codec.newInstance(rssConf);
+ boolean compress = rssConf.getBoolean(RssClientConfig.SPARK_SHUFFLE_COMPRESS, true);
Review Comment:
This is in the client-spark module, so I think we can refer to spark.shuffle.compress
directly?
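A minimal sketch of what reading the Spark key directly could look like. The class name, the helper method, and the `Map`-based config are hypothetical stand-ins so that the snippet runs on its own; inside client-spark the lookup would presumably go through the existing conf object rather than a map. The default of `true` matches Spark's documented default for `spark.shuffle.compress`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: look up Spark's own key instead of adding a new
// constant to RssClientConfig. A plain Map stands in for the real conf.
class ShuffleCompressLookupSketch {
  // Spark's standard configuration key for shuffle compression.
  static final String SPARK_SHUFFLE_COMPRESS = "spark.shuffle.compress";

  // Returns true when the key is absent, mirroring Spark's default.
  static boolean shuffleCompressEnabled(Map<String, String> conf) {
    String raw = conf.get(SPARK_SHUFFLE_COMPRESS);
    return raw == null || Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println(shuffleCompressEnabled(conf));
    conf.put(SPARK_SHUFFLE_COMPRESS, "false");
    System.out.println(shuffleCompressEnabled(conf));
  }
}
```

With this shape, the reader only needs the key string, so no extra constant is required in the shared client module.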
##########
client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java:
##########
@@ -140,6 +125,29 @@ public boolean hasNext() {
return recordsIterator.hasNext();
}
+ private int uncompress(CompressedShuffleBlock compressedBlock, ByteBuffer compressedData) {
+ long compressedDataLength = compressedData.limit() - compressedData.position();
+ compressedBytesLength += compressedDataLength;
+ shuffleReadMetrics.incRemoteBytesRead(compressedDataLength);
+
+ int uncompressedLen = compressedBlock.getUncompressLength();
+ if (codec != null) {
+ if (uncompressedData == null || uncompressedData.capacity() < uncompressedLen) {
+ // todo: support off-heap bytebuffer
+ uncompressedData = ByteBuffer.allocate(uncompressedLen);
+ }
+ uncompressedData.clear();
+ long startDecompress = System.currentTimeMillis();
+ codec.decompress(compressedData, uncompressedLen, uncompressedData, 0);
+ unCompressedBytesLength += uncompressedLen;
+ long decompressDuration = System.currentTimeMillis() - startDecompress;
+ decompressTime += decompressDuration;
+ } else {
+ uncompressedData = compressedData;
Review Comment:
> unCompressedBytesLength += uncompressedLen;
This update (L142) should also happen in this else block; otherwise the
uncompressed byte count is not incremented when compression is disabled.
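One way to make sure both paths are counted is to hoist the increment out of the branch. The sketch below is a simplified illustration, not the PR's actual code: the `Decompressor` interface and the class name are hypothetical stand-ins for the real codec and iterator, and the timing and Spark metrics calls are left out.

```java
import java.nio.ByteBuffer;

// Hypothetical, simplified stand-in for the iterator's uncompress() logic.
class PassThroughMetricsSketch {
  // Stand-in for the real codec; a null codec means compression is disabled.
  interface Decompressor {
    void decompress(ByteBuffer src, int uncompressedLen, ByteBuffer dst);
  }

  private final Decompressor codec;
  private ByteBuffer uncompressedData;
  long compressedBytesLength = 0;
  long unCompressedBytesLength = 0;

  PassThroughMetricsSketch(Decompressor codec) {
    this.codec = codec;
  }

  int uncompress(ByteBuffer compressedData, int uncompressedLen) {
    compressedBytesLength += compressedData.remaining();
    if (codec != null) {
      // Reuse the buffer when it is large enough, otherwise reallocate.
      if (uncompressedData == null || uncompressedData.capacity() < uncompressedLen) {
        uncompressedData = ByteBuffer.allocate(uncompressedLen);
      }
      uncompressedData.clear();
      codec.decompress(compressedData, uncompressedLen, uncompressedData);
    } else {
      // No codec: the incoming bytes are already the uncompressed data.
      uncompressedData = compressedData;
    }
    // Counted once, after the branch, so both paths update the metric.
    unCompressedBytesLength += uncompressedLen;
    return uncompressedLen;
  }
}
```

Placing the increment after the if/else avoids duplicating it in each branch and keeps the two counters consistent when compression is turned off.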
##########
client/src/main/java/org/apache/uniffle/client/util/RssClientConfig.java:
##########
@@ -86,4 +86,6 @@ public class RssClientConfig {
public static final String RSS_ESTIMATE_TASK_CONCURRENCY_PER_SERVER =
"rss.estimate.task.concurrency.per.server";
public static final int
RSS_ESTIMATE_TASK_CONCURRENCY_PER_SERVER_DEFAULT_VALUE = 80;
+ public static final String SPARK_SHUFFLE_COMPRESS = "shuffle.compress";
Review Comment:
This should not be necessary.