zhengchenyu commented on code in PR #2405: URL: https://github.com/apache/uniffle/pull/2405#discussion_r2026417638
########## client-spark/common/src/main/java/org/apache/uniffle/shuffle/manager/RssShuffleManagerBase.java: ########## @@ -1538,6 +1546,48 @@ public boolean isValidTask(String taskId) { return !failedTaskIds.contains(taskId); } + protected <K> String encode(Ordering<K> obj) { + try { + ByteArrayOutputStream baos = new ByteArrayOutputStream(); + ObjectOutputStream oos = new ObjectOutputStream(baos); + oos.writeObject(obj); + oos.close(); + return "#" + Base64.getEncoder().encodeToString(baos.toByteArray()); + } catch (Exception e) { + throw new RssException(e); + } + } + + protected <K, V, C> MergeContext buildMergeContext(ShuffleDependency<K, V, C> dependency) { + if (remoteMergeEnable) { + MergeContext mergeContext = + MergeContext.newBuilder() + .setKeyClass(dependency.keyClassName()) + .setValueClass( + dependency.mapSideCombine() + ? dependency.combinerClassName().get() + : dependency.valueClassName()) + .setComparatorClass( + dependency + .keyOrdering() Review Comment: The current implementation cannot enable remote merge based on whether keyOrdering is empty, and can only be done through configuration. For spark rdd, there may be such a situation (not for spark sql), keyOrdering is empty, but combineclass is not empty, it will be sorted by hash and then combined in memory. This can be further improved, especially for spark sql. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@uniffle.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@uniffle.apache.org For additional commands, e-mail: issues-h...@uniffle.apache.org