PHILO-HE commented on code in PR #10537:
URL:
https://github.com/apache/incubator-gluten/pull/10537#discussion_r2306782838
##########
backends-clickhouse/src-celeborn/main/scala/org/apache/spark/shuffle/CHCelebornColumnarBatchSerlizerFactory.scala:
##########
Review Comment:
Please also fix the typo in the file name ("Serlizer" should be "Serializer").
##########
gluten-celeborn/src/main/java/org/apache/spark/shuffle/gluten/celeborn/CelebornShuffleManager.java:
##########
@@ -95,21 +95,19 @@ public class CelebornShuffleManager
celebornColumnarBatchSerializerFactoriesLoader.iterator(),
CelebornColumnarBatchSerializerFactory.class))
.collect(Collectors.toList());
- // for now, we ignore check since CH backend has not support this feature yet.
- // Preconditions.checkState(
- // !columnarBatchSerializerFactoryList.isEmpty(),
- // "No factory found for Celeborn columnar batch serializer");
- final Map<String, CelebornColumnarBatchSerializerFactory> columanrBatchSerilizerFactoryMap =
+
+ Preconditions.checkState(
+ !columnarBatchSerializerFactoryList.isEmpty(),
+ "No factory found for Celeborn columnar batch serializer");
+ final Map<String, CelebornColumnarBatchSerializerFactory> columnarBatchSerilizerFactoryMap =
columnarBatchSerializerFactoryList.stream()
.collect(Collectors.toMap(CelebornColumnarBatchSerializerFactory::backendName,
f -> f));
- // for now, we ignore check since CH backend has not support this feature yet.
- // if (!columanrBatchSerilizerFactoryMap.containsKey(backendName)) {
- // throw new UnsupportedOperationException(
- // "No Celeborn columnar batch serializer writer factory found for backend " +
- // backendName);
- // }
- columnarBatchSerializerFactory = columanrBatchSerilizerFactoryMap.get(backendName);
+ if (!columnarBatchSerilizerFactoryMap.containsKey(backendName)) {
Review Comment:
This is not directly related to this PR's goal.
I think the map may be unnecessary. As we know, the implementations of
ColumnarBatchSerializerFactory are mutually exclusive: Velox's factory can
only be loaded by the Velox backend, and CH's factory can only be loaded by
the CH backend. So it seems we can simply check whether a concrete factory
was loaded. There is no need to store the backend name in the factory, or to
compare the factory's backend name against the current backend.
This is just a preliminary thought. Am I missing something?
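To make the idea concrete, here is a rough sketch of what the map-free check might look like. It is not taken from the Gluten code base: SingleFactorySketch, SerializerFactory and requireSingle are hypothetical names, and the extra "more than one factory" check is an assumption that follows from the exclusivity described above. In the real class the iterable would presumably come from the existing ServiceLoader.

```java
import java.util.Collections;
import java.util.Iterator;

public class SingleFactorySketch {
  /** Hypothetical stand-in for the factory interface, with no backendName() method. */
  interface SerializerFactory {}

  /**
   * Returns the only loaded factory.
   * Fails fast when nothing was loaded, or when more than one factory is present
   * (the latter check is an assumption, based on the factories being exclusive).
   */
  static <T> T requireSingle(Iterable<T> loaded, String what) {
    Iterator<T> it = loaded.iterator();
    if (!it.hasNext()) {
      throw new IllegalStateException("No factory found for " + what);
    }
    T first = it.next();
    if (it.hasNext()) {
      throw new IllegalStateException("Multiple factories found for " + what);
    }
    return first;
  }

  public static void main(String[] args) {
    // In real code this list would be what the ServiceLoader discovered on the classpath.
    SerializerFactory velox = new SerializerFactory() {};
    SerializerFactory chosen =
        requireSingle(
            Collections.singletonList(velox), "Celeborn columnar batch serializer");
    System.out.println(chosen == velox);
  }
}
```

With something like this, neither the map nor the backendName lookup is needed, as long as the exclusivity assumption really holds for every deployment.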
##########
gluten-celeborn/src/main/java/org/apache/spark/shuffle/gluten/celeborn/CelebornShuffleManager.java:
##########
@@ -95,21 +95,19 @@ public class CelebornShuffleManager
celebornColumnarBatchSerializerFactoriesLoader.iterator(),
CelebornColumnarBatchSerializerFactory.class))
.collect(Collectors.toList());
- // for now, we ignore check since CH backend has not support this feature yet.
- // Preconditions.checkState(
- // !columnarBatchSerializerFactoryList.isEmpty(),
- // "No factory found for Celeborn columnar batch serializer");
- final Map<String, CelebornColumnarBatchSerializerFactory> columanrBatchSerilizerFactoryMap =
+
+ Preconditions.checkState(
+ !columnarBatchSerializerFactoryList.isEmpty(),
+ "No factory found for Celeborn columnar batch serializer");
+ final Map<String, CelebornColumnarBatchSerializerFactory> columnarBatchSerilizerFactoryMap =
Review Comment:
Typo: "Serilizer" should be "Serializer".
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]