This is an automated email from the ASF dual-hosted git repository.

sunchao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion-comet.git


The following commit(s) were added to refs/heads/main by this push:
     new 9ab6c75  feat: Document the class path / classloader issue with the shuffle manager (#256)
9ab6c75 is described below

commit 9ab6c75f41456234f2fb93fcec15ff3cd435f49e
Author: Holden Karau <[email protected]>
AuthorDate: Sat Apr 13 09:16:34 2024 -0700

    feat: Document the class path / classloader issue with the shuffle manager (#256)
---
 README.md                                                        | 8 ++++++++
 .../apache/spark/shuffle/sort/CometShuffleExternalSorter.java    | 9 ++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3b903b1..121972c 100644
--- a/README.md
+++ b/README.md
@@ -127,6 +127,14 @@ Comet shuffle feature is disabled by default. To enable it, please add related c
 Above configs enable Comet native shuffle which only supports hash partiting and single partition.
 Comet native shuffle doesn't support complext types yet.
 
+Comet doesn't have official release yet so currently the only way to test it is to build jar and include it in your Spark application. Depending on your deployment mode you may also need to set the driver & executor class path(s) to explicitly contain Comet otherwise Spark may use a different class-loader for the Comet components than its internal components which will then fail at runtime. For example:
+
+```
+--driver-class-path spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar
+```
+
+Some cluster managers may require additional configuration, see https://spark.apache.org/docs/latest/cluster-overview.html
+
 To enable columnar shuffle which supports all partitioning and basic complex types, one more config is required:
 ```
 --conf spark.comet.columnar.shuffle.enabled=true
diff --git a/spark/src/main/java/org/apache/spark/shuffle/sort/CometShuffleExternalSorter.java b/spark/src/main/java/org/apache/spark/shuffle/sort/CometShuffleExternalSorter.java
index 9fe88ec..4417c4f 100644
--- a/spark/src/main/java/org/apache/spark/shuffle/sort/CometShuffleExternalSorter.java
+++ b/spark/src/main/java/org/apache/spark/shuffle/sort/CometShuffleExternalSorter.java
@@ -431,7 +431,14 @@ public final class CometShuffleExternalSorter implements CometShuffleChecksumSup
       // As we cannot access the address of the internal array in the sorter, so we need to
       // allocate the array manually and expand the pointer array in the sorter.
       // We don't want in-memory sorter to allocate memory but the initial size cannot be zero.
-      this.inMemSorter = new ShuffleInMemorySorter(allocator, 1, true);
+      try {
+        this.inMemSorter = new ShuffleInMemorySorter(allocator, 1, true);
+      } catch (java.lang.IllegalAccessError e) {
+        throw new java.lang.RuntimeException(
+            "Error loading in-memory sorter check class path -- see "
+                + "https://github.com/apache/arrow-datafusion-comet?tab=readme-ov-file#enable-comet-shuffle",
+            e);
+      }
       sorterArray = allocator.allocateArray(initialSize);
       this.inMemSorter.expandPointerArray(sorterArray);
 
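
As a rough illustration of the README note in this commit about setting both the driver and executor class paths, the same setting could be expressed in `spark-defaults.conf`. This is a sketch and not part of the commit: `spark.driver.extraClassPath` and `spark.executor.extraClassPath` are standard Spark properties, while `/opt/comet/` is a hypothetical directory assumed to hold the Comet jar on every node.

```
# Hypothetical location: assumes the Comet jar exists at this path
# on the driver host and on each executor host.
spark.driver.extraClassPath    /opt/comet/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar
spark.executor.extraClassPath  /opt/comet/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar
```

Whether both properties are needed depends on the deployment mode and cluster manager, as the README text above notes.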
