Hi everyone! I'm trying to get an on-premise GPU instance of Spark 3 running on my Ubuntu box, and I am following: https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html#example-join-operation
Does anyone have any insight into why my Spark job isn't running on the GPU? It appears to run entirely on the CPU. The Hadoop binary is installed and appears to be functioning fine:

export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)

Here is my setup on Ubuntu 20.10:

▶ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.39       Driver Version: 460.39       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    Off  | 00000000:21:00.0  On |                  N/A |
|  0%   38C    P8    19W / 370W |    478MiB / 24265MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

/opt/sparkRapidsPlugin
▶ ls
cudf-0.18.1-cuda11.jar  getGpusResources.sh  rapids-4-spark_2.12-0.4.1.jar

▶ scalac --version
Scala compiler version 2.13.0 -- Copyright 2002-2019, LAMP/EPFL and Lightbend, Inc.
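For anyone following along, the discovery script passed to Spark (getGpusResources.sh in my setup) just has to print a JSON resource object listing the GPU addresses. A minimal sketch of that idea, assuming the indices come from nvidia-smi on a real box; the format_addresses helper name is my own, not from the official script:

```shell
# Sketch of a GPU discovery script: Spark expects a single JSON line
# of the form {"name": "gpu", "addresses":["0","1",...]}.
# format_addresses is a hypothetical helper, not part of the official script.
format_addresses() {
  # Join newline-separated GPU indices into the "0","1" form
  # needed inside the JSON array.
  printf '%s' "$1" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/","/g'
}

# On a real box this would be:
#   INDICES=$(nvidia-smi --query-gpu=index --format=csv,noheader)
# Hard-coded here for illustration, matching my single-GPU setup.
INDICES="0"
ADDRS=$(format_addresses "$INDICES")
echo "{\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}"
```

If that script doesn't emit valid JSON when run by hand, Spark's GPU resource discovery will fail silently back to CPU-only scheduling, so it's worth executing it standalone first.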
▶ spark-shell --version
2021-04-09 17:05:36,158 WARN util.Utils: Your hostname, studio resolves to a loopback address: 127.0.1.1; using 192.168.0.221 instead (on interface wlp71s0)
2021-04-09 17:05:36,159 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/jars/spark-unsafe_2.12-3.1.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.1
      /_/

Using Scala version 2.12.10, OpenJDK 64-Bit Server VM, 11.0.10
Branch HEAD
Compiled by user ubuntu on 2021-02-22T01:04:02Z
Revision 1d550c4e90275ab418b9161925049239227f3dc9
Url https://github.com/apache/spark
Type --help for more information.
Here is how I am calling spark-shell, prior to adding the test job (note: I've added the line continuation after the extraClassPath conf, which was missing when I first pasted this):

$SPARK_HOME/bin/spark-shell \
  --master local \
  --num-executors 1 \
  --conf spark.executor.cores=16 \
  --conf spark.rapids.sql.concurrentGpuTasks=1 \
  --driver-memory 10g \
  --conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
  --conf spark.rapids.memory.pinnedPool.size=16G \
  --conf spark.locality.wait=0s \
  --conf spark.sql.files.maxPartitionBytes=512m \
  --conf spark.sql.shuffle.partitions=10 \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --files $SPARK_RAPIDS_DIR/getGpusResources.sh \
  --jars ${SPARK_CUDF_JAR},${SPARK_RAPIDS_PLUGIN_JAR}

The test job is from the example join operation:

val df = sc.makeRDD(1 to 10000000, 6).toDF
val df2 = sc.makeRDD(1 to 10000000, 6).toDF
df.select($"value" as "a").join(df2.select($"value" as "b"), $"a" === $"b").count

I just noticed that the Scala versions are out of sync (scalac 2.13.0 vs Spark's 2.12.10) - that shouldn't affect it, should it? Is there anything else I can try in --conf, or are there logs to see what might be failing behind the scenes? Any suggestions?

Thanks
Martin

--
M
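One diagnostic I've come across: the RAPIDS plugin can log which operators it will (and will not) place on the GPU via spark.rapids.sql.explain. A sketch of how that could be added to the invocation above (abbreviated to the relevant flags; ALL and NOT_ON_GPU are the values I believe the 0.4.x plugin documents, so treat this as an assumption to verify against the docs):

```shell
# Same invocation as above, trimmed to the plugin-relevant flags,
# plus explain logging so the plugin reports its GPU placement decisions.
$SPARK_HOME/bin/spark-shell \
  --master local \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.rapids.sql.explain=ALL \
  --jars ${SPARK_CUDF_JAR},${SPARK_RAPIDS_PLUGIN_JAR}

# Then, inside the shell, inspect the physical plan of the test join:
#   df.select($"value" as "a")
#     .join(df2.select($"value" as "b"), $"a" === $"b")
#     .explain()
# A GPU plan should contain Gpu-prefixed nodes (e.g. a GPU join/shuffle);
# a CPU-only plan shows the standard SortMergeJoin/BroadcastHashJoin instead.
```

If the plan shows no Gpu* nodes, the explain output (or the driver log) should say why each operator stayed on the CPU, which is usually more informative than guessing at --conf values.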