Noah-FetchRewards opened a new issue, #1411: URL: https://github.com/apache/datafusion-comet/issues/1411
### Describe the bug

I am trying to get a PySpark job to run with Apache Comet on an EKS cluster; after 20+ hours I have been unable to do so, for a variety of reasons.

I first tried to follow the benchmark example at https://github.com/apache/datafusion-comet/tree/main/benchmarks: I built the image locally, pushed it to ECR, and used spark-submit to run tpcbench.py (after updating the jar references to point to the latest version). However, the sample data referenced by `--conf spark.kubernetes.executor.volumes.hostPath.tpcdata.options.path=/mnt/bigdata/tpcds/sf100/` is obviously not in the image, and I gave up fairly quickly because I am unsure how to get the data loaded into the container itself.

I then tried to run one of the PySpark examples already installed in the base datafusion-comet image, specifically `local:///opt/spark/examples/src/main/python/pi.py`. However, I just see the message

```
25/02/16 17:42:12 INFO LoggingPodStatusWatcherImpl: Application status for spark-4a588861176e45218dc96d608820b902 (phase: Running)
```

repeat indefinitely, and the job never completes, so I cannot confirm whether PySpark jobs actually work on Apache Comet.
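For what it's worth, since the hostPath volume assumes the TPC-DS data already exists on every EKS worker node, what I would have expected to work instead is pointing the benchmark directly at S3 and dropping the volume mount. This is an untested sketch, not taken from the Comet docs: the bucket name, the script path inside the image, and the IRSA-based credentials provider are all my own assumptions.

```shell
# Untested sketch: read TPC-DS data from S3 instead of a hostPath mount, so
# nothing has to be pre-loaded onto the EKS worker nodes.
# "my-tpcds-bucket" and the tpcbench.py path are placeholders.
spark-submit \
  --master $SPARK_MASTER \
  --deploy-mode cluster \
  --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
  ... \
  local:///opt/spark/tpcbench.py \
  --data s3a://my-tpcds-bucket/sf100   # pass the path via whatever flag tpcbench.py actually expects
```

This assumes hadoop-aws and the AWS SDK are on the image's classpath; if they are not, that would be its own problem.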
Using the command:

```shell
spark-submit \
  --master $SPARK_MASTER \
  --deploy-mode cluster \
  --name comet-tpcbench \
  --driver-memory 8G \
  --conf spark.driver.memory=8G \
  --conf spark.executor.instances=1 \
  --conf spark.executor.memory=32G \
  --conf spark.executor.cores=8 \
  --conf spark.cores.max=8 \
  --conf spark.task.cpus=1 \
  --conf spark.executor.memoryOverhead=3G \
  --jars local://$COMET_JAR \
  --conf spark.executor.extraClassPath=$COMET_JAR \
  --conf spark.driver.extraClassPath=$COMET_JAR \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions \
  --conf spark.comet.enabled=true \
  --conf spark.comet.exec.enabled=true \
  --conf spark.comet.exec.all.enabled=true \
  --conf spark.comet.cast.allowIncompatible=true \
  --conf spark.comet.exec.shuffle.enabled=true \
  --conf spark.comet.exec.shuffle.mode=auto \
  --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
  --conf spark.kubernetes.namespace=default \
  --conf spark.kubernetes.driver.pod.name=tpcbench \
  --conf spark.kubernetes.container.image=$COMET_DOCKER_IMAGE \
  local:///opt/spark/examples/src/main/python/pi.py
```

I also tried the examples at https://datafusion.apache.org/comet/user-guide/kubernetes.html using the spark-operator (after fixing the outdated jar references again), but the job runs for a while without completing or producing logs. I am not particularly experienced with Spark on Kubernetes, but how exactly am I supposed to run a job to completion with Apache Comet at all? I even installed microk8s and tried to follow the examples there, to no avail. I would also appreciate links to useful reference material for getting started with this sort of work.

I'd like to note that I also spent two days trying to get Apache Ballista to work with its existing examples, only to run into a plethora of strange bugs there as well; I was hoping Comet would be a lot easier to be productive with.
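For reference, these are the standard Spark-on-Kubernetes debugging commands I have been using to try to see what is happening (plain kubectl, nothing Comet-specific). One guess on the hang: an executor requesting 32G memory plus 3G overhead may simply never be schedulable on the node group, which would leave the driver waiting in `phase: Running` forever.

```shell
# Standard Spark-on-Kubernetes debugging; pod name matches
# spark.kubernetes.driver.pod.name from the submit command above.
kubectl logs -f tpcbench                       # stream the driver logs
kubectl describe pod tpcbench                  # look for image-pull or scheduling failures
kubectl get pods -l spark-role=executor        # check whether executors were ever created
kubectl get events --sort-by=.lastTimestamp    # recent events, e.g. FailedScheduling for the 35G executor
```

If `kubectl get pods -l spark-role=executor` shows nothing (or pods stuck in Pending), the problem is resource scheduling rather than Comet itself.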
### Steps to reproduce

_No response_

### Expected behavior

_No response_

### Additional context

_No response_

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org