PHILO-HE commented on code in PR #5247:
URL: https://github.com/apache/incubator-gluten/pull/5247#discussion_r1549215779
##########
.github/workflows/velox_docker.yml:
##########
@@ -284,6 +284,52 @@ jobs:
# -d=OFFHEAP_SIZE:2g,spark.memory.offHeap.size=2g \
# -d=OFFHEAP_SIZE:1g,spark.memory.offHeap.size=1g || true
+ run-tpc-test-ubuntu-2204-celeborn:
+ needs: build-native-lib
+ strategy:
+ fail-fast: false
+ matrix:
+ spark: ["spark-3.2"]
+ celeborn: ["celeborn-0.4.0", "celeborn-0.3.2"]
+ runs-on: ubuntu-20.04
+ container: ubuntu:22.04
+ steps:
+ - uses: actions/checkout@v2
+ - name: Download All Artifacts
+ uses: actions/download-artifact@v2
+ with:
+ name: velox-native-lib-${{github.sha}}
+ path: ./cpp/build/releases
+ - name: Setup java and maven
+ run: |
+ apt-get update && apt-get install -y openjdk-8-jdk maven wget
+ - name: Build for Spark ${{ matrix.spark }}
+ run: |
+ cd $GITHUB_WORKSPACE/
+ mvn clean install -P${{ matrix.spark }} -Pbackends-velox,rss -DskipTests
+ - name: TPC-H SF-0.1 && TPC-DS SF-0.1 Parquet local spark3.2 with ${{ matrix.celeborn }}
+ run: |
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ EXTRA_PROFILE=""
+ if [ "${{ matrix.celeborn }}" = "celeborn-0.4.0" ]; then
+ EXTRA_PROFILE="-Pceleborn-0.4"
+ fi
+ echo "EXTRA_PROFILE: ${EXTRA_PROFILE}"
+ cd /opt && mkdir -p celeborn && \
+ wget https://archive.apache.org/dist/incubator/celeborn/${{ matrix.celeborn }}-incubating/apache-${{ matrix.celeborn }}-incubating-bin.tgz && \
+ tar xzf apache-${{ matrix.celeborn }}-incubating-bin.tgz -C /opt/celeborn --strip-components=1 && cd celeborn && \
+ mv ./conf/celeborn-env.sh.template ./conf/celeborn-env.sh && \
+ bash -c "echo -e 'CELEBORN_MASTER_MEMORY=4g\nCELEBORN_WORKER_MEMORY=4g\nCELEBORN_WORKER_OFFHEAP_MEMORY=8g' > ./conf/celeborn-env.sh" && \
+ bash -c "echo -e 'celeborn.worker.commitFiles.threads 128\nceleborn.worker.sortPartition.threads 64' > ./conf/celeborn-defaults.conf" && \
+ bash ./sbin/start-master.sh && bash ./sbin/start-worker.sh && \
+ cd $GITHUB_WORKSPACE/tools/gluten-it && mvn clean install -Pspark-3.2,rss ${EXTRA_PROFILE} && \
+ GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
+ --local --preset=velox-with-celeborn --benchmark-type=h --error-on-memleak --off-heap-size=10g -s=0.1 --threads=8 --iterations=1 && \
+ GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
+ --local --preset=velox-with-celeborn --benchmark-type=ds --error-on-memleak --off-heap-size=10g -s=0.1 --threads=8 --iterations=1 && \
+ bash /opt/celeborn/sbin/stop-worker.sh && \
+ bash /opt/celeborn/sbin/stop-master.sh && rm -rf /opt/celeborn
Review Comment:
@kerwin-zk, they are executed in different containers, so I believe there is
no conflict.
It's still strange that the stop commands take a long time to return even
though all queries have executed successfully. The issue persists even with
a smaller SF (0.1), and I cannot reproduce it in my local Docker container.
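
One way to narrow down the slow stop might be to time each stop script and cap how long it may run. The following is only a sketch, not part of this PR: `timed_stop` is a hypothetical helper name, the 60-second limit is an arbitrary assumption, and it relies on GNU coreutils `timeout` being available in the container.

```shell
# Hypothetical helper (not part of the PR): run a command under a time limit
# and report its exit code and elapsed wall-clock seconds.
# Assumes GNU coreutils `timeout`, which exits with 124 when the limit is hit.
timed_stop() {
  _ts_limit="$1"
  shift
  _ts_start=$(date +%s)
  _ts_rc=0
  timeout "$_ts_limit" "$@" || _ts_rc=$?
  _ts_end=$(date +%s)
  echo "'$*' exited with ${_ts_rc} after $((_ts_end - _ts_start))s"
  return "$_ts_rc"
}
```

It could be used in place of the plain stop calls, for example `timed_stop 60 bash /opt/celeborn/sbin/stop-worker.sh`, so that the CI log shows which script is slow and whether it ever returns within the limit.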
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]