avantgardnerio commented on code in PR #249:
URL: https://github.com/apache/arrow-ballista/pull/249#discussion_r974868367
##########
benchmarks/tpch-gen.sh:
##########
@@ -21,14 +21,13 @@
pushd ..
. ./dev/build-set-env.sh
popd
-docker build -t ballista-tpchgen:$BALLISTA_VERSION -f tpchgen.dockerfile .
# Generate data into the ./data directory if it does not already exist
FILE=./data/supplier.tbl
if test -f "$FILE"; then
echo "$FILE exists."
else
mkdir data 2>/dev/null
- docker run -v `pwd`/data:/data -it --rm ballista-tpchgen:$BALLISTA_VERSION
+ docker run -v `pwd`/data:/data -it --rm ghcr.io/databloom-ai/tpch-docker:main -vf -s 1
Review Comment:
Use an off-the-shelf data generator image instead of building our own, which
is faster. There is no need to reinvent this wheel.
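
For reference, the data-generation step after this change can be sketched as a
small shell function. This is only an illustration under stated assumptions:
the function name `generate_tpch_data` and the `DOCKER` override variable are
hypothetical and not part of the PR; the image name and the `-vf -s 1`
arguments are copied unchanged from the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): wraps the data-generation step.
# DOCKER is an assumed override so the command can be dry-run, e.g. DOCKER=echo.
generate_tpch_data() {
  local docker_cmd="${DOCKER:-docker}"
  local file=./data/supplier.tbl

  # Skip generation when the data is already present, as the script does.
  if test -f "$file"; then
    echo "$file exists."
  else
    mkdir -p data
    # Run the prebuilt image; arguments are passed through as in the diff.
    "$docker_cmd" run -v "$(pwd)/data:/data" -it --rm \
      ghcr.io/databloom-ai/tpch-docker:main -vf -s 1
  fi
}
```

Calling `generate_tpch_data` from the benchmarks directory would pull and run
the prebuilt image; `DOCKER=echo generate_tpch_data` only prints the command
that would be run.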