avantgardnerio commented on code in PR #249:
URL: https://github.com/apache/arrow-ballista/pull/249#discussion_r974868367


##########
benchmarks/tpch-gen.sh:
##########
@@ -21,14 +21,13 @@
 pushd ..
 . ./dev/build-set-env.sh
 popd
-docker build -t ballista-tpchgen:$BALLISTA_VERSION -f tpchgen.dockerfile .
 
 # Generate data into the ./data directory if it does not already exist
 FILE=./data/supplier.tbl
 if test -f "$FILE"; then
     echo "$FILE exists."
 else
   mkdir data 2>/dev/null
-  docker run -v `pwd`/data:/data -it --rm ballista-tpchgen:$BALLISTA_VERSION
+  docker run -v `pwd`/data:/data -it --rm ghcr.io/databloom-ai/tpch-docker:main -vf -s 1

Review Comment:
   Use an off-the-shelf data generator so the script runs faster. No need to reinvent this wheel.
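
   For illustration, here is a minimal sketch of the generation step after this change, wrapped in a function so it can be dry-run without Docker. The image name and the `-vf -s 1` arguments are copied from the diff. `DATA_DIR`, `TPCH_IMAGE`, `SCALE_FACTOR`, `DRY_RUN`, and `generate_tpch` are hypothetical names introduced here and are not part of the PR.

```shell
#!/bin/sh
# Sketch only: DATA_DIR, TPCH_IMAGE, SCALE_FACTOR, DRY_RUN and generate_tpch
# are hypothetical names for illustration; they do not appear in the PR.
set -eu

DATA_DIR="${DATA_DIR:-./data}"
TPCH_IMAGE="${TPCH_IMAGE:-ghcr.io/databloom-ai/tpch-docker:main}"
SCALE_FACTOR="${SCALE_FACTOR:-1}"

generate_tpch() {
  # supplier.tbl is the marker file the original script checks for.
  marker="$DATA_DIR/supplier.tbl"
  if test -f "$marker"; then
    echo "$marker exists."
    return 0
  fi

  # -p avoids the error that the original silences with 2>/dev/null.
  mkdir -p "$DATA_DIR"
  # Docker bind mounts need an absolute host path.
  abs_data="$(cd "$DATA_DIR" && pwd)"

  # Same flags as the diff: -vf -s <scale factor> are passed to the image.
  set -- docker run -v "$abs_data:/data" -it --rm "$TPCH_IMAGE" -vf -s "$SCALE_FACTOR"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

   Running `DRY_RUN=1 generate_tpch` prints the `docker run` command without pulling the image, which makes it easy to check the mount path before generating data.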



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
