andygrove edited a comment on pull request #263:
URL: https://github.com/apache/arrow-datafusion/pull/263#issuecomment-841823709


   For the record, I am now able to run benchmarks against the 100GB data set 
on my Pi Kubernetes cluster:
   
   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: tpch
     namespace: default
   spec:
     containers:
       - image: andygrove/ballista-arm64
         command: [ "/tpch",
                    "benchmark",
                    "--query=1",
                    "--path=/mnt/tpch/parquet-sf100-partitioned/",
                    "--format=parquet",
                    "--concurrency=24",
                    "--iterations=1",
                    "--debug",
                    "--host=ballista-scheduler",
                    "--port=50050"]
         imagePullPolicy: Always
         name: tpch
         volumeMounts:
             - mountPath: /mnt/tpch/parquet-sf100-partitioned/
               name: data
     restartPolicy: Never
     volumes:
       - name: data
         persistentVolumeClaim:
           claimName: data-pv-claim
   ```
   
   ```
   $ microk8s.kubectl get pods
   NAME                   READY   STATUS      RESTARTS   AGE
   ballista-scheduler-0   1/1     Running     1          17h
   ballista-executor-5    1/1     Running     1          16h
   ballista-executor-2    1/1     Running     1          16h
   ballista-executor-3    1/1     Running     1          16h
   ballista-executor-0    1/1     Running     5          17h
   ballista-executor-4    1/1     Running     2          16h
   ballista-executor-1    1/1     Running     1          17h
   tpch                   0/1     Completed   0          13m
   ```
   
   ## Results
   
   ```
   +--------------+--------------+------------+--------------------+--------------------+-------------------+--------------------+--------------------+----------------------+-------------+
   | l_returnflag | l_linestatus | sum_qty    | sum_base_price     | sum_disc_price     | sum_charge        | avg_qty            | avg_price          | avg_disc             | count_order |
   +--------------+--------------+------------+--------------------+--------------------+-------------------+--------------------+--------------------+----------------------+-------------+
   | A            | F            | 3775127758 | 5660776097194.455  | 5377736398183.935  | 5592847429515.929 | 25.499370423275426 | 38236.11698430493  | 0.050002243530928955 | 148047881   |
   | N            | F            | 98553062   | 147771098385.98013 | 140384965965.03497 | 145999793032.7757 | 25.501556956882876 | 38237.19938880454  | 0.049985284338054006 | 3864590     |
   | N            | O            | 7436302959 | 11150725648169.863 | 10593195276359.283 | 11016932215670.58 | 25.500009433521782 | 38237.227663621445 | 0.0499979183499098   | 291619616   |
   | R            | F            | 3775724970 | 5661603032745.348  | 5378513563915.405  | 5593662252666.916 | 25.50006628406532  | 38236.697258453016 | 0.05000130433965413  | 148067261   |
   +--------------+--------------+------------+--------------------+--------------------+-------------------+--------------------+--------------------+----------------------+-------------+
   Query 1 avg time: 104835.61 ms
   ```
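
As a quick sanity check on the numbers above, the averages in each row should equal the corresponding sum divided by `count_order`. The sketch below is not part of the benchmark tooling. The names `ROWS`, `check_row`, and `rows_per_second` are made up for illustration, and the values are copied by hand from the table. It also derives a rough throughput figure from the reported average time, assuming that the sum of `count_order` approximates the number of rows aggregated by the query.

```python
import math

# Values copied by hand from the Query 1 result table above:
# (l_returnflag, l_linestatus, sum_qty, sum_base_price, avg_qty, avg_price, count_order)
ROWS = [
    ("A", "F", 3775127758, 5660776097194.455, 25.499370423275426, 38236.11698430493, 148047881),
    ("N", "F", 98553062, 147771098385.98013, 25.501556956882876, 38237.19938880454, 3864590),
    ("N", "O", 7436302959, 11150725648169.863, 25.500009433521782, 38237.227663621445, 291619616),
    ("R", "F", 3775724970, 5661603032745.348, 25.50006628406532, 38236.697258453016, 148067261),
]

# Average query time reported by the benchmark run, in milliseconds.
ELAPSED_MS = 104835.61


def check_row(row, rel_tol=1e-6):
    """Return True if avg_qty and avg_price match sum / count within rel_tol."""
    _flag, _status, sum_qty, sum_base_price, avg_qty, avg_price, count = row
    qty_ok = math.isclose(sum_qty / count, avg_qty, rel_tol=rel_tol)
    price_ok = math.isclose(sum_base_price / count, avg_price, rel_tol=rel_tol)
    return qty_ok and price_ok


def rows_per_second(rows, elapsed_ms):
    """Rough throughput: total count_order across groups divided by elapsed seconds."""
    total_rows = sum(row[-1] for row in rows)
    return total_rows / (elapsed_ms / 1000.0)


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], row[1], "consistent" if check_row(row) else "MISMATCH")
    print(f"approx. rows/s: {rows_per_second(ROWS, ELAPSED_MS):,.0f}")
```

The throughput figure only counts rows that survive the query's date filter, so it slightly understates the number of rows actually scanned.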


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
