HyukjinKwon commented on a change in pull request #32015:
URL: https://github.com/apache/spark/pull/32015#discussion_r606200789



##########
File path: .github/workflows/benchmark.yml
##########
@@ -0,0 +1,99 @@
+name: Run benchmarks
+
+on:
+  workflow_dispatch:
+    inputs:
+      class:
+        description: 'Benchmark class'
+        required: true
+        default: '*'
+      jdk:
+        description: 'JDK version: 8 or 11'
+        required: true
+        default: '8'
+      failfast:
+        description: 'Failfast: true or false'
+        required: true
+        default: 'true'
+      num-splits:
+        description: 'Number of job splits'
+        required: true
+        default: '1'

Review comment:
       I had to add this parameter because GitHub Actions limits each job to 6 
hours (a workflow run can take up to 72 hours), and running the benchmarks 
sequentially takes up to 50 hours. With this parameter the benchmarks run in 
parallel across several jobs, so I think it's okay, although it might expose 
too many parameters to control.
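
As a rough check on how many splits that implies, here is a small sketch of the arithmetic. It assumes the benchmarks divide evenly across jobs, which a hash-based split does not guarantee, so the result is only a lower bound; the function name `min_splits` is hypothetical and not part of this PR.

```python
import math


def min_splits(total_hours: float, job_limit_hours: float) -> int:
    """Smallest number of equal-sized jobs that each fit under the job limit."""
    return math.ceil(total_hours / job_limit_hours)


# 50 hours of sequential benchmarks against the 6-hour per-job limit.
print(min_splits(50, 6))  # 9
```

Because real splits are uneven, a larger `num-splits` value leaves headroom: with 20 splits the average job would take about 2.5 hours under the same assumption.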
   
   For example, I am now running all benchmarks in 20 splits (with JDK 11) 
[here](https://github.com/HyukjinKwon/spark/actions/runs/711543792):
   
   ![Screen Shot 2021-04-02 at 8 42 31 PM](https://user-images.githubusercontent.com/6477701/113412739-04fe2580-93f4-11eb-9147-2d602b0ee987.png)
   
   which results in 20 jobs that run the benchmarks in parallel (the benchmarks are hashed into 20 splits):
   
   ![Screen Shot 2021-04-02 at 8 42 43 PM](https://user-images.githubusercontent.com/6477701/113412744-0891ac80-93f4-11eb-8973-31248ecd6f2d.png)
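
The "hashed by 20" idea can be sketched roughly as below. This is a minimal illustration, not the PR's actual implementation: it assumes each benchmark class name is hashed and reduced modulo the number of splits, and the function names and class names are hypothetical. `zlib.crc32` is used only because it is stable across processes, unlike Python's built-in `hash` for strings.

```python
import zlib


def split_index(class_name: str, num_splits: int) -> int:
    """Map a class name to a split in [0, num_splits) with a stable hash."""
    return zlib.crc32(class_name.encode("utf-8")) % num_splits


def assign_splits(class_names, num_splits):
    """Group class names by split index; every name lands in exactly one group."""
    groups = {i: [] for i in range(num_splits)}
    for name in class_names:
        groups[split_index(name, num_splits)].append(name)
    return groups


# Hypothetical benchmark class names, for illustration only.
names = [f"org.example.Benchmark{i}" for i in range(100)]
groups = assign_splits(names, 20)
for index in sorted(groups):
    print(index, len(groups[index]))
```

Each job would then run only the group matching its own index, so the jobs need no coordination beyond agreeing on `num_splits`. The group sizes are generally uneven, which is one reason to pick more splits than the bare minimum.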
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
