This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 9d7373afaff [SPARK-42230][INFRA] Improve `lint` job by skipping PySpark and SparkR docs if unchanged
9d7373afaff is described below

commit 9d7373afaff378f7ffbfe4cf30d9f2f89ce65e53
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Sun Jan 29 22:15:52 2023 -0800

    [SPARK-42230][INFRA] Improve `lint` job by skipping PySpark and SparkR docs if unchanged
    
    ### What changes were proposed in this pull request?
    
    This PR aims to improve the GitHub Actions `lint` job by skipping `PySpark` and `SparkR` documentation generation when the PySpark and R modules are unchanged.
    
    ### Why are the changes needed?
    
    `Documentation Generation` took over 53 minutes because it always generates all Java/Scala/SQL/PySpark/R documentation. However, the `R` module changes infrequently, so its documentation is usually identical between runs. The `PySpark` module changes more frequently, but its documentation can still be skipped in many cases. This PR shows a reduction from `53` minutes to `18` minutes.
    
    **BEFORE**
    ![Screenshot 2023-01-29 at 4 36 07 PM](https://user-images.githubusercontent.com/9700541/215365573-bf83717b-cd9e-46e2-912f-5c9d2f359b08.png)
    
    **AFTER**
    ![Screenshot 2023-01-29 at 10 13 27 PM](https://user-images.githubusercontent.com/9700541/215401795-3f810e52-2fe3-44fd-99f4-b5750964c3b6.png)
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, this is an infra only change.
    
    ### How was this patch tested?
    
    Manual review.
    
    Closes #39792 from dongjoon-hyun/SPARK-42230.
    
    Authored-by: Dongjoon Hyun <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
    (cherry picked from commit 1d3c2681d26bf6034d15ee261e5395e9f45d67f8)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 .github/workflows/build_and_test.yml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 40798e099be..05bdce2a0c2 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -559,6 +559,12 @@ jobs:
       run: ./dev/test-dependencies.sh
     - name: Run documentation build
       run: |
+        if [ -f "./dev/is-changed.py" ]; then
+          # Skip PySpark and SparkR docs while keeping Scala/Java/SQL docs
+          pyspark_modules=`cd dev && python3.9 -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
+          if [ `./dev/is-changed.py -m $pyspark_modules` = false ]; then export SKIP_PYTHONDOC=1; fi
+          if [ `./dev/is-changed.py -m sparkr` = false ]; then export SKIP_RDOC=1; fi
+        fi
         cd docs
         bundle exec jekyll build
 

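The gating logic added to the `Run documentation build` step can be sketched in Python as follows. This is only an illustration, not code from the commit: `doc_skip_flags` and the sample module names are hypothetical, and the set of changed modules is passed in directly rather than computed the way `./dev/is-changed.py` computes it. It assumes that `-m` with a comma-separated list reports a change when any listed module changed.

```python
from typing import Dict, Iterable, Set


def doc_skip_flags(all_module_names: Iterable[str],
                   changed_modules: Set[str]) -> Dict[str, str]:
    """Return the environment variables the docs build would receive.

    Hypothetical helper: it mirrors the shell logic in the diff but is not
    part of Spark's dev scripts.
    """
    # Same filter as the inline python3.9 one-liner: every module whose
    # name starts with "pyspark" counts as part of PySpark.
    pyspark_modules = {n for n in all_module_names if n.startswith("pyspark")}

    flags: Dict[str, str] = {}
    # Skip the PySpark docs only when no PySpark module changed.
    if not (pyspark_modules & changed_modules):
        flags["SKIP_PYTHONDOC"] = "1"
    # Skip the SparkR docs only when the sparkr module is unchanged.
    if "sparkr" not in changed_modules:
        flags["SKIP_RDOC"] = "1"
    return flags


# Illustrative module names only; the real list comes from
# sparktestsupport.modules.
modules = ["core", "sql", "pyspark-core", "pyspark-sql", "sparkr"]
print(doc_skip_flags(modules, {"sql"}))          # both doc builds skipped
print(doc_skip_flags(modules, {"pyspark-sql"}))  # only SparkR docs skipped
```

Scala/Java/SQL documentation has no skip flag in this sketch, matching the comment in the diff that those docs are always kept.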

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
