zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat 
to >= 2.0.0 
URL: https://github.com/apache/spark/pull/27359#issuecomment-578581851
 
 
   > Please note that I'm supporting your effort on this PR. Otherwise, I'll 
not chime in here to add comments.
   
   Thank you, I appreciate that. 
   
   In general, the full reproducible setup is defined by the Dockerfile shown at 
the beginning, but to put it here for reference:
   
   ```
   FROM rocker/verse:3.4.3
   RUN apt-get update \
       && apt-get install -y --no-install-recommends gpg openjdk-8-jdk-headless \
       && apt-get clean \
       && rm -rf /var/lib/apt/lists/*
   
   RUN wget -qO- https://keybase.io/zero323/pgp_keys.asc | gpg --import
   RUN git clone --depth 1 --branch SPARK-23435 https://github.com/zero323/spark.git
   WORKDIR spark
   RUN git rev-parse HEAD
   RUN git verify-commit -v HEAD
   RUN build/mvn -DskipTests -Phive -Psparkr clean package
   RUN R --version
   RUN R -e "install.packages(c('e1071', 'praise'))"
   RUN R -e "install.packages('testthat', repos='https://cloud.r-project.org/'); packageVersion('testthat'); sessionInfo()"
   RUN R/create-rd.sh
   RUN R/create-docs.sh
   RUN R/check-cran.sh
   RUN R/run-tests.sh
   ```
   It can be re-run to confirm that it reflects the current state of things.
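
For anyone who wants to re-run it, a minimal sketch of a rebuild helper follows. It assumes the Dockerfile above is saved in the current directory. The function name, the default tag and the `DRY_RUN` switch are hypothetical and only there for illustration. `--no-cache` and `--pull` force Docker to redo every step, so the clone, signature check and test run reflect the branch as it is now rather than a cached layer.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the image from scratch so that no cached layer
# hides a change in the branch or in the package repositories.
# The default tag and the DRY_RUN switch are illustrative assumptions.
rebuild_image() {
    tag="${1:-sparkr-testthat-check}"
    context="${2:-.}"
    # Assemble the command once so it can be either printed or executed.
    set -- docker build --no-cache --pull -t "$tag" "$context"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"   # only show what would be run
    else
        "$@"        # run the build for real
    fi
}
```

Calling `rebuild_image` with no arguments runs the full build; prefixing the call with `DRY_RUN=1` only prints the command.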
   
   As shown in the cast, builds are done directly from the head of this branch 
(the signature is verified), with no changes to the codebase beyond what is 
proposed in this PR (and we don't touch any Arrow-related components here at all).
   
   As for skipping Arrow tests: that's the default behavior defined in the 
respective tests, for example here
   
   
https://github.com/apache/spark/blob/43d9c7e7e57749ee611e0c97781a71a0645b5e9b/R/pkg/tests/fulltests/test_sparkSQL_arrow.R#L25
   
   and the following lines. So it is neither a failure nor the result of any 
source modification. 
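
To check which way this goes in a given environment, something like the sketch below could be used. It assumes the guard in the test file is a package-availability check (along the lines of testthat's `skip_if_not_installed("arrow")`) and that `Rscript` is on the `PATH`. Both function names are hypothetical.

```shell
#!/bin/sh
# Report whether an R package can be loaded by the R found on PATH.
# Exit status 0 means available; non-zero means the package is missing
# (or Rscript itself could not be run).
r_package_available() {
    pkg="$1"
    Rscript -e "quit(status = if (requireNamespace('${pkg}', quietly = TRUE)) 0 else 1)" >/dev/null 2>&1
}

# Describe what a test guarded by a package-availability check would do here.
describe_arrow_tests() {
    if r_package_available arrow; then
        echo "arrow is installed: Arrow tests would run"
    else
        echo "arrow is not installed: Arrow tests would be skipped"
    fi
}
```

Running `describe_arrow_tests` inside the image built from the Dockerfile above should report the skipped case, given that the arrow package is not installed there.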
   
   Can we make arrow tests run? Possibly, but:
   
   - The R Arrow package is not present in the snapshot repositories used by 
rocker images. Installing testthat from https://cloud.r-project.org already 
pushed things a lot. Additionally, some transitive dependencies have hidden 
version bounds.
   - C++ Arrow bindings would require external system repositories, which can 
break dependencies for R.
   - Using other images (let's say the official R-base) is not an option, as we 
need TeX as well as OpenSSL and Curl dev libraries, and this will either break 
or require an update of R beyond 2.4 (at least it did for other build 
configurations I considered).
   
   At this point Spark has no coverage for any intermediate R version (Jenkins 
runs 3.1, and then there is a gap of almost eight years' worth of releases up to 
3.6 on AppVeyor), not to mention version-OS combinations. That's troubling and, 
as the work related to this PR has shown, can miss obvious errors, but it is not 
something that can really be addressed by running ad-hoc tests outside the 
project infrastructure.
   
   Anyway... If you have specific concerns about the process used here, and you 
suspect that the proposed changes can lead to problems in the future, I'll do my 
best to address these.
   
   
   
   
