[GitHub] [spark] zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat to >= 2.0.0

GitBox Sun, 26 Jan 2020 19:45:18 -0800

zero323 commented on issue #27359: [SPARK-23435][SPARKR][TESTS] Update testthat
to >= 2.0.0
URL: https://github.com/apache/spark/pull/27359#issuecomment-578581851

> Please note that I'm supporting your effort on this PR. Otherwise, I'll
> not chime in here to add comments.

Thank you, I appreciate that.

In general, the full reproducible build is defined by the Dockerfile shown at
the beginning, but to put it here for reference:

```
FROM rocker/verse:3.4.3
# Install GnuPG (used to verify the commit signature) and a headless JDK 8 for the build.
RUN apt-get update \
    && apt-get install -y --no-install-recommends gpg openjdk-8-jdk-headless \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Import the committer's public key, then fetch only the head of the PR branch.
RUN wget -qO- https://keybase.io/zero323/pgp_keys.asc | gpg --import
RUN git clone --depth 1 --branch SPARK-23435 https://github.com/zero323/spark.git
WORKDIR spark
# Record the commit being built and verify its signature.
RUN git rev-parse HEAD
RUN git verify-commit -v HEAD
RUN build/mvn -DskipTests -Phive -Psparkr clean package
RUN R --version
RUN R -e "install.packages(c('e1071', 'praise'))"
# Install testthat from CRAN and print the installed version and session details.
RUN R -e "install.packages('testthat', repos='https://cloud.r-project.org/'); packageVersion('testthat'); sessionInfo()"
RUN R/create-rd.sh
RUN R/create-docs.sh
RUN R/check-cran.sh
RUN R/run-tests.sh
```
It can be re-run to confirm that it reflects the current state of things.
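
For anyone who wants to re-run it, here is a minimal sketch. It assumes the Dockerfile above is saved as `Dockerfile` in an otherwise empty directory and that Docker is installed. The image tag `spark-23435-check`, the `rebuild` helper and the `DRY_RUN` switch are hypothetical names used only for illustration. The important part is `--no-cache`: without it Docker may reuse the cached `git clone` layer and build an older head of the branch.

```shell
#!/bin/sh
# Hypothetical image tag; any name works.
IMAGE_TAG="${IMAGE_TAG:-spark-23435-check}"

# Rebuild the image from scratch.
# $1 is the build context directory containing the Dockerfile (default: .).
# With DRY_RUN=1 (the default) the command is only printed;
# set DRY_RUN=0 to actually invoke docker.
rebuild() {
  context_dir="${1:-.}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "docker build --no-cache -t $IMAGE_TAG $context_dir"
  else
    # --no-cache forces every RUN step (clone, signature check,
    # build, tests) to execute again instead of reusing old layers.
    docker build --no-cache -t "$IMAGE_TAG" "$context_dir"
  fi
}
```

Calling `rebuild .` prints the command, and `DRY_RUN=0 rebuild .` runs it. Because every step is a `RUN` instruction, a failing check or test fails the build itself.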

As shown in the cast, builds are done directly from the head of this branch
(the signature is verified), with no changes to the codebase beyond what is
proposed in this PR (and we don't touch any Arrow-related components here at all).

As for skipping Arrow tests: that's the default behavior defined in the
respective tests, for example here

https://github.com/apache/spark/blob/43d9c7e7e57749ee611e0c97781a71a0645b5e9b/R/pkg/tests/fulltests/test_sparkSQL_arrow.R#L25

and the following lines. So it is neither a failure nor the result of any
source modification.
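
To see which way this goes in a given environment, a rough sketch follows. It assumes the skip depends only on whether the R `arrow` package is installed (which is how testthat's `skip_if_not_installed` behaves). The helper name `arrow_test_status` and its messages are hypothetical.

```shell
#!/bin/sh
# Report whether Arrow-dependent tests would be expected to run,
# assuming they are skipped exactly when the R package "arrow" is missing.
arrow_test_status() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "unknown: Rscript not found"
  elif Rscript -e 'quit(status = if (requireNamespace("arrow", quietly = TRUE)) 0 else 1)' >/dev/null 2>&1; then
    # requireNamespace() returned TRUE, so the package can be loaded.
    echo "arrow installed: tests would run"
  else
    echo "arrow missing: tests would be skipped"
  fi
}
```

Inside the image built from the Dockerfile above, this should report that the package is missing, which matches the skipped tests.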

Can we make arrow tests run? Possibly, but:

- The R Arrow package is not present in the snapshot repositories used by rocker
images. Installing testthat from https://cloud.r-project.org already pushed
things a lot. Additionally, some transitive dependencies have hidden version
bounds.
- C++ Arrow bindings would require external system repositories, which can
break dependencies for R.
- Using other images (let's say the official R-base) is not an option, as we
need TeX as well as OpenSSL and Curl dev libraries, and this will either break
or require an update of R beyond 2.4 (at least it did for other build
configurations I considered).

At this point Spark has no coverage for any intermediate R version (Jenkins
runs 3.1, and then we have a gap worth almost eight years of releases until
3.6 on AppVeyor), not to mention version-OS combinations. That's troubling and,
as the work related to this PR has shown, can miss obvious errors, but it is
not something that can really be addressed by running ad-hoc tests outside the
project infrastructure.

Anyway... If you have specific concerns about the process used here, and you
suspect that the proposed changes can lead to problems in the future, I'll do
my best to address these.


