paleolimbot commented on issue #33094:
URL: https://github.com/apache/arrow/issues/33094#issuecomment-1387452294
I've done some bisecting of the tests in the pursuit of a minimal reproducer
here. Since it appears that the docker image used in the nightly test is the
only way to reproduce this, I looked up some image details and what gets run.
- The image is defined in arrow/docker-compose.yml ("ubuntu-r-valgrind")
- It's based on Winston Chang's r-debug image
- The image might have an Ubuntu version mismatch: some of the options in
the compose file suggest it's an 18.04 image, but I'm pretty sure it's
22.04 that's running.
- The script that runs is in ci/scripts/r_valgrind.sh. It basically runs
r/tests/testthat.R with R -d valgrind.
```
ubuntu-r-valgrind:
  # Only 18.04 and amd64 supported
  # Usage:
  #   docker-compose build ubuntu-r-valgrind
  #   docker-compose run ubuntu-r-valgrind
  image: ${REPO}:amd64-ubuntu-18.04-r-valgrind
  build:
    context: .
    dockerfile: ci/docker/linux-r.dockerfile
    cache_from:
      - ${REPO}:amd64-ubuntu-18.04-r-valgrind
    args:
      base: wch1/r-debug:latest
      r_bin: RDvalgrind
      tz: ${TZ}
  environment:
    <<: [*ccache, *sccache]
    ARROW_R_DEV: ${ARROW_R_DEV}
    # AVX512 not supported by Valgrind (similar to ARROW-9851) some runners support AVX512 and some do not
    # so some build might pass without this setting, but we want to ensure that we stay to AVX2 regardless of runner.
    EXTRA_CMAKE_FLAGS: "-DARROW_RUNTIME_SIMD_LEVEL=AVX2"
    ARROW_SOURCE_HOME: "/arrow"
  volumes: *ubuntu-volumes
  command: >
    /bin/bash -c "
      /arrow/ci/scripts/r_valgrind.sh /arrow"
```
To find a test file with a leak, I modified `r/tests/testthat.R` with a
filter to run only specific tests:
```r
# Tried:
# filter = "^Array" (no leaks)
# filter = "^dataset" (no leaks)
# filter = "^dplyr" (leaks!)
# filter = "^dplyr-[g-u]" (leaks!)
# filter = "^dplyr-[s-u]" (leaks!)
# filter = "^dplyr-summarize" (leaks!)
test_check("arrow", reporter = arrow_reporter, filter = "^dplyr-summarize")
```
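The bisection above works because `filter` is a regular expression matched
against test file names. As far as I understand testthat, the match happens
after the `test-` prefix and `.R` suffix are stripped. Here is a rough shell
sketch for previewing which files a candidate filter would select. The file
names and the `select_tests` helper are hypothetical stand-ins, not a listing
of the real test directory, and `grep -E` only approximates R's regex engine:

```shell
#!/usr/bin/env bash
# Hypothetical test file names; the real ones live under r/tests/testthat/.
test_files="test-Array.R
test-dataset.R
test-dplyr-filter.R
test-dplyr-summarize.R
test-dplyr-union.R"

# Print the files that a testthat-style filter would select.
# Assumption: the regex is applied to the file name with the "test-"
# prefix and ".R" suffix removed.
select_tests() {
  local pattern="$1"
  printf '%s\n' "$test_files" |
    sed -E 's/^test-//; s/\.R$//' |   # strip prefix and suffix before matching
    grep -E -- "$pattern" |           # keep only names matching the filter
    sed -E 's/^/test-/; s/$/.R/'      # restore the full file name for display
}

select_tests '^dplyr-[s-u]'
```

Narrowing the character class one step at a time (for example `[g-u]`, then
`[s-u]`, then a full name) halves the candidate set on each valgrind run.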
Next, I'll see if I can isolate one test in the summarize tests that leaks
consistently.