jonkeane commented on a change in pull request #10930:
URL: https://github.com/apache/arrow/pull/10930#discussion_r703060253
##########
File path: r/vignettes/developing.Rmd
##########
@@ -280,193 +258,140 @@ cmake \
..
```
</p>
-</details>
+</details>
-### Documentation
+## Installing a version of the R package with a specific git reference
-The documentation for the R package uses features of `roxygen2` that haven't
yet been released on CRAN, such as conditional inclusion of examples via the
`@examplesIf` tag. If you are making changes which require updating the
documentation, please install the development version of `roxygen2` from GitHub.
+If you need an arrow installation from a specific repository or git reference,
on most platforms except Windows, you can run:
```{r}
-remotes::install_github("r-lib/roxygen2")
-```
-
-## Troubleshooting
-
-Note that after any change to the C++ library, you must reinstall it and
-run `make clean` or `git clean -fdx .` to remove any cached object code
-in the `r/src/` directory before reinstalling the R package. This is
-only necessary if you make changes to the C++ library source; you do not
-need to manually purge object files if you are only editing R or C++
-code inside `r/`.
-
-### Arrow library-R package mismatches
-
-If the Arrow library and the R package have diverged, you will see errors like:
-
-```
-Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
- unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so':
- dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Symbol not found: __ZN5arrow2io16RandomAccessFile9ReadAsyncERKNS0_9IOContextExx
- Referenced from: /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so
- Expected in: flat namespace
- in /Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so
-Error: loading failed
-Execution halted
-ERROR: loading failed
+remotes::install_github("apache/arrow/r", build = FALSE)
```
-To resolve this, try rebuilding the Arrow library from [Building Arrow
above](#building-arrow).
+The `build = FALSE` argument is important so that the installation can access the
+C++ source in the `cpp/` directory in `apache/arrow`.
-### Multiple versions of Arrow library
+As with other installation methods, setting the environment variables
`LIBARROW_MINIMAL=false` and `ARROW_R_DEV=true` will provide a more
full-featured version of Arrow and provide more verbose output, respectively.
-If rebuilding the Arrow library doesn't work and you are [installing from a
user-level directory](#installing-to-another-directory) and you already have a
previous installation of libarrow in a system directory or you get you may get
errors like the following when you install the R package:
+For example, to install from the (fictional) branch `bugfix` from
`apache/arrow` you could run:
-```
-Error: package or namespace load failed for ‘arrow’ in dyn.load(file, DLLpath = DLLpath, ...):
- unable to load shared object '/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so':
- dlopen(/Library/Frameworks/R.framework/Versions/4.0/Resources/library/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: /usr/local/lib/libarrow.400.dylib
- Referenced from: /usr/local/lib/libparquet.400.dylib
- Reason: image not found
+```r
+Sys.setenv(LIBARROW_MINIMAL="false")
+remotes::install_github("apache/arrow/r@bugfix", build = FALSE)
```
-You need to make sure that you don't let R link to your system library when
building arrow. You can do this a number of different ways:
+Developers may wish to use this method of installing a specific commit
+separate from another Arrow development environment or system installation
+(e.g. we use this in [arrowbench](https://github.com/ursacomputing/arrowbench)
+to install development versions of libarrow isolated from the system install). If
+you already have libarrow installed system-wide, you may need to set
+some additional variables in order to isolate this build from your system libraries:
-* Setting the `MAKEFLAGS` environment variable to `"LDFLAGS="` (see below for
an example) this is the recommended way to accomplish this
-* Using {withr}'s `with_makevars(list(LDFLAGS = ""), ...)`
-* adding `LDFLAGS=` to your `~/.R/Makevars` file (the least recommended way,
though it is a common debugging approach suggested online)
+* Setting the environment variable `FORCE_BUNDLED_BUILD` to `true` will skip
the `pkg-config` search for libarrow and attempt to build from the same source
at the repository+ref given.
-```{bash, save=run & !sys_install & macos, hide=TRUE}
-# Setup troubleshooting section
-# install a system-level arrow on macOS
-brew install apache-arrow
+* You may also need to set the Makevars `CPPFLAGS` and `LDFLAGS` to `""` in
order to prevent the installation process from attempting to link to already
installed system versions of libarrow. One way to do this temporarily is
wrapping your `remotes::install_github()` call like so:
+```{r}
+withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""), remotes::install_github(...))
```
+# Common developer workflow tasks
-```{bash, save=run & !sys_install & ubuntu, hide=TRUE}
-# Setup troubleshooting section
-# install a system-level arrow on Ubuntu
-sudo apt update
-sudo apt install -y -V ca-certificates lsb-release wget
-wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
-sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
-sudo apt update
-sudo apt install -y -V libarrow-dev
-```
+The `arrow/r` directory contains a `Makefile` to help with some common tasks
from the command line (e.g. `make test`, `make doc`, `make clean`, etc.).
-```{bash, save=run & !sys_install & macos}
-MAKEFLAGS="LDFLAGS=" R CMD INSTALL .
-```
+## Loading arrow
+You can load the R package via `devtools::load_all()`.
-### `rpath` issues
+## Rebuilding the documentation
-If the package fails to install/load with an error like this:
+The R documentation uses the
[`@examplesIf`](https://roxygen2.r-lib.org/articles/rd.html#functions) tag
introduced in `roxygen2` version 7.1.1.9001, which hasn't yet been released on
CRAN at the time of writing. If you are making changes which require updating
the documentation, please install the development version of `roxygen2` from
GitHub.
-```
- ** testing if installed package can be loaded from temporary location
- Error: package or namespace load failed for 'arrow' in dyn.load(file, DLLpath = DLLpath, ...):
- unable to load shared object '/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so':
- dlopen(/Users/you/R/00LOCK-r/00new/arrow/libs/arrow.so, 6): Library not loaded: @rpath/libarrow.14.dylib
+```{r}
+remotes::install_github("r-lib/roxygen2")
```
-ensure that `-DARROW_INSTALL_NAME_RPATH=OFF` was passed (this is important on
-macOS to prevent problems at link time and is a no-op on other platforms).
-Alternatively, try setting the environment variable `R_LD_LIBRARY_PATH` to
-wherever Arrow C++ was put in `make install`, e.g. `export
-R_LD_LIBRARY_PATH=/usr/local/lib`, and retry installing the R package.
+You can use `devtools::document()` and `pkgdown::build_site()` to rebuild the
documentation and preview the results.
-When installing from source, if the R and C++ library versions do not
-match, installation may fail. If you’ve previously installed the
-libraries and want to upgrade the R package, you’ll need to update the
-Arrow C++ library first.
-
-For any other build/configuration challenges, see the [C++ developer
-guide](https://arrow.apache.org/docs/developers/cpp/building.html).
+```r
+# Update roxygen documentation
+devtools::document()
+# To preview the documentation website
+pkgdown::build_site(preview=TRUE)
+```
-## Using `remotes::install_github(...)`
+## Styling and linting
-If you need an Arrow installation from a specific repository or at a specific ref,
-`remotes::install_github("apache/arrow/r", build = FALSE)`
-should work on most platforms (with the notable exception of Windows).
-The `build = FALSE` argument is important so that the installation can access the
-C++ source in the `cpp/` directory in `apache/arrow`.
+### R code
-As with other installation methods, setting the environment variables
`LIBARROW_MINIMAL=false` and `ARROW_R_DEV=true` will provide a more
full-featured version of Arrow and provide more verbose output, respectively.
+The R code in the package follows [the tidyverse
style](https://style.tidyverse.org/). On PR submission (and on pushes) our CI
will run linting and will flag possible errors on the pull request with
annotations.
-For example, to install from the (fictional) branch `bugfix` from
`apache/arrow` one could:
+To run the [lintr](https://github.com/jimhester/lintr) locally, install the
lintr package (note, we currently use a fork that includes fixes not yet
accepted upstream, see how lintr is being installed in the file
`ci/docker/linux-apt-lint.dockerfile` for the current status) and then run
-```r
-Sys.setenv(LIBARROW_MINIMAL="false")
-remotes::install_github("apache/arrow/r@bugfix", build = FALSE)
+```{r}
+lintr::lint_package("arrow/r")
```
-Developers may wish to use this method of installing a specific commit
-separate from another Arrow development environment or system installation
-(e.g. we use this in [arrowbench](https://github.com/ursacomputing/arrowbench)
to install development versions of arrow isolated from the system install). If
you already have Arrow C++ libraries installed system-wide, you may need to set
some additional variables in order to isolate this build from your system
libraries:
-
-* Setting the environment variable `FORCE_BUNDLED_BUILD` to `true` will skip
the `pkg-config` search for Arrow libraries and attempt to build from the same
source at the repository+ref given.
-* You may also need to set the Makevars `CPPFLAGS` and `LDFLAGS` to `""` in
order to prevent the installation process from attempting to link to already
installed system versions of Arrow. One way to do this temporarily is wrapping
your `remotes::install_github()` call like so:
`withr::with_makevars(list(CPPFLAGS = "", LDFLAGS = ""),
remotes::install_github(...))`.
+You can automatically change the formatting of the code in the package using
the [styler](https://styler.r-lib.org/) package. There are two ways to do this:
-## What happens when you `R CMD INSTALL`?
+1. Use the comment bot to do this automatically with the command
`@github-actions autotune` on a PR, and commit it back to the branch.
-There are a number of scripts that are triggered when `R CMD INSTALL .`. For
Arrow users, these should all just work without configuration and pull in the
most complete pieces (e.g. official binaries that we host) so the installation
process is easy. However knowing about these scripts can help troubleshoot if
things go wrong in them or things go wrong in an install:
+2. Run the styler locally either via Makefile commands:
-* `configure` and `configure.win` These scripts are triggered during `R CMD
INSTALL .` on non-Windows and Windows platforms, respectively. They handle
finding the Arrow library, setting up the build variables necessary, and
writing the package Makevars file that is used to compile the C++ code in the R
package.
-* `tools/nixlibs.R` This script is sometimes called by `configure` on Linux
(or on any non-windows OS with the environment variable
`FORCE_BUNDLED_BUILD=true`). This sets up the build process for our bundled
builds (which is the default on linux). The operative logic is at the end of
the script, but it will do the following (and it will stop with the first one
that succeeds and some of the steps are only checked if they are enabled via an
environment variable):
- * Check if there is an already built libarrow in
`arrow/r/libarrow-{version}`, use that to link against if it exists.
- * Check if a binary is available from our hosted unofficial builds.
- * Download the Arrow source and build the Arrow Library from source.
- * `*** Proceed without C++` dependencies (this is an error and the package
will not work, but if you see this message you know the previous steps have not
succeeded/were not enabled)
-* `inst/build_arrow_static.sh` this script builds Arrow for a bundled, static
build. It is called by `tools/nixlibs.R` when the Arrow library is being built.
(If you're looking at this script, and you've gotten this far, it should look
_incredibly_ familiar: it's basically the contents of this guide in script form
— with a few important changes)
-
-## Styling and linting of the R code in the R package
-
-The R code in the package follows [the tidyverse
style](https://style.tidyverse.org/). On PR submission (and on pushes) our CI
will run linting and will flag possible errors on the pull request with
annotations.
-
-To run the [lintr](https://github.com/jimhester/lintr) locally, install the
lintr package (note, we currently use a fork that includes fixes not yet
accepted upstream, see how lintr is being installed in the file
`ci/docker/linux-apt-lint.dockerfile` for the current status) and then run
`lintr::lint_package("arrow/r")`.
+```bash
+make style # (for only the files changed)
+make style-all # (for all files)
+```
-One can automatically change the formatting of the code in the package using
the [styler](https://styler.r-lib.org/) package. There are two ways to do this:
+or in R:
-1. Use the comment bot to do this automatically with the command
`@github-actions autotune` on a PR and commit it back to the branch.
-2. Locally, with the command `make style` (for only the files changed), `make
style-all` (for all files), or use `styler::style_pkg(exclude_files =
c("tests/testthat/latin1.R", "data-raw/codegen.R"))` note the two excluded
files which should not be styled.
+```{r}
+# note the two excluded files which should not be styled
+styler::style_pkg(exclude_files = c("tests/testthat/latin1.R", "data-raw/codegen.R"))
Review comment:
```suggestion
styler::style_pkg(exclude_files = c("tests/testthat/latin1.R",
"data-raw/codegen.R"))
```
This is just to appease lintr, which is a bit silly.