On 27/01/2026 at 22:51, Duncan Murdoch wrote:
My first reaction was that you shouldn't use the Introduction document as a reference, you should be using the Language Definition or the man pages.

The Language Definition gives an example of adding two vectors, and describes the result there.  It doesn't talk about recycling rules for more complex expressions.

The man page `?Arithmetic` gives a more complete description, also in terms of binary operations, not complex expressions.

So I think things are behaving as designed, and the Introduction document describes it ambiguously, but not incorrectly strictly speaking, since it doesn't say exactly how the recycling will occur. Maybe this would be a clearer description:

"Vectors occurring in the same expression need not all be of the same length.  If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Recycling occurs in each binary operation:  the shorter vector is recycled as often as need be (perhaps fractionally) until it matches the length of the longer vector."
This would eliminate the ambiguity in the description but leave the "problem" as is. For me the problem is that the actual behavior is easy to program but difficult to use in practice. I mean that for a programmer it is much easier to conceive of a common alignment for all components of an expression, followed by a complex action on them, than to mentally parse an expression to follow which alignment occurs at which moment.

I understand that implementing a common alignment across a complex expression could lead to some severe backward compatibility issues, but a new option or something similar could be proposed to give users the opportunity to rely on a common alignment in complex expressions.
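
As a rough illustration of what such an opt-in could look like at user level (a sketch only: `recycle_all` is a hypothetical helper name, not an existing R function or option), every argument can be extended to the common maximum length with `rep_len()` before the expression is evaluated:

```r
# Hypothetical helper (not part of base R): recycle every argument to the
# length of the longest one before any arithmetic takes place.
recycle_all <- function(...) {
  args <- list(...)
  n <- max(lengths(args))
  lapply(args, rep_len, length.out = n)
}

a <- c(1, 2, 3, 4)
b <- c(10, 20, 30)
c <- c(100, 200, 300, 400, 500, 600, 700)

# All three vectors now have length 7, so no further recycling happens
# inside the expression.
v <- recycle_all(a = a, b = b, c = c)
common_result <- with(v, (a + b) / c)
print(common_result)
```

Note that `rep_len()` recycles fractionally without any warning, so a real implementation would have to decide whether to keep the "not a multiple" warning.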

Best,
Serguei.


Duncan Murdoch

On 2026-01-27 3:58 p.m., Poole, Geoffrey via R-devel wrote:
Synopsis:  In multistep expressions, e.g.:

fun <- function(a, b, c) (a + b) / c

`fun` returns an unexpected and non-intuitive result when:
  - a, b, and c are vectors
  - c is the longest vector
  - the lengths of a, b, and c are not even multiples of one another.

In this case, because of the way vectors are being recycled:

fun(a, b, c)

returns a different result from:

mapply(fun, a, b, c)

Description:

The R documentation in "An Introduction to R" Section 2.2 states:

   "Vectors occurring in the same expression need not all be of the same length.
   If they are not, the value of the expression is a vector with the same length
   as the longest vector which occurs in the expression. Shorter vectors in the
   expression are recycled as often as need be (perhaps fractionally) until they
   match the length of the longest vector."

Based on this documentation, I would expect that all vectors in an expression are recycled to match the length of the longest vector before element-wise operations are performed. However, R appears to perform recycling independently
at each operation, which produces different results than the documented
behavior would suggest.

Minimal reproducible example:

```r
# Simple function demonstrating the issue
f <- function(a, b, c) {
   (a + b) / c
}

# Vectors of different lengths (not multiples of each other)
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30)
c <- c(100, 200, 300, 400, 500, 600, 700)

# Direct call
direct_result <- f(a, b, c)

# mapply (recycles all inputs to length 7 first, then applies element-wise)
mapply_result <- mapply(f, a, b, c)

# Compare results
cat("Direct call result:\n")
print(direct_result)

cat("\nmapply result:\n")
print(mapply_result)

cat("\nResults are identical:", identical(direct_result, mapply_result), "\n")

sessionInfo()
```

Output:

f <- function(a, b, c) {
+   (a + b) / c
+ }

a <- c(1, 2, 3, 4)
b <- c(10, 20, 30)
c <- c(100, 200, 300, 400, 500, 600, 700)

direct_result <- f(a, b, c)
Warning messages:
1: In a + b :
   longer object length is not a multiple of shorter object length
2: In (a + b)/c :
   longer object length is not a multiple of shorter object length

mapply_result <- mapply(f, a, b, c)
Warning messages:
1: In mapply(f, a, b, c) :
   longer argument not a multiple of length of shorter
2: In mapply(f, a, b, c) :
   longer argument not a multiple of length of shorter

print(direct_result)
[1] 0.11000000 0.11000000 0.11000000 0.03500000 0.02200000 0.03666667 0.04714286

print(mapply_result)
[1] 0.11000000 0.11000000 0.11000000 0.03500000 0.04200000 0.05333333 0.01857143

cat("\nResults are identical:", identical(direct_result, mapply_result), "\n")
Results are identical: FALSE

sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 22.2

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/Denver
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_4.3.3 tools_4.3.3

Explanation of what is happening:

In the direct call, recycling occurs independently at each binary operation:

1. `a + b` is evaluated first: `b` (length 3) is recycled to the length of `a`
    (length 4), producing `[11, 22, 33, 14]`

2. The length-4 result is then divided by `c` (length 7): the length-4 result
    is recycled to length 7 as `[11, 22, 33, 14, 11, 22, 33]`, then divided by `c`

3. Final result: `[0.11, 0.11, 0.11, 0.035, 0.022, 0.0367, 0.0471]`

However, based on my reading of the documentation, I would expect all three vectors (`a`, `b`, and `c`) to be recycled to length 7 (the length of the longest vector in the expression) before any operations are performed, which
is what `mapply` does:

- `a` recycled to length 7: `[1, 2, 3, 4, 1, 2, 3]`
- `b` recycled to length 7: `[10, 20, 30, 10, 20, 30, 10]`
- `c` unchanged: `[100, 200, 300, 400, 500, 600, 700]`
- Then `(a + b) / c` computed element-wise: `[0.11, 0.11, 0.11, 0.035, 0.042, 0.0533, 0.0186]`

The key difference is at positions 5, 6, and 7. In the direct call, the
intermediate result `(a + b)` has length 4 and is recycled independently of
the original vectors when dividing by `c`.
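
The two strategies can be written out explicitly with `rep_len()`, which makes the point of divergence visible. This is only a sketch of my understanding of the two behaviors; the variable names `per_op` and `whole` are mine:

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20, 30)
c <- c(100, 200, 300, 400, 500, 600, 700)

# Per-operation recycling, as in the direct call:
# first b is stretched to length(a), then the length-4 sum to length(c).
ab <- a + rep_len(b, length(a))
per_op <- rep_len(ab, length(c)) / c

# Whole-expression recycling, as mapply() effectively does:
# every input is stretched to the longest length before any arithmetic.
n <- max(length(a), length(b), length(c))
whole <- (rep_len(a, n) + rep_len(b, n)) / rep_len(c, n)

# Positions where the two strategies disagree
print(which(per_op != whole))  # 5 6 7
```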

Question:

Does this example represent a bug in R's recycling behavior, or is the
documentation in Section 2.2 not intended to describe how recycling works in
expressions with multiple binary operations? If the current behavior is
intentional, could the documentation be clarified to explain that recycling occurs at each binary operation rather than globally across the expression?



______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Serguei Sokol
Ingenieur de recherche INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: [email protected]
https://www.toulouse-biotechnology-institute.fr/en/plateformes-plateaux/cellule-mathematiques/
