Hi Bill,

On 06/17/2015 12:36 PM, William Dunlap wrote:
If '+' and paste don't change their behavior with respect to
factors, but you encourage people to use '+' instead of paste,
then you will run into problems with data.frame columns, because
many people don't notice whether a character-like column is
character or factor.  With paste() this is not a problem, but with '+'
it is.  I think it is good not to make people worry about this much.
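
To make the factor point concrete, here is a minimal sketch. The data frame
and its column names (df, chr, fac) are made up for illustration; only
paste() and the current '+' are exercised.

```r
# Two columns that print alike but differ in type. stringsAsFactors is set
# explicitly because its default changed in R 4.0.0.
df <- data.frame(chr = c("a", "b"), stringsAsFactors = FALSE)
df$fac <- factor(c("a", "b"))

# paste() converts its arguments with as.character(), so both columns
# give the same result.
from_chr <- paste(df$chr, "x", sep = "")
from_fac <- paste(df$fac, "x", sep = "")

# The current '+' is not meaningful for factors: it returns NA with a
# warning (suppressed here so the sketch runs quietly).
plus_fac <- suppressWarnings(df$fac + 1)
```

So a string-concatenating '+' that leaves factors alone would give NA for
the fac column, where paste() gives the expected strings.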

As for the recycling issue, consider calls involving NULL arguments:
   > f <- function(n) paste0(n, " test", if (n != 1) "s", " failed")
   > f(1)
   [1] "1 test failed"
   > f(0)
   [1] "0 tests failed"
If paste0 followed the same recycling rules as "+" then f(1) would return
character(0).  There is a fair bit of code like that on CRAN.
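
The difference between the two recycling rules can be seen directly. This is
only a small sketch; the variable names are arbitrary.

```r
# paste0() treats a zero-length argument such as NULL like "" when other
# arguments have non-zero length, so the result keeps length 1.
with_null <- paste0("1", " test", NULL, " failed")

# Arithmetic operators return a zero-length result as soon as one operand
# has length zero.
plus_null <- 1 + NULL
```

Here with_null is "1 test failed", while plus_null is numeric(0).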

OTOH a very common use case is to use paste (or paste0) to add a given
prefix (or suffix) to a bunch of strings:

  paste0("ID", x)  # buggy! (won't do the right thing if length(x) is 0)

This is like "adding" something to 'x' so it's conceptually no different
from doing:

  x + 5

which does the right thing when 'x' is a numeric(0).
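
A hedged sketch of a workaround for the prefix case: add_prefix() is a
hypothetical helper, not an existing R function, and it simply guards the
zero-length case before calling paste0().

```r
# Hypothetical helper: prepend 'prefix' to each element of 'x', returning
# character(0) when 'x' is empty (the behavior of 'x + 5' for numeric(0)).
add_prefix <- function(prefix, x) {
  if (length(x) == 0L) {
    return(character(0))
  }
  paste0(prefix, x)
}

ids   <- add_prefix("ID", 1:3)            # c("ID1", "ID2", "ID3")
empty <- add_prefix("ID", character(0))   # character(0)
plain <- paste0("ID", character(0))       # "ID", the surprising case
```

As far as I know, newer versions of R (4.0.1 and later) also accept
paste0(..., recycle0 = TRUE) for the same purpose, but that did not exist
when this thread was written.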

Anyway, I don't think anybody suggested changing the recycling rules
of paste() or paste0() (which would of course break some existing code
that relies on them, but that's a very generic statement, right?), only
adopting the recycling rules of `+` and other binary arithmetic and
comparison operators if `+` were used to concatenate strings.
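
To illustrate what is being proposed, here is a rough sketch using a
user-defined operator. The name %+% is an assumption chosen for the example
(base R's '+' is left untouched), and the sketch does not reproduce the
warning that arithmetic gives when the longer length is not a multiple of
the shorter one.

```r
# Hypothetical string-concatenation operator with arithmetic-style recycling.
`%+%` <- function(e1, e2) {
  # Only plain character vectors are accepted; factors, numbers and
  # everything else raise an error.
  stopifnot(is.character(e1), is.character(e2))
  # Arithmetic-style rule: a zero-length operand gives a zero-length result.
  if (length(e1) == 0L || length(e2) == 0L) {
    return(character(0))
  }
  paste0(e1, e2)
}

joined <- "ID" %+% c("1", "2")        # c("ID1", "ID2")
none   <- "ID" %+% character(0)       # character(0)
mixed  <- try("a" %+% 1, silent = TRUE)  # an error, captured by try()
```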

Cheers,
H.


Consider using sprintf() to get the sort of recycling rules that "+" uses:
   > sprintf("%s is %d", c("One","Two"), numeric(0))
   character(0)
   > sprintf("%s is %d", c("One","Two"), 17)
   [1] "One is 17" "Two is 17"
   > sprintf("%s is %d", c("One","Two"), 26:27)
   [1] "One is 26" "Two is 27"



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Jun 17, 2015 at 9:56 AM, Gábor Csárdi <csardi.ga...@gmail.com> wrote:

On Wed, Jun 17, 2015 at 12:45 PM, William Dunlap <wdun...@tibco.com> wrote:
... adding the ability to concat strings with '+' would be a relatively
simple addition (no pun intended) to the code base I believe. With a lot
of other languages supporting this kind of concatenation, this is what
surprised me most when first learning R.

Wow!  R has a lot of surprising features and I would have thought
this would be quite a way down the list.

Well, it is hard to guess what users and people in general find
surprising. As '+' is used for string concatenation in essentially all
major scripting (and many other) languages, personally I am not
surprised that this is surprising for people. :)

How would this new '+' deal with factors, as paste does or as the
current '+' does?

The same as before. It would not change the behavior for other
classes, only basic characters.

Would number+string and string+number cause errors (as with the current
'+' in R and Python) or coerce both to strings (as with R's paste() and
Perl's '.' operator)?

Would cause errors, exactly as it does right now.

Having '+' work on all types of data can let improperly imported data
get further into the system before triggering an error.

Nobody is asking for this. Only characters, not all types of data.

I see lots of errors reported on this list that are due to read.table
interpreting text as character strings instead of the numbers that the
user expected.  Detecting that error as early as possible is good.

Isn't that a problem with read.table then? Detecting it there would be
the earliest possible, no?

Gabor
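
One way to detect the problem at import time, as a hedged sketch: the input
text and the column names (id, value) are invented for the example, and
declaring the types through the colClasses argument of read.table is just
one possible approach.

```r
# Made-up input in which the second data row holds text where a number
# was expected.
txt <- "id value\n1 3.5\n2 abc\n"

# Default behavior: the 'value' column silently becomes text.
loose <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
value_class <- class(loose$value)

# Declaring the expected types makes the import itself fail; the error
# message is captured here instead of stopping the script.
strict <- tryCatch(
  read.table(text = txt, header = TRUE,
             colClasses = c("integer", "numeric")),
  error = function(e) conditionMessage(e)
)
```

With the types declared, the bad value is reported by read.table itself
rather than by whatever operation touches the column later.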

[...]


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
