I fully agree! General string interpolation opens a gaping security hole and raises all kinds of problems and design decisions. What I envision instead is something like this:

    f"hello {name}"

which gets parsed by R to this:

    (STRINTERPSXP (CHARSXP (PROMISE nil)))

Basically, a new type of R language construct that can still be processed by packages (for customized interpolation like in cli etc.), with a default eval which is basically paste0(). The benefit here would be that this is eagerly parsed and syntactically checked, and that the promise code could carry a srcref. And of course, you could pass an interpolated string expression lazily between frames without losing the environment etc. For more advanced applications, a low-level string interpolation expression constructor could be provided (one that could either parse a general string, at the user's risk, or build it directly from expressions).

— Taras

> On 7 Dec 2021, at 12:06, Simon Urbanek <simon.urba...@r-project.org> wrote:
>
>> On Dec 7, 2021, at 22:09, Taras Zakharko <taras.zakha...@uzh.ch> wrote:
>>
>> Great summary, Avi.
>>
>> String concatenation could be trivially added to R, but it probably should
>> not be. You will notice that modern languages tend not to use "+" to do
>> string concatenation (they either have a custom operator or a special kind
>> of pattern to do it) due to practical issues such an approach brings
>> (implicit type casting, lack of commutativity, performance etc.). These
>> issues will be felt even more so in R with its weak typing, idiosyncratic
>> casting behavior and NAs.
>>
>> As others have pointed out, any kind of behavior one wants from string
>> concatenation can be implemented by custom operators as needed. This is not
>> something that needs to be in base R. I would rather like the efforts to
>> be directed at improving string formatting (such as glue-style built-in
>> string interpolation).
>>
>
> This is getting OT, but there is a very good reason why string interpolation
> is not in core R.
> As I recall it has been considered some time ago, but it is
> very dangerous as it implies evaluation on constants, which opens a huge
> security hole and has questionable semantics (where you evaluate etc). Hence
> it's much easier to ban a package than to hack it out of R ;).
>
> Cheers,
> Simon
>
>> — Taras
>>
>>> On 7 Dec 2021, at 02:27, Avi Gross via R-devel <r-devel@r-project.org> wrote:
>>>
>>> After seeing what others are saying, it is clear that you need to carefully
>>> think things out before designing any implementation of a more native
>>> concatenation operator, whether it is called "+" or anything else. There may
>>> not be any ONE right solution, but unlike a function version like paste()
>>> there is nowhere to place any options that specify what you mean.
>>>
>>> You can obviously expand paste() to accept arguments like replace.NA="" or
>>> replace.NA="<NA>" and similar arguments on what to do if you see a NaN, an
>>> Inf or -Inf, a NULL or even an NA_character_ and so on. Heck, you might tell
>>> it to make other substitutions as in substitute=list(100=99, D=F) or any other
>>> nonsense you can come up with.
>>>
>>> But you have nowhere to put options when saying:
>>>
>>> c <- a + b
>>>
>>> Sure, you could set various global options before the addition and maybe
>>> reset them after, but that is not a way I like to go for something this
>>> basic.
>>>
>>> And enough such tinkering makes me wonder if it is easier to ask a user to
>>> use a slightly different function like this:
>>>
>>> paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na), list(...)))
>>>
>>> The above one-line function removes any NA from the argument list to make a
>>> potentially shorter list before calling the real paste() using it.
>>>
>>> Variations can, of course, be made that allow functionality as above.
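
To make the quoted paste.no.na idea a bit more concrete, one such variation might look like the sketch below. The name paste_na and its replace.NA argument are hypothetical (nothing like this exists in base R), and the sketch assumes every argument is a single value, which is also what paste.no.na effectively relies on.

```r
# Hypothetical helper, not part of base R: deal with NA arguments before
# handing everything to the real paste().
# Assumption: each argument in ... is a length-one value.
paste_na <- function(..., sep = " ", replace.NA = NULL) {
  args <- list(...)
  # TRUE for every argument that is NA
  is_missing <- vapply(args, function(a) is.na(a), logical(1))
  if (is.null(replace.NA)) {
    # Default: drop NA arguments, like paste.no.na
    args <- args[!is_missing]
  } else {
    # Otherwise substitute a visible marker such as "<NA>" or ""
    args[is_missing] <- replace.NA
  }
  do.call(paste, c(args, list(sep = sep)))
}

paste_na("a", NA, "b")                       # "a b"
paste_na("a", NA, "b", replace.NA = "<NA>")  # "a <NA> b"
```

This only illustrates the point made in the quoted message: a function has room for such options, while an infix operator like a + b does not.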
>>>
>>> If R was a true object-oriented language in the same sense as others like
>>> Python, operator overloading of "+" might be doable in more complex ways but
>>> we can only work with what we have. I tend to agree with others that in some
>>> places R is so lenient that all kinds of errors can happen because it makes
>>> a guess on how to correct it. Generally, if you really want to mix numeric
>>> and character, many languages require you to transform any arguments to make
>>> all of compatible types. The paste() function is clearly stated to coerce
>>> all arguments to be of type character for you. Whereas a+b makes no such
>>> promises and also is not properly defined even if a and b are both of type
>>> character. Sure, we can expand the language but it may still do things some
>>> find not to be quite what they wanted as in "2"+"3" becoming "23" rather
>>> than 5. Right now, I can use as.numeric("2")+as.numeric("3") and get the
>>> intended result after making very clear to anyone reading the code that I
>>> wanted strings converted to floating point before the addition.
>>>
>>> As has been pointed out, the plus operator if used to concatenate does not
>>> have a cognate for other operations like -*/ and R has used most other
>>> special symbols for other purposes. So, sure, we can use something like ....
>>> (4 periods) if it is not already being used for something but using + here
>>> is a tad confusing. Having said that, the makers of Python did make that
>>> choice.
>>>
>>> -----Original Message-----
>>> From: R-devel <r-devel-boun...@r-project.org> On Behalf Of Gabriel Becker
>>> Sent: Monday, December 6, 2021 7:21 PM
>>> To: Bill Dunlap <williamwdun...@gmail.com>
>>> Cc: Radford Neal <radf...@cs.toronto.edu>; r-devel <r-devel@r-project.org>
>>> Subject: Re: [Rd] string concatenation operator (revisited)
>>>
>>> As I recall, there was a large discussion related to that which resulted in
>>> the recycle0 argument being added (but defaulting to FALSE) for
>>> paste/paste0.
>>>
>>> I think a lot of these things ultimately mean that if there were to be a
>>> string concatenation operator, it probably shouldn't have behavior identical
>>> to paste0. Was that what you were getting at as well, Bill?
>>>
>>> ~G
>>>
>>> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap <williamwdun...@gmail.com> wrote:
>>>
>>>> Should paste0(character(0), c("a","b")) give character(0)?
>>>> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
>>>> but c(1,2)+NULL gives numeric(0).
>>>>
>>>> -Bill
>>>>
>>>> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>>>
>>>>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>>>>>> Gabe, I agree that missingness is important to factor in. To somewhat abuse
>>>>>> the terminology, NA is often used to represent missingness. Perhaps
>>>>>> concatenating character something with character something missing should
>>>>>> result in the original character?
>>>>>
>>>>> I think that's a bad idea. If you wanted to represent an empty
>>>>> string, you should use "" or NULL, not NA.
>>>>>
>>>>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it
>>>>> should give NA.
>>>>>
>>>>> Duncan Murdoch
>>>>>
>>>>>>
>>>>>> Avi
>>>>>>
>>>>>> On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker <gabembec...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> Seeing this and the other thread (and admittedly not having clicked through
>>>>>>> to the linked r-help thread), I wonder about NAs.
>>>>>>>
>>>>>>> Should NA <concat> "hi there" not result in NA_character_? This
>>>>>>> is not what any of the paste functions do, but in my opinion, NA + <non_na_value>
>>>>>>> seems like it should be NA (not "NA"), particularly if we are
>>>>>>> talking about `+` overloading, but potentially even in the case of
>>>>>>> a distinct concatenation operator?
>>>>>>>
>>>>>>> I guess what I'm saying is that in my head missingness propagation rules
>>>>>>> should take priority in such an operator (i.e. NA + <anything>
>>>>>>> should *always* be NA).
>>>>>>>
>>>>>>> Is that something others disagree with, or has it just not come up yet in
>>>>>>> (the parts I have read) of this discussion?
>>>>>>>
>>>>>>> Best,
>>>>>>> ~G
>>>>>>>
>>>>>>> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal <radf...@cs.toronto.edu> wrote:
>>>>>>>
>>>>>>>>>> In pqR (see pqR-project.org), I have implemented ! and !! as
>>>>>>>>>> binary string concatenation operators, equivalent to paste0 and
>>>>>>>>>> paste, respectively.
>>>>>>>>>>
>>>>>>>>>> For instance,
>>>>>>>>>>
>>>>>>>>>> > "hello" ! "world"
>>>>>>>>>> [1] "helloworld"
>>>>>>>>>> > "hello" !! "world"
>>>>>>>>>> [1] "hello world"
>>>>>>>>>> > "hello" !! 1:4
>>>>>>>>>> [1] "hello 1" "hello 2" "hello 3" "hello 4"
>>>>>>>>>
>>>>>>>>> I'm curious about the details:
>>>>>>>>>
>>>>>>>>> Would `1 ! 2` convert both to strings?
>>>>>>>>
>>>>>>>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12",
>>>>>>>> just like paste0(1,2) does.
>>>>>>>> Of course, they wouldn't have to be
>>>>>>>> exactly equivalent to paste0 and paste - one could impose
>>>>>>>> stricter requirements if that seemed better for error detection.
>>>>>>>> Off hand, though, I think automatically converting is more in
>>>>>>>> keeping with the rest of R. Explicitly converting with as.character
>>>>>>>> could be tedious.
>>>>>>>>
>>>>>>>> I suppose disallowing logical arguments might make sense to guard
>>>>>>>> against typos where ! was meant to be the unary-not operator, but
>>>>>>>> ended up being a binary operator, after some sort of typo. I
>>>>>>>> doubt that this would be a common error, though.
>>>>>>>>
>>>>>>>> (Note that there's no ambiguity when there are no typos, except
>>>>>>>> that when negation is involved a space may be needed - so, for
>>>>>>>> example, "x" ! !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".
>>>>>>>> Existing uses of double negation are still fine - eg, a <- !!TRUE
>>>>>>>> still sets a to TRUE.
>>>>>>>> Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE", not
>>>>>>>> "xTRUE".)
>>>>>>>>
>>>>>>>>> Where does the binary ! fit in the operator priority? E.g. how
>>>>>>>>> is
>>>>>>>>>
>>>>>>>>> a ! b > c
>>>>>>>>>
>>>>>>>>> parsed?
>>>>>>>>
>>>>>>>> As (a ! b) > c.
>>>>>>>>
>>>>>>>> Their precedence is between that of + and - and that of < and >.
>>>>>>>> So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.
>>>>>>>>
>>>>>>>> (Actually, pqR also has a .. operator that fixes the problems
>>>>>>>> with generating sequences with the : operator, and it has
>>>>>>>> precedence lower than + and - and higher than ! and !!, but
>>>>>>>> that's not relevant if you don't have the .. operator.)
>>>>>>>>
>>>>>>>> Radford Neal
>>>>>>>>
>>>>>>>> ______________________________________________
>>>>>>>> R-devel@r-project.org mailing list
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
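
For reference, the interpolated-string construct proposed at the top of this message can be roughly emulated in present-day R. This is only a sketch under stated assumptions: the names interp() and interp_eval() are made up for illustration, there is no STRINTERPSXP type in R, and the regex-based placeholder scan ignores nesting and escaping. It shows the three properties described above: placeholders are parsed (and syntax-checked) when the object is built, the calling environment travels with the object, and the default evaluation amounts to paste0().

```r
# Hypothetical emulation of an interpolated-string object; none of these
# names exist in base R.
interp <- function(template, env = parent.frame()) {
  force(env)  # capture the caller's environment right away
  # Find {...} placeholders (assumption: no nested or escaped braces)
  m <- gregexpr("\\{[^{}]*\\}", template)
  holes <- regmatches(template, m)[[1]]
  # Literal text around the placeholders: length(holes) + 1 pieces
  literals <- regmatches(template, m, invert = TRUE)[[1]]
  # Eager parse: a syntax error in a placeholder is raised here
  exprs <- lapply(holes, function(h) str2lang(substr(h, 2, nchar(h) - 1)))
  structure(list(literals = literals, exprs = exprs, env = env),
            class = "interp")
}

# Default evaluation: evaluate each expression in the captured environment
# and join the pieces with paste0().
interp_eval <- function(x) {
  out <- x$literals[1]
  for (i in seq_along(x$exprs)) {
    value <- eval(x$exprs[[i]], envir = x$env)
    out <- paste0(out, value, x$literals[i + 1])
  }
  out
}

# The object can leave the frame where it was created without losing `name`.
greet <- function() {
  name <- "world"
  interp("hello {name}")
}
s <- greet()
interp_eval(s)  # "hello world"
```

A package wanting customized interpolation would work on the literals and exprs fields directly instead of calling interp_eval(). A parser-level construct would additionally be able to attach srcrefs to the placeholder code, which this emulation cannot do.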