I fully agree! General string interpolation opens a gaping security hole and is 
accompanied by all kinds of problems and decisions. What I envision instead is 
something like this:

   f"hello {name}" 

Which gets parsed by R to this:

   (STRINTERPSXP (CHARSXP (PROMISE nil)))

Basically, a new type of R language construct that still can be processed by 
packages (for customized interpolation like in cli etc.), with a default eval 
which is basically paste0(). The benefit here would be that this is eagerly 
parsed and syntactically checked, and that the promise code could carry a 
srcref. And of course, that you could pass an interpolated string expression 
lazily between frames without losing the environment etc… For more advanced 
applications, a low level string interpolation expression constructor could be 
provided (that could either parse a general string — at the user’s risk, or 
build it directly from expressions). 
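
To make the idea a bit more concrete, here is a rough user-level sketch in 
plain R. It only approximates the proposal: there is no f"..." syntax or 
STRINTERPSXP type in R, and the names strinterp() and format.strinterp() are 
hypothetical, made up for this illustration. The template is parsed eagerly, 
the environment is captured, and the default evaluation is essentially 
paste0().

```r
# Hypothetical sketch: strinterp() and its class are NOT part of base R.
# Build an interpolation object from a template, parsing each {expr} eagerly
# and remembering the environment in which it should later be evaluated.
strinterp <- function(template, env = parent.frame()) {
  stopifnot(is.character(template), length(template) == 1L)
  # Locate the {...} placeholders (no nesting or escaping in this sketch).
  m <- gregexpr("\\{[^{}]*\\}", template)
  holes <- regmatches(template, m)[[1L]]
  # Literal text around the placeholders: always length(holes) + 1 pieces.
  literals <- regmatches(template, m, invert = TRUE)[[1L]]
  # Eager parsing: a syntax error surfaces here, not at evaluation time.
  exprs <- lapply(holes, function(h) {
    parse(text = substr(h, 2L, nchar(h) - 1L), keep.source = FALSE)[[1L]]
  })
  structure(list(literals = literals, exprs = exprs, env = env),
            class = "strinterp")
}

# Default evaluation: evaluate every expression in the captured environment,
# interleave the results with the literal pieces, and hand them to paste0().
format.strinterp <- function(x, ...) {
  n <- length(x$exprs)
  parts <- vector("list", 2L * n + 1L)
  parts[seq(1L, by = 2L, length.out = n + 1L)] <- as.list(x$literals)
  parts[seq(2L, by = 2L, length.out = n)] <- lapply(x$exprs, eval, envir = x$env)
  do.call(paste0, parts)
}

# The object can be passed between frames without losing its environment.
make_greeting <- function() {
  name <- "world"
  strinterp("hello {name}")  # captures this function's frame
}

greeting <- make_greeting()
name <- "nobody"             # a global of the same name is not picked up
format(greeting)
```

Because the parsed expressions are stored in the object (greeting$exprs 
above), a package could inspect or rewrite them before evaluation instead of 
using the paste0()-style default. A real implementation in the parser could 
additionally attach a srcref to each expression, which this sketch does not 
attempt.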

— Taras


> On 7 Dec 2021, at 12:06, Simon Urbanek <simon.urba...@r-project.org> wrote:
> 
> 
> 
>> On Dec 7, 2021, at 22:09, Taras Zakharko <taras.zakha...@uzh.ch> wrote:
>> 
>> Great summary, Avi. 
>> 
>> String concatenation could be trivially added to R, but it probably should 
>> not be. You will notice that modern languages tend not to use “+” to do 
>> string concatenation (they either have a custom operator or a special kind 
>> of pattern to do it) due to the practical issues such an approach brings 
>> (implicit type casting, lack of commutativity, performance etc.). These 
>> issues will be felt even more so in R with its weak typing, idiosyncratic 
>> casting behavior and NAs. 
>> 
>> As others have pointed out, any kind of behavior one wants from string 
>> concatenation can be implemented by custom operators as needed. This is not 
>> something that needs to be in base R. I would rather see the effort 
>> directed at improving string formatting (such as glue-style built-in 
>> string interpolation).
>> 
> 
> This is getting OT, but there is a very good reason why string interpolation 
> is not in core R. As I recall it has been considered some time ago, but it is 
> very dangerous as it implies evaluation on constants which opens a huge 
> security hole and has questionable semantics (where you evaluate etc). Hence 
> it's much easier to ban a package than to hack it out of R ;).
> 
> Cheers,
> Simon
> 
> 
>> — Taras
>> 
>> 
>>> On 7 Dec 2021, at 02:27, Avi Gross via R-devel <r-devel@r-project.org> 
>>> wrote:
>>> 
>>> After seeing what others are saying, it is clear that you need to carefully
>>> think things out before designing any implementation of a more native
>>> concatenation operator whether it is called "+" or anything else. There may
>>> not be any ONE right solution but unlike a function version like paste()
>>> there is nowhere to place any options that specify what you mean.
>>> 
>>> You can obviously expand paste() to accept arguments like replace.NA="" or
>>> replace.NA="<NA>" and similar arguments on what to do if you see a NaN, an
>>> Inf or -Inf, a NULL or even an NA_character_ and so on. Heck, you might tell
>>> it to make other substitutions as in substitute=list(100=99, D=F) or any
>>> other nonsense you can come up with.
>>> 
>>> But you have nowhere to put options when saying:
>>> 
>>> c <- a + b
>>> 
>>> Sure, you could set various global options before the addition and maybe
>>> reset them after, but that is not a way I like to go for something this
>>> basic.
>>> 
>>> And enough such tinkering makes me wonder if it is easier to ask a user to
>>> use a slightly different function like this:
>>> 
>>> paste.no.na <- function(...) do.call(paste, Filter(Negate(is.na),
>>> list(...)))
>>> 
>>> The above one-line function removes any NA from the argument list to make a
>>> potentially shorter list before calling the real paste() using it.
>>> 
>>> Variations can, of course, be made that allow functionality as above. 
>>> 
>>> If R was a true object-oriented language in the same sense as others like
>>> Python, operator overloading of "+" might be doable in more complex ways but
>>> we can only work with what we have. I tend to agree with others that in some
>>> places R is so lenient that all kinds of errors can happen because it makes
>>> a guess on how to correct it. Generally, if you really want to mix numeric
>>> and character, many languages require you to transform any arguments to make
>>> all of compatible types. The paste() function is clearly stated to coerce
>>> all arguments to be of type character for you. Whereas a+b makes no such
>>> promises and also is not properly defined even if a and b are both of type
>>> character. Sure, we can expand the language but it may still do things some
>>> find not to be quite what they wanted as in "2"+"3" becoming "23" rather
>>> than 5. Right now, I can use as.numeric("2")+as.numeric("3") and get the
>>> intended result after making very clear to anyone reading the code that I
>>> wanted strings converted to floating point before the addition.
>>> 
>>> As has been pointed out, the plus operator if used to concatenate does not
>>> have a cognate for other operations like -*/ and R has used most other
>>> special symbols for other purposes. So, sure, we can use something like ....
>>> (4 periods) if it is not already being used for something but using + here
>>> is a tad confusing. Having said that, the makers of Python did make that
>>> choice.
>>> 
>>> -----Original Message-----
>>> From: R-devel <r-devel-boun...@r-project.org> On Behalf Of Gabriel Becker
>>> Sent: Monday, December 6, 2021 7:21 PM
>>> To: Bill Dunlap <williamwdun...@gmail.com>
>>> Cc: Radford Neal <radf...@cs.toronto.edu>; r-devel <r-devel@r-project.org>
>>> Subject: Re: [Rd] string concatenation operator (revisited)
>>> 
>>> As I recall, there was a large discussion related to that which resulted in
>>> the recycle0 argument being added (but defaulting to FALSE) for
>>> paste/paste0.
>>> 
>>> I think a lot of these things ultimately mean that if there were to be a
>>> string concatenation operator, it probably shouldn't have behavior identical
>>> to paste0. Was that what you were getting at as well, Bill?
>>> 
>>> ~G
>>> 
>>> On Mon, Dec 6, 2021 at 4:11 PM Bill Dunlap <williamwdun...@gmail.com> wrote:
>>> 
>>>> Should paste0(character(0), c("a","b")) give character(0)?
>>>> There is a fair bit of code that assumes that paste("X",NULL) gives "X"
>>>> but c(1,2)+NULL gives numeric(0).
>>>> 
>>>> -Bill
>>>> 
>>>> On Mon, Dec 6, 2021 at 1:32 PM Duncan Murdoch <murdoch.dun...@gmail.com>
>>>> wrote:
>>>> 
>>>>> On 06/12/2021 4:21 p.m., Avraham Adler wrote:
>>>>>> Gabe, I agree that missingness is important to factor in. To somewhat
>>>>>> abuse the terminology, NA is often used to represent missingness. Perhaps
>>>>>> concatenating character something with character something missing
>>>>>> should result in the original character?
>>>>> 
>>>>> I think that's a bad idea.  If you wanted to represent an empty 
>>>>> string, you should use "" or NULL, not NA.
>>>>> 
>>>>> I'd agree with Gabe, paste0("abc", NA) shouldn't give "abcNA", it 
>>>>> should give NA.
>>>>> 
>>>>> Duncan Murdoch
>>>>> 
>>>>>> 
>>>>>> Avi
>>>>>> 
>>>>>> On Mon, Dec 6, 2021 at 3:35 PM Gabriel Becker <gabembec...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi All,
>>>>>>> 
>>>>>>> Seeing this and the other thread (and admittedly not having clicked
>>>>>>> through to the linked r-help thread), I wonder about NAs.
>>>>>>> 
>>>>>>> Should NA <concat> "hi there" not result in NA_character_? This is not
>>>>>>> what any of the paste functions do, but in my opinion, NA +
>>>>>>> <non_na_value> seems like it should be NA (not "NA"), particularly if
>>>>>>> we are talking about `+` overloading, but potentially even in the case
>>>>>>> of a distinct concatenation operator?
>>>>>>> 
>>>>>>> I guess what I'm saying is that in my head missingness propagation
>>>>>>> rules should take priority in such an operator (i.e. NA + <anything>
>>>>>>> should *always* be NA).
>>>>>>> 
>>>>>>> Is that something others disagree with, or has it just not come up yet
>>>>>>> in (the parts I have read of) this discussion?
>>>>>>> 
>>>>>>> Best,
>>>>>>> ~G
>>>>>>> 
>>>>>>> On Mon, Dec 6, 2021 at 10:03 AM Radford Neal <radf...@cs.toronto.edu>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>>>> In pqR (see pqR-project.org), I have implemented ! and !! as 
>>>>>>>>>> binary string concatenation operators, equivalent to paste0 and 
>>>>>>>>>> paste, respectively.
>>>>>>>>>> 
>>>>>>>>>> For instance,
>>>>>>>>>> 
>>>>>>>>>>> "hello" ! "world"
>>>>>>>>>>    [1] "helloworld"
>>>>>>>>>>> "hello" !! "world"
>>>>>>>>>>    [1] "hello world"
>>>>>>>>>>> "hello" !! 1:4
>>>>>>>>>>    [1] "hello 1" "hello 2" "hello 3" "hello 4"
>>>>>>>>> 
>>>>>>>>> I'm curious about the details:
>>>>>>>>> 
>>>>>>>>> Would `1 ! 2` convert both to strings?
>>>>>>>> 
>>>>>>>> They're equivalent to paste0 and paste, so 1 ! 2 produces "12", 
>>>>>>>> just like paste0(1,2) does.  Of course, they wouldn't have to be 
>>>>>>>> exactly equivalent to paste0 and paste - one could impose 
>>>>>>>> stricter requirements if that seemed better for error detection.  
>>>>>>>> Off hand, though, I think automatically converting is more in 
>>>>>>>> keeping with the rest of R.  Explicitly converting with as.character
>>>>>>>> could be tedious.
>>>>>>>> 
>>>>>>>> I suppose disallowing logical arguments might make sense to guard 
>>>>>>>> against typos where ! was meant to be the unary-not operator, but 
>>>>>>>> ended up being a binary operator, after some sort of typo.  I 
>>>>>>>> doubt that this would be a common error, though.
>>>>>>>> 
>>>>>>>> (Note that there's no ambiguity when there are no typos, except 
>>>>>>>> that when negation is involved a space may be needed - so, for 
>>>>>>>> example, "x" !  !TRUE is "xFALSE", but "x"!!TRUE is "x TRUE".  
>>>>>>>> Existing uses of double negation are still fine - eg, a <- !!TRUE
>>>>>>>> still sets a to TRUE.
>>>>>>>> Parsing of operators is greedy, so "x"!!!TRUE is "x FALSE", not
>>>>>>>> "xTRUE".)
>>>>>>>> 
>>>>>>>>> Where does the binary ! fit in the operator priority?  E.g. how 
>>>>>>>>> is
>>>>>>>>> 
>>>>>>>>> a ! b > c
>>>>>>>>> 
>>>>>>>>> parsed?
>>>>>>>> 
>>>>>>>> As (a ! b) > c.
>>>>>>>> 
>>>>>>>> Their precedence is between that of + and - and that of < and >.
>>>>>>>> So "x" ! 1+2 evaluates to "x3" and "x" ! 1+2 < "x4" is TRUE.
>>>>>>>> 
>>>>>>>> (Actually, pqR also has a .. operator that fixes the problems 
>>>>>>>> with generating sequences with the : operator, and it has 
>>>>>>>> precedence lower than + and - and higher than ! and !!, but 
>>>>>>>> that's not relevant if you don't have the .. operator.)
>>>>>>>> 
>>>>>>>>  Radford Neal
>>>>>>>> 
>>>>>>>> ______________________________________________
>>>>>>>> R-devel@r-project.org mailing list 
>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
