Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
> On 16 Jun 2017, at 21:17 , Duncan Murdoch wrote:
>
>   paste0("this is the first part",
>          "this is the second part")
>
> If the rather insignificant amount of time it takes to execute this function call really matters (and I'm not convinced of that), then shouldn't it be solved by the compiler applying constant folding to paste0()?

And, of course, if it is equivalent to a literal, it can be precomputed. There is no point in having it in the middle of a tight loop.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
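Editorial sketch of the precomputation point above. The function names and the loop body are hypothetical and only serve to show the constant string being moved out of the loop; both variants return the same value, the second simply calls paste0() once instead of n times.

```r
# Hypothetical example: the loop body is illustrative only.

# Variant 1: the constant string is rebuilt on every iteration.
count_in_loop <- function(n) {
  total <- 0L
  for (i in seq_len(n)) {
    msg <- paste0("this is the first part",
                  "this is the second part")
    total <- total + nchar(msg)
  }
  total
}

# Variant 2: the constant is computed once, before the loop starts.
count_hoisted <- function(n) {
  msg <- paste0("this is the first part",
                "this is the second part")
  total <- 0L
  for (i in seq_len(n)) {
    total <- total + nchar(msg)
  }
  total
}

# Both variants agree on the result.
stopifnot(identical(count_in_loop(100L), count_hoisted(100L)))
```

Whether the difference is measurable depends on what else happens in the loop, which is the question the rest of the thread argues about.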
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On 16/06/2017 2:04 PM, Radford Neal wrote:
>> On Wed, 14 Jun 2017, Gábor Csárdi wrote:
>>
>> > I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:
>> >
>> > """this is a
>> > multi-line
>> > literal"""
>>
>> luke-tier...@uiowa.edu:
>
>> This does look like a promising option; some more careful checking would be needed to make sure there aren't cases where currently working code would be broken.
>
> I don't see how this proposal solves any problem of interest.
>
> String literals can already be as long as you like. The problem is that they will get wrapped around in an editor (or not all be visible), destroying the nice formatting of your program. With the proposed extension, you can write long string literals with short lines only if they were long only because they consisted of multiple lines. Getting a string literal that's 79 characters long with no newlines (a perfectly good error message, for example) to fit in your 80-character-wide editing window would still be impossible.
>
> Furthermore, these Python-style literals have to have their second and later lines start at the left edge, destroying the indentation of your program (supposing you actually wanted to use one).
>
> In contrast, C-style concatenation (by the parser) of consecutive string literals works just fine for what you'd want to do in a program. The only thing they wouldn't do that the Python-style literals would do is allow you to put big blocks of literal text in your program, without having to put quotes around each line. But shouldn't such text really be stored in a separate file that gets read, rather than in the program source?

I agree with most of this, but I still don't see the need for a syntax change. That's a lot of work just to avoid typing "paste0" and some commas in

  paste0("this is the first part",
         "this is the second part")

If the rather insignificant amount of time it takes to execute this function call really matters (and I'm not convinced of that), then shouldn't it be solved by the compiler applying constant folding to paste0()?

(Some syntax like r"xyz" to make it easier to type strings containing backslashes and quotes would actually be useful, but that's a different issue.)

Duncan Murdoch
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Fri, Jun 16, 2017 at 1:14 PM, Gábor Csárdi wrote:
> On Fri, Jun 16, 2017 at 7:04 PM, Radford Neal wrote:
>>> On Wed, 14 Jun 2017, Gábor Csárdi wrote:
>>>
>>> > I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:
>>> >
>>> > """this is a
>>> > multi-line
>>> > literal"""
>>>
>>> luke-tier...@uiowa.edu:
>>
>>> This does look like a promising option; some more careful checking would be needed to make sure there aren't cases where currently working code would be broken.
>>
>> I don't see how this proposal solves any problem of interest.
>>
>> String literals can already be as long as you like. The problem is that they will get wrapped around in an editor (or not all be visible), destroying the nice formatting of your program.
>
> From the Python docs:
>
> String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. End of lines are automatically included in the string, but it's possible to prevent this by adding a \ at the end of the line.

And additionally, in Julia triple quoted strings:

Trailing whitespace is left unaltered. They can contain " symbols without escaping. Triple-quoted strings are also dedented to the level of the least-indented line. This is useful for defining strings within code that is indented. For example:

Hadley

--
http://hadley.nz
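Editorial sketch of the dedenting rule quoted above, written as an ordinary R function. dedent() is a hypothetical helper, not part of base R, and it is only an approximation of the Julia behaviour: it runs at run time rather than in the parser, drops one leading newline, and removes the indentation shared by all non-blank lines.

```r
# dedent() is a hypothetical helper, not an existing base R function.
dedent <- function(x) {
  # Drop a single newline directly after the opening quote, if present.
  x <- sub("^\n", "", x)
  lines <- strsplit(x, "\n", fixed = TRUE)[[1]]
  # Blank lines do not count when measuring the common indentation.
  nonblank <- grepl("[^ \t]", lines)
  indent <- attr(regexpr("^[ \t]*", lines[nonblank]), "match.length")
  cut <- if (length(indent) > 0L) min(indent) else 0L
  # Remove the shared indentation from every line and reassemble.
  paste(substring(lines, cut + 1L), collapse = "\n")
}

# The literal can follow the indentation of the surrounding code.
show_query <- function() {
  cat(dedent("
    select *
      from some_table
    where x > 1"), "\n")
}
show_query()
```

The relative indentation of the second line is kept, because only the indentation of the least-indented line is removed.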
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Fri, Jun 16, 2017 at 7:04 PM, Radford Neal wrote:
>> On Wed, 14 Jun 2017, Gábor Csárdi wrote:
>>
>> > I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:
>> >
>> > """this is a
>> > multi-line
>> > literal"""
>>
>> luke-tier...@uiowa.edu:
>
>> This does look like a promising option; some more careful checking would be needed to make sure there aren't cases where currently working code would be broken.
>
> I don't see how this proposal solves any problem of interest.
>
> String literals can already be as long as you like. The problem is that they will get wrapped around in an editor (or not all be visible), destroying the nice formatting of your program.

From the Python docs:

String literals can span multiple lines. One way is using triple-quotes: """...""" or '''...'''. End of lines are automatically included in the string, but it's possible to prevent this by adding a \ at the end of the line.

[...]

Gabor
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
>> I don't think it is reasonable to change the parser this way. This is currently valid R code:
>>
>> a <- "foo"
>> "bar"
>>
>> and with the new syntax, it is also valid, but with a different meaning. Or you can even consider
>>
>> a <- "foo"
>> bar %>% func() %>% print()
>>
>> etc.
>>
>> I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:
>>
>> """this is a
>> multi-line
>> literal"""
>
> This does look like a promising option; some more careful checking would be needed to make sure there aren't cases where currently working code would be broken.
>
> Another Python idea worth considering is the raw string notation r"xyx" that does not process escape sequences -- this would make writing things like regular expressions easier.

If this is something you would consider, we'd be happy to put together a patch for review.

Hadley

--
http://hadley.nz
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Wed, 14 Jun 2017, Gábor Csárdi wrote:

> I don't think it is reasonable to change the parser this way. This is currently valid R code:
>
> a <- "foo"
> "bar"
>
> and with the new syntax, it is also valid, but with a different meaning. Or you can even consider
>
> a <- "foo"
> bar %>% func() %>% print()
>
> etc.
>
> I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:
>
> """this is a
> multi-line
> literal"""

This does look like a promising option; some more careful checking would be needed to make sure there aren't cases where currently working code would be broken.

Another Python idea worth considering is the raw string notation r"xyx" that does not process escape sequences -- this would make writing things like regular expressions easier.

Best,

luke

> Gabor
>
> On Wed, Jun 14, 2017 at 4:12 PM, William Dunlap via R-devel wrote:
>> If you are changing the parser (which is a major change) you might consider treating strings in the C/C++ way:
>>    char *s = "A"
>>              "B";
>> means the same as
>>    char *s = "AB";
>>
>> I am not a big fan of that syntax but it is widely used.
>>
>> A backslash at the end of the line leads to errors when you accidentally put a space after the backslash and the editor doesn't flag it.
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Wed, Jun 14, 2017 at 3:58 AM, Andreas Kersting wrote:
>>
>>> Hi,
>>>
>>> I would really like to have a way to split long string literals across multiple lines in R.
>>>
>>> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>>>
>>>  > "aaa
>>>  + bbb"
>>>  [1] "aaa\nbbb"
>>>
>>> If a line ends with a backslash, it is just ignored:
>>>
>>>  > "aaa\
>>>  + bbb"
>>>  [1] "aaa\nbbb"
>>>
>>> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>>>
>>>  > "aaa\
>>>  + bbb"
>>>  [1] "aaabbb"
>>>
>>> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>>>
>>>  > "aaa \
>>>  +    bbb"
>>>  [1] "aaa bbb"
>>>
>>>  > "aaa\
>>>  +    \ bbb"
>>>  [1] "aaa bbb"
>>>
>>> This is also implemented by this patch.
>>>
>>> An alternative approach could be to have something like
>>>
>>> ("aaa "
>>> "bbb")
>>>
>>> or
>>>
>>> ("aaa ",
>>> "bbb")
>>>
>>> be interpreted as "aaa bbb".
>>>
>>> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>>>
>>> What do you think? Is there anybody else who is missing this feature in the first place?
>>>
>>> Regards,
>>> Andreas

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
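Editorial sketch of the raw-string idea mentioned above. It assumes R 4.0.0 or later, where a raw-string syntax of the form r"(...)" is available; that syntax did not exist when this thread was written, and the regular expression is only an illustrative example.

```r
# Assumes R >= 4.0.0; earlier versions fail to parse the r"(...)" form.

# Conventional literal: every backslash has to be doubled.
pattern_escaped <- "\\d+\\.\\d+"

# Raw literal: backslashes are taken as written.
pattern_raw <- r"(\d+\.\d+)"

# Both spellings denote the same string.
stopifnot(identical(pattern_escaped, pattern_raw))

# Used as a regular expression, it matches a number such as "4.0".
grepl(pattern_raw, "version 4.0", perl = TRUE)
```

Raw strings address escaping only. They do not by themselves join lines or remove indentation, so they are separate from the line-splitting request in this thread.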
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Original Message
From: Hadley Wickham [mailto:h.wick...@gmail.com]
Sent: Wednesday, Jun 14, 2017 2:51 PM GMT
To: Simon Urbanek
Cc: Andreas Kersting; r-devel@r-project.org
Subject: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

> On Wed, Jun 14, 2017 at 8:48 AM, Simon Urbanek wrote:
>> As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.
>>
>> Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.
>
> Julia does, but uses triple quotes:
> https://docs.julialang.org/en/stable/manual/strings/#triple-quoted-string-literals
>
> Hadley

If we consider bash a programming language: Here documents (http://tldp.org/LDP/abs/html/here-docs.html) can have leading tabs be removed (see Example 19-4).
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
I don't think it is reasonable to change the parser this way. This is currently valid R code:

a <- "foo"
"bar"

and with the new syntax, it is also valid, but with a different meaning. Or you can even consider

a <- "foo"
bar %>% func() %>% print()

etc.

I like the idea of string literals, but the C/C++ way clearly does not work. The Python/Julia way might, i.e.:

"""this is a
multi-line
literal"""

Gabor

On Wed, Jun 14, 2017 at 4:12 PM, William Dunlap via R-devel wrote:
> If you are changing the parser (which is a major change) you might consider treating strings in the C/C++ way:
>    char *s = "A"
>              "B";
> means the same as
>    char *s = "AB";
>
> I am not a big fan of that syntax but it is widely used.
>
> A backslash at the end of the line leads to errors when you accidentally put a space after the backslash and the editor doesn't flag it.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Wed, Jun 14, 2017 at 3:58 AM, Andreas Kersting wrote:
>
>> Hi,
>>
>> I would really like to have a way to split long string literals across multiple lines in R.
>>
>> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>>
>>  > "aaa
>>  + bbb"
>>  [1] "aaa\nbbb"
>>
>> If a line ends with a backslash, it is just ignored:
>>
>>  > "aaa\
>>  + bbb"
>>  [1] "aaa\nbbb"
>>
>> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>>
>>  > "aaa\
>>  + bbb"
>>  [1] "aaabbb"
>>
>> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>>
>>  > "aaa \
>>  +    bbb"
>>  [1] "aaa bbb"
>>
>>  > "aaa\
>>  +    \ bbb"
>>  [1] "aaa bbb"
>>
>> This is also implemented by this patch.
>>
>> An alternative approach could be to have something like
>>
>> ("aaa "
>> "bbb")
>>
>> or
>>
>> ("aaa ",
>> "bbb")
>>
>> be interpreted as "aaa bbb".
>>
>> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>>
>> What do you think? Is there anybody else who is missing this feature in the first place?
>>
>> Regards,
>> Andreas
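Editorial sketch of the compatibility concern raised in this message. It uses only base R's parse() to show that two adjacent string literals on separate lines are currently read as two complete expressions, which is why C-style concatenation would change the meaning of existing code.

```r
# The two-line program from the message, held as text.
src <- 'a <- "foo"\n"bar"'

# Under the current grammar this parses as two separate expressions.
exprs <- parse(text = src, keep.source = FALSE)
length(exprs)

# The first is the assignment, the second a free-standing string constant.
is.call(exprs[[1]])
identical(exprs[[2]], "bar")

# Evaluating the assignment alone leaves a equal to "foo", not "foobar".
eval(exprs[[1]])
stopifnot(identical(a, "foo"))
```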
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
If you are changing the parser (which is a major change) you might consider treating strings in the C/C++ way:
   char *s = "A"
             "B";
means the same as
   char *s = "AB";

I am not a big fan of that syntax but it is widely used.

A backslash at the end of the line leads to errors when you accidentally put a space after the backslash and the editor doesn't flag it.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Jun 14, 2017 at 3:58 AM, Andreas Kersting wrote:

> Hi,
>
> I would really like to have a way to split long string literals across multiple lines in R.
>
> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>
>  > "aaa
>  + bbb"
>  [1] "aaa\nbbb"
>
> If a line ends with a backslash, it is just ignored:
>
>  > "aaa\
>  + bbb"
>  [1] "aaa\nbbb"
>
> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>
>  > "aaa\
>  + bbb"
>  [1] "aaabbb"
>
> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>
>  > "aaa \
>  +    bbb"
>  [1] "aaa bbb"
>
>  > "aaa\
>  +    \ bbb"
>  [1] "aaa bbb"
>
> This is also implemented by this patch.
>
> An alternative approach could be to have something like
>
> ("aaa "
> "bbb")
>
> or
>
> ("aaa ",
> "bbb")
>
> be interpreted as "aaa bbb".
>
> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>
> What do you think? Is there anybody else who is missing this feature in the first place?
>
> Regards,
> Andreas
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Original Message
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Wednesday, Jun 14, 2017 1:36 PM GMT
To: Andreas Kersting
Cc: r-devel
Subject: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines

> On 14/06/2017 6:45 AM, Andreas Kersting wrote:
>> On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch wrote:
>>
>>> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>>> Hi,
>>>>
>>>> I would really like to have a way to split long string literals across multiple lines in R.
>>>
>>> I don't understand why you require the string to be a literal. Why not construct the long string in an expression like
>>>
>>>   paste0("aaa",
>>>          "bbb")
>>>
>>> ? Surely the execution time of the paste0 call is negligible.
>>>
>>> Duncan Murdoch
>>
>> Actually "execution time" is precisely one of the reasons why I would like to see this feature as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.
>
> You also need to consider implementation time. This is not just changes to R itself; trailing backslashes *are* used in some packages (e.g. geoparser), so those packages would need to be identified and modified and resubmitted to CRAN.

I am totally with you on this "runtime vs. implementation-time"-issue. That is why I proposed the patch as I did: It seemed to require only minor changes to base R and I didn't see how it could be incompatible with existing code.

Actually I still cannot see how a package could have potentially *used* backslashes immediately followed by newlines up to now, since those backslashes were just ignored by the parser (And changes to the function StringValue are just about the parser, aren't they?). Of course I cannot rule out the possibility that there is code like

  var <- "aaa\
  bbb"

around, but this would be based on the undocumented(?) features that "backslash newline" is a valid escape sequence and that it is treated as "newline".

Maybe it's a good idea to show some more examples of how the patched parser behaves. There should only be a difference to the current implementation if a string literal spans multiple lines and a line ends in an odd number of backslashes (see last example):

  > "aaa\\
  + bbb"
  [1] "aaa\\\nbbb"

  > "aaa\\nbbb"
  [1] "aaa\\nbbb"

  > "aaa\\\nbbb"
  [1] "aaa\\\nbbb"

  > "aaa\\"
  [1] "aaa\\"

  > "aaa\\\n"
  [1] "aaa\\\n"

  > "aaa"
  [1] "aaa"

  > "aaa\n"
  [1] "aaa\n"

  > "aaa
  + bbb"
  [1] "aaa\nbbb"

  > "aaa\\\
  + bbb"
  [1] "aaa\\bbb"

Andreas

> Core changes to existing behaviour need really strong arguments, and I'm just not seeing those here.
>
> Duncan Murdoch
>
>> The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.
>>
>> Andreas
>>
>>>> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>>>>
>>>>  > "aaa
>>>>  + bbb"
>>>>  [1] "aaa\nbbb"
>>>>
>>>> If a line ends with a backslash, it is just ignored:
>>>>
>>>>  > "aaa\
>>>>  + bbb"
>>>>  [1] "aaa\nbbb"
>>>>
>>>> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>>>>
>>>>  > "aaa\
>>>>  + bbb"
>>>>  [1] "aaabbb"
>>>>
>>>> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>>>>
>>>>  > "aaa \
>>>>  +    bbb"
>>>>  [1] "aaa bbb"
>>>>
>>>>  > "aaa\
>>>>  +    \ bbb"
>>>>  [1] "aaa bbb"
>>>>
>>>> This is also implemented by this patch.
>>>>
>>>> An alternative approach could be to have something like
>>>>
>>>> ("aaa "
>>>> "bbb")
>>>>
>>>> or
>>>>
>>>> ("aaa ",
>>>> "bbb")
>>>>
>>>> be interpreted as "aaa bbb".
>>>>
>>>> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>>>>
>>>> What do you think? Is there anybody else who is missing this feature in the first place?
>>>>
>>>> Regards,
>>>> Andreas
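Editorial sketch of a run-time approximation of the continuation behaviour discussed above. unwrap() is a hypothetical helper, not the attached patch and not part of base R. Unlike the patch it works after parsing, so it cannot tell whether a newline was preceded by a backslash; it simply deletes every newline together with the blanks that follow it. The hex string is made up.

```r
# unwrap() is a hypothetical helper, not an existing base R function.
# It removes each newline and any spaces or tabs that directly follow it.
unwrap <- function(x) gsub("\n[ \t]*", "", x)

# A long hash (made-up value) written across two indented source lines.
hash <- unwrap("0123456789abcdef
                0123456789abcdef")

# The two halves are joined without a newline or the indentation.
stopifnot(nchar(hash) == 32L)
stopifnot(!grepl("\n", hash, fixed = TRUE))
```

This keeps the source lines short, but it is still a function call, so it does not address the parse-time argument made in the thread.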
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On 14/06/2017 at 12:58, Andreas Kersting wrote:
> Hi,
>
> I would really like to have a way to split long string literals across multiple lines in R.
> ...
> An alternative approach could be to have something like
>
> ("aaa "
> "bbb")

This is C-style and if the core-team decides to implement it, it could be useful and intuitive.
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Wed, Jun 14, 2017 at 8:48 AM, Simon Urbanek wrote:
> As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.
>
> Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.

Julia does, but uses triple quotes:
https://docs.julialang.org/en/stable/manual/strings/#triple-quoted-string-literals

Hadley

--
http://hadley.nz
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
As I recall this has been discussed at least a few times (unfortunately I'm traveling so can't check the references), but the justification was never satisfactory.

Personally, I wouldn't mind string continuation supported since it makes for more readable code (I had one of my packages raise a NOTE in examples because there is no way in R to split a long hash into multiple lines), but I would be strongly against random removal of whitespaces as it's counter-intuitive, misleading and makes it impossible to continue spaces on the next line. None of the languages that I can think of with multiline strings do that as that's way too dangerous.

Cheers,
Simon

> On Jun 14, 2017, at 6:58 AM, Andreas Kersting wrote:
>
> Hi,
>
> I would really like to have a way to split long string literals across multiple lines in R.
>
> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>
>  > "aaa
>  + bbb"
>  [1] "aaa\nbbb"
>
> If a line ends with a backslash, it is just ignored:
>
>  > "aaa\
>  + bbb"
>  [1] "aaa\nbbb"
>
> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>
>  > "aaa\
>  + bbb"
>  [1] "aaabbb"
>
> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>
>  > "aaa \
>  +    bbb"
>  [1] "aaa bbb"
>
>  > "aaa\
>  +    \ bbb"
>  [1] "aaa bbb"
>
> This is also implemented by this patch.
>
> An alternative approach could be to have something like
>
> ("aaa "
> "bbb")
>
> or
>
> ("aaa ",
> "bbb")
>
> be interpreted as "aaa bbb".
>
> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>
> What do you think? Is there anybody else who is missing this feature in the first place?
>
> Regards,
> Andreas
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On 14/06/2017 6:45 AM, Andreas Kersting wrote:
> On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch wrote:
>
>> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>> Hi,
>>>
>>> I would really like to have a way to split long string literals across multiple lines in R.
>>
>> I don't understand why you require the string to be a literal. Why not construct the long string in an expression like
>>
>>   paste0("aaa",
>>          "bbb")
>>
>> ? Surely the execution time of the paste0 call is negligible.
>>
>> Duncan Murdoch
>
> Actually "execution time" is precisely one of the reasons why I would like to see this feature as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.

You also need to consider implementation time. This is not just changes to R itself; trailing backslashes *are* used in some packages (e.g. geoparser), so those packages would need to be identified and modified and resubmitted to CRAN.

Core changes to existing behaviour need really strong arguments, and I'm just not seeing those here.

Duncan Murdoch

> The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.
>
> Andreas
>
>>> Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>>>
>>>  > "aaa
>>>  + bbb"
>>>  [1] "aaa\nbbb"
>>>
>>> If a line ends with a backslash, it is just ignored:
>>>
>>>  > "aaa\
>>>  + bbb"
>>>  [1] "aaa\nbbb"
>>>
>>> We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a backslash:
>>>
>>>  > "aaa\
>>>  + bbb"
>>>  [1] "aaabbb"
>>>
>>> I personally would also prefer if leading blanks (spaces and tabs) in the second line are ignored to allow for proper indentation:
>>>
>>>  > "aaa \
>>>  +    bbb"
>>>  [1] "aaa bbb"
>>>
>>>  > "aaa\
>>>  +    \ bbb"
>>>  [1] "aaa bbb"
>>>
>>> This is also implemented by this patch.
>>>
>>> An alternative approach could be to have something like
>>>
>>> ("aaa "
>>> "bbb")
>>>
>>> or
>>>
>>> ("aaa ",
>>> "bbb")
>>>
>>> be interpreted as "aaa bbb".
>>>
>>> I don't know the ins and outs of the parser of R (hence: please very carefully review the attached patch), but I guess this would be more work to implement!?
>>>
>>> What do you think? Is there anybody else who is missing this feature in the first place?
>>>
>>> Regards,
>>> Andreas
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Hi Mark,

I got you. I just pointed out the obvious to illustrate why your emulation didn't eliminate the need for the real thing. I didn't mean to imply you weren't aware of this, even though it may seem so. Sometimes I'm not 100% aware of the subtleties of the English language. This seems one of those cases.

Kind regards
Joris

On Wed, Jun 14, 2017 at 2:23 PM, Mark van der Loo wrote:

> I know it doesn't cause construction at parse time, and it was also not what I said. What I meant was that it makes the syntax at least look a little as if you have a line-breaking character within string literals.
>
> On Wed 14 Jun 2017 at 14:18, Joris Meys wrote:
>
>> Mark, that's actually a fair statement, although your extra operator doesn't cause construction at parse time. You still call paste0(), but just add an extra layer on top of it.
>>
>> I also doubt that even in gigantic loops the benefit is going to be significant. Take following example:
>>
>> library(compiler)  # provides cmpfun()
>>
>> atestfun <- function(x){
>>   y <- paste0("a very long",
>>               "string for testing")
>>   grep(x, y)
>> }
>> atestfun2 <- function(x){
>>   y <- "a very long
>> string for testing"
>>   grep(x,y)
>> }
>> cfun <- cmpfun(atestfun)
>> cfun2 <- cmpfun(atestfun2)
>>
>> require(rbenchmark)
>> benchmark(atestfun("a"),
>>           atestfun2("a"),
>>           cfun("a"),
>>           cfun2("a"),
>>           replications = 100000)
>>
>> Which gives after 100,000 replications:
>>
>>             test replications elapsed relative
>> 1  atestfun("a")       100000    0.83    1.339
>> 2 atestfun2("a")       100000    0.62    1.000
>> 3      cfun("a")       100000    0.81    1.306
>> 4     cfun2("a")       100000    0.62    1.000
>>
>> The patch can in principle make similar code marginally faster, but I'm not convinced the patch is going to make any real difference except for in some very specific and exotic cases. Even more, calling a function like the examples inside the loop is the only way I can come up with where this might be a problem. If you just construct the string inside the loop, there's two possibilities:
>>
>> - the string does not need to change, and then you better construct it outside of the loop
>> - the string does need to change, and then you need paste() or paste0() anyway
>>
>> I'm not against incorporating the patch, as it would eliminate a few keystrokes. It's a neat idea, but I don't expect any other noticeable advantage from it.
>>
>> my humble 2 cents
>> Cheers
>> Joris
>>
>> On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo <mark.vander...@gmail.com> wrote:
>>
>>> Having some line-breaking character for string literals would have benefits as string literals can then be constructed parse-time rather than run-time. I have run into this myself a few times as well. One way to at least emulate something like that is the following.
>>>
>>> `%+%` <- function(x,y) paste0(x,y)
>>>
>>> "hello" %+%
>>> " pretty" %+%
>>> " world"
>>>
>>> -Mark
>>>
>>> On Wed 14 Jun 2017 at 13:53, Andreas Kersting <r-de...@akersting.de> wrote:
>>>
>>> > On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>> >
>>> > > On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>> > > > Hi,
>>> > > >
>>> > > > I would really like to have a way to split long string literals across multiple lines in R.
>>> > >
>>> > > I don't understand why you require the string to be a literal. Why not construct the long string in an expression like
>>> > >
>>> > >   paste0("aaa",
>>> > >          "bbb")
>>> > >
>>> > > ? Surely the execution time of the paste0 call is negligible.
>>> > >
>>> > > Duncan Murdoch
>>> >
>>> > Actually "execution time" is precisely one of the reasons why I would like to see this feature as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.
>>> >
>>> > The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.
>>> >
>>> > Andreas
>>> >
>>> > > > Currently, if a string literal spans multiple lines, there is no way to inhibit the introduction of newline characters:
>>> > > >
>>> > > >  > "aaa
>>> > > >  + bbb"
>>> > > >  [1] "aaa\nbbb"
>>> > > >
>>> > > > If a line ends with a backslash, it is just ignored:
>>> > > >
>>> > > >  > "aaa\
>>> > > >  + bbb"
>>> > > >  [1] "aaa\nbbb"
>>> > > >
>>> > > > We could use this fact to implement string splitting in a fairly backward-compatible way, since currently such trailing backslashes should hardly be used as they do not have any effect. The attached patch makes the parser ignore a newline character directly following a
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
I know it doesn't cause construction at parse time, and that is not what I said. What I meant was that it makes the syntax at least look a little as if you have a line-breaking character within string literals.

On Wed 14 Jun 2017 at 14:18, Joris Meys wrote:

> Mark, that's actually a fair statement, although your extra operator
> doesn't cause construction at parse time. You still call paste0(), but just
> add an extra layer on top of it.
>
> [...]
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Mark, that's actually a fair statement, although your extra operator doesn't cause construction at parse time. You still call paste0(), but just add an extra layer on top of it.

I also doubt that even in gigantic loops the benefit is going to be significant. Take the following example:

    library(compiler)   # provides cmpfun()
    library(rbenchmark) # provides benchmark()

    # Builds the string with paste0() on every call.
    atestfun <- function(x){
      y <- paste0("a very long",
                  "string for testing")
      grep(x, y)
    }
    # Uses a single string literal.
    atestfun2 <- function(x){
      y <- "a very long string for testing"
      grep(x, y)
    }
    # Byte-compiled versions of both functions.
    cfun <- cmpfun(atestfun)
    cfun2 <- cmpfun(atestfun2)

    benchmark(atestfun("a"),
              atestfun2("a"),
              cfun("a"),
              cfun2("a"),
              replications = 100000)

Which gives after 100,000 replications:

                test replications elapsed relative
    1  atestfun("a")       100000    0.83    1.339
    2 atestfun2("a")       100000    0.62    1.000
    3      cfun("a")       100000    0.81    1.306
    4     cfun2("a")       100000    0.62    1.000

The patch can in principle make similar code marginally faster, but I'm not convinced the patch is going to make any real difference except in some very specific and exotic cases. What's more, calling a function like the examples inside the loop is the only way I can come up with where this might be a problem. If you just construct the string inside the loop, there are two possibilities:

- the string does not need to change, and then you had better construct it outside of the loop
- the string does need to change, and then you need paste() or paste0() anyway

I'm not against incorporating the patch, as it would eliminate a few keystrokes. It's a neat idea, but I don't expect any other noticeable advantage from it.

my humble 2 cents
Cheers
Joris

On Wed, Jun 14, 2017 at 2:00 PM, Mark van der Loo wrote:

> Having some line-breaking character for string literals would have benefits
> as string literals can then be constructed parse-time rather than run-time.
> I have run into this myself a few times as well. One way to at least
> emulate something like that is the following.
>
>     `%+%` <- function(x,y) paste0(x,y)
>
>     "hello" %+%
>       " pretty" %+%
>       " world"
>
> -Mark
>
> [...]
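Joris's two cases for a string that is built inside a loop can be sketched as follows. This is only an illustration in base R; the names msg, hits and labels and the loop bounds are made up for the example and do not come from the thread.

```r
# Case 1: the string never changes, so build it once, outside the loop.
msg <- paste0("a very long ",
              "string for testing")
hits <- integer(3)
for (i in seq_len(3)) {
  # msg is reused here; paste0() is not called again
  hits[i] <- length(grep("long", msg))
}

# Case 2: the string depends on the loop variable, so paste0() (or paste())
# is needed inside the loop regardless of how literals can be written.
labels <- character(3)
for (i in seq_len(3)) {
  labels[i] <- paste0("iteration ", i,
                      " of 3")
}
```

In the first case the cost of paste0() is paid once, so a parse-time literal would save nothing measurable; in the second case the call cannot be avoided.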
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
Having some line-breaking character for string literals would have benefits, as string literals could then be constructed at parse time rather than at run time. I have run into this myself a few times as well. One way to at least emulate something like that is the following.

    `%+%` <- function(x, y) paste0(x, y)

    "hello" %+%
      " pretty" %+%
      " world"

-Mark

On Wed 14 Jun 2017 at 13:53, Andreas Kersting wrote:

> Actually "execution time" is precisely one of the reasons why I would like
> to see this feature as - depending on the context (e.g. in a tight loop) -
> the execution time of paste0 (or probably also glue, thanks Gabor) is not
> necessarily insignificant.
>
> The other reason is style: I think it is cleaner if we can construct such
> a long string literal without the need for a function call.
>
> Andreas
>
> [...]
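That the %+% emulation is still evaluated at run time can be checked by looking at what the parser produces. A minimal sketch in base R; the variable names lit and emu are only for illustration.

```r
`%+%` <- function(x, y) paste0(x, y)

# A plain string literal is already a character constant after parsing.
lit <- quote("hello pretty world")
# The %+% chain parses to a call that is only evaluated when the code runs.
emu <- quote("hello" %+% " pretty" %+% " world")

is.character(lit)  # TRUE
is.call(emu)       # TRUE
identical(eval(emu), "hello pretty world")  # TRUE
```

So the operator gives the look of a continued literal, while the concatenation itself still happens on every evaluation.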
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Wed, 14 Jun 2017 06:12:09 -0500, Duncan Murdoch wrote:

> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
> > Hi,
> >
> > I would really like to have a way to split long string literals across
> > multiple lines in R.
>
> I don't understand why you require the string to be a literal. Why not
> construct the long string in an expression like
>
>     paste0("aaa",
>            "bbb")
>
> ? Surely the execution time of the paste0 call is negligible.
>
> Duncan Murdoch

Actually "execution time" is precisely one of the reasons why I would like to see this feature, as - depending on the context (e.g. in a tight loop) - the execution time of paste0 (or probably also glue, thanks Gabor) is not necessarily insignificant.

The other reason is style: I think it is cleaner if we can construct such a long string literal without the need for a function call.

Andreas

> > [...]
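Whether the paste0() call matters in a tight loop can be measured directly. The following is a rough sketch using only base R's system.time(); the function names and the iteration count are arbitrary choices for illustration, and the timings will vary between machines and R versions.

```r
n <- 1e5

# Builds the string with paste0() on every iteration.
with_paste <- function(n) {
  for (i in seq_len(n)) s <- paste0("aaa", "bbb")
  s
}
# Uses a single literal on every iteration.
with_literal <- function(n) {
  for (i in seq_len(n)) s <- "aaabbb"
  s
}

t_paste   <- system.time(res_paste   <- with_paste(n))
t_literal <- system.time(res_literal <- with_literal(n))

# Both variants produce the same string; only the elapsed time differs.
stopifnot(identical(res_paste, res_literal))
print(c(paste0 = t_paste[["elapsed"]], literal = t_literal[["elapsed"]]))
```

A loop that does nothing but build the string is the worst case for paste0(); any real work inside the loop makes the relative difference smaller.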
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On Wed, Jun 14, 2017 at 12:12 PM, Duncan Murdoch wrote:

> On 14/06/2017 5:58 AM, Andreas Kersting wrote:
>>
>> Hi,
>>
>> I would really like to have a way to split long string literals across
>> multiple lines in R.

You can also look at the glue package, it supports continuation and a lot more:

    glue("
        A formatted string \\
        can also be on a \\
        single line
        ")
    #> A formatted string can also be on a single line

Gabor

[...]
Re: [Rd] [WISH / PATCH] possibility to split string literals across multiple lines
On 14/06/2017 5:58 AM, Andreas Kersting wrote:

> Hi,
>
> I would really like to have a way to split long string literals across
> multiple lines in R.

I don't understand why you require the string to be a literal. Why not construct the long string in an expression like

    paste0("aaa",
           "bbb")

? Surely the execution time of the paste0 call is negligible.

Duncan Murdoch

> Currently, if a string literal spans multiple lines, there is no way to
> inhibit the introduction of newline characters:
>
>     > "aaa
>     + bbb"
>     [1] "aaa\nbbb"
>
> If a line ends with a backslash, it is just ignored:
>
>     > "aaa\
>     + bbb"
>     [1] "aaa\nbbb"
>
> We could use this fact to implement string splitting in a fairly
> backward-compatible way, since currently such trailing backslashes
> should hardly be used as they do not have any effect. The attached patch
> makes the parser ignore a newline character directly following a backslash:
>
>     > "aaa\
>     + bbb"
>     [1] "aaabbb"
>
> I personally would also prefer if leading blanks (spaces and tabs) in
> the second line are ignored to allow for proper indentation:
>
>     > "aaa \
>     +    bbb"
>     [1] "aaa bbb"
>
>     > "aaa\
>     +    \ bbb"
>     [1] "aaa bbb"
>
> This is also implemented by this patch.
>
> An alternative approach could be to have something like
>
>     ("aaa "
>      "bbb")
>
> or
>
>     ("aaa ",
>      "bbb")
>
> be interpreted as "aaa bbb".
>
> I don't know the ins and outs of the parser of R (hence: please very
> carefully review the attached patch), but I guess this would be more
> work to implement!?
>
> What do you think? Is there anybody else who is missing this feature in
> the first place?
>
> Regards,
> Andreas
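Without a parser change, the proposed behaviour can only be approximated at run time. The helper below is a hypothetical sketch (the name unwrap is not from the patch or from base R): it removes each newline together with the spaces and tabs that follow it. Unlike the patch, it cannot tell a line break typed in the source from an explicit "\n" escape, so it only suits strings that are not meant to contain newlines.

```r
# Hypothetical helper: drop every newline plus any spaces or tabs that
# follow it, so an indented multi-line literal collapses to one line.
unwrap <- function(s) gsub("\n[ \t]*", "", s)

url <- unwrap("https://stat.ethz.ch/
               mailman/listinfo/r-devel")
url
```

This keeps the source lines short and indented, but it is still a function call, so it addresses only the style concern raised in the thread and not the run-time one.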