Re: [R] separate commands by semicolon
Oh yes, completely forgot about partial parsing.

One possible (quick) solution:

    txt <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
    sf <- srcfile("txt")
    # parse() fails on the incomplete ls( but leaves partial parse data in sf
    tryit <- tryCatch(parse(text = txt, srcfile = sf), error = identity)
    gpd <- getParseData(sf)
    # columns of the semicolons that are real tokens (not inside strings)
    pos <- c(0, gpd$col1[gpd$token == "';'"], nchar(txt) + 1)
    final <- c()
    for (i in seq(length(pos) - 1)) {
        final <- c(final, substr(txt, pos[i] + 1, pos[i + 1] - 1))
    }

Which outputs:

    [1] "print(2)"                           " bar <- \"don't ; use semicolons\""
    [3] " foo <- '3;4'"                      " ls("

Excellent, thanks very much,
Adrian

On Mon, Sep 19, 2016 at 3:19 PM, Duncan Murdoch wrote:

> See the section on "partial parsing" in the ?parse help page.
>
> Duncan Murdoch

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
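Editorial sketch: the partial-parsing approach above can be wrapped in a small function. The name `split_commands` is hypothetical and not from the thread. The sketch assumes a single line of ASCII text without tabs, so that the `col1` values from `getParseData()` line up with character positions for `substring()`.

```r
# Hypothetical helper (not from the thread): split one line of R code at the
# semicolons that the parser itself recognises as command separators.
# Assumes a single line of ASCII text with no tab characters.
split_commands <- function(txt) {
  stopifnot(is.character(txt), length(txt) == 1L,
            !grepl("\n", txt, fixed = TRUE))
  sf <- srcfile("txt")
  # A parse error is expected for incomplete input; it is ignored on purpose,
  # because the partial parse data is still stored in sf.
  try(parse(text = txt, srcfile = sf), silent = TRUE)
  gpd <- getParseData(sf)
  # Semicolons inside string literals are part of STR_CONST tokens, so only
  # genuine separators carry the token name "';'".
  semis <- if (is.null(gpd)) integer(0) else sort(gpd$col1[gpd$token == "';'"])
  starts <- c(1L, semis + 1L)
  ends <- c(semis - 1L, nchar(txt))
  # trimws() removes the leading blanks seen in the output above.
  trimws(substring(txt, starts, ends))
}

txt <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("
print(split_commands(txt))
```

The parse data presumably covers only the tokens read before the parser gave up, so commands that follow a syntax error in the middle of the line (as opposed to incomplete input at the end) would not be separated by this sketch.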
Re: [R] separate commands by semicolon
On 19/09/2016 7:59 AM, Adrian Dușa wrote:

> Perhaps this is not that important (I am trying to simulate a normal R
> console), and parse only if it is syntactically correct.
> I was merely curious if this could be done, likely using regular
> expressions (surely strsplit doesn't solve it).

See the section on "partial parsing" in the ?parse help page.

Duncan Murdoch
Re: [R] separate commands by semicolon
On Sun, Sep 18, 2016 at 12:34 AM, Peter Langfelder <peter.langfel...@gmail.com> wrote:

> You would want to avoid splitting within character strings
> (print(";")) and in comments (print(2); ls() # This prints 2; then
> lists...) The comment char could also appear in a character string,
> where it does not mean the start of a comment...

Yes, that would be the problem.
Returning to my original post, modifying the example:

    x <- "print(2); bar <- \"don't ; use semicolons\"; foo <- '3;4'; ls("

This should result in a character vector of length 4:

    [1] "print(2)"       "bar <- \"don't ; use semicolons\""
    [3] "foo <- '3;4'"   "ls("

even though the last command would cause an error using parse(text = x).

Perhaps this is not that important (I am trying to simulate a normal R console), and parse only if it is syntactically correct.
I was merely curious if this could be done, likely using regular expressions (surely strsplit doesn't solve it).

Best,
Adrian
Re: [R] separate commands by semicolon
On Sat, Sep 17, 2016 at 2:12 PM, David Winsemius wrote:

> Not entirely clear. If you were intending to just get character output then
> you could just use:
>
> strsplit(txt, ";")

You would want to avoid splitting within character strings (print(";")) and in comments (print(2); ls() # This prints 2; then lists...) The comment char could also appear in a character string, where it does not mean the start of a comment...

Not sure how to accomplish that using strsplit (or in general using just regular expressions).

Peter
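Editorial sketch of the pitfall described above: a plain split on ";" also cuts inside string literals and comments, whereas the parser does not. The input string is made up for illustration, combining the two cases mentioned in the message.

```r
# Illustrative input: one semicolon inside a string, one real separator,
# and one semicolon inside a trailing comment.
txt <- "print(\";\"); ls() # This prints 2; then lists"

# Naive splitting treats every semicolon as a separator.
pieces <- strsplit(txt, ";", fixed = TRUE)[[1]]
print(pieces)
length(pieces)   # 4 fragments, none of them a complete command

# The parser sees two commands and ignores the comment.
exprs <- parse(text = txt)
length(exprs)    # 2
```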
Re: [R] separate commands by semicolon
> On Sep 17, 2016, at 8:28 AM, Adrian Dușa wrote:
>
> What I would need is something to separate the two commands, irrespective
> of their syntactical correctness:
>
> [1] "print(2)" "ls("
>
> I hope this explains the situation,

Not entirely clear. If you were intending to just get character output then you could just use:

    strsplit(txt, ";")

If you wanted parsing to an R expression to occur you could pass through sapply and get a full accounting of the syntactic deficit using `try`:

    sapply(strsplit("print(2); ls(", ";")[[1]], function(t) {try(parse(text=t))})
    Error in parse(text = t) : <text>:2:0: unexpected end of input
    1: ls(
       ^
    expression(`print(2)` = print(2), ` ls(` = "Error in parse(text = t) : <text>:2:0: unexpected end of input\n1: ls(\n ^\n")

David Winsemius
Alameda, CA, USA
Re: [R] separate commands by semicolon
There is one minor problem with parse(): if any of the individual commands has an error, parse() fails on the entire text with a single error.

For example, in a normal R console:

    > print(2); ls(
    [1] 2
    +

So print(2) is executed first, and only afterwards does the console expect the user to continue the command from ls(

Parsing the same text:

    > parse(text = "print(2); ls(")
    Error in parse(text = "print(2); ls(") :
      <text>:2:0: unexpected end of input
    1: print(2); ls(
       ^

What I would need is something to separate the two commands, irrespective of their syntactical correctness:

    [1] "print(2)" "ls("

I hope this explains the situation,
Adrian

On Thu, Sep 15, 2016 at 11:02 PM, Adrian Dușa wrote:

> On Thu, Sep 15, 2016 at 10:28 PM, William Dunlap wrote:
>
>> The most reliable way to split such lines is with parse(text=x).
>> Regular expressions don't do well with context-free grammars.
>
> Oh, that's right of course.
>
> > as.character(parse(text = x))
> [1] "foo <- \"3;4\""                    "bar <- \"don't ; use semicolons\""
>
> That was simple enough, thanks very much,
> Adrian
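Editorial sketch: for a console simulation like the one described above, the parse error can at least be caught and classified. The helper name `check_input` is hypothetical. Matching on the message text is a heuristic that assumes untranslated (English) error messages, and it does not cover every incomplete case (an unterminated string, for instance, produces a different message).

```r
# Hypothetical helper (not from the thread): classify a line of input the way
# a console front end might, without evaluating anything.
check_input <- function(txt) {
  tryCatch({
    parse(text = txt)
    "complete"
  }, error = function(e) {
    # Heuristic: relies on the English wording of the parser's message.
    if (grepl("unexpected end of input", conditionMessage(e), fixed = TRUE)) {
      "incomplete"
    } else {
      "invalid"
    }
  })
}

check_input("print(2); ls()")   # parses cleanly
check_input("print(2); ls(")    # the console would show the + prompt here
check_input("print(2) )")       # a genuine syntax error
```

This only reports the state of the whole line; it does not separate the commands.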
Re: [R] separate commands by semicolon
On Thu, Sep 15, 2016 at 10:28 PM, William Dunlap wrote:

> The most reliable way to split such lines is with parse(text=x).
> Regular expressions don't do well with context-free grammars.

Oh, that's right of course.

    > as.character(parse(text = x))
    [1] "foo <- \"3;4\""                    "bar <- \"don't ; use semicolons\""

That was simple enough, thanks very much,
Adrian
Re: [R] separate commands by semicolon
The most reliable way to split such lines is with parse(text=x).
Regular expressions don't do well with context-free grammars.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 15, 2016 at 12:08 PM, Adrian Dușa wrote:

> Dear R-helpers,
>
> When parsing a text, I would like to separate commands written on the same
> line, by a semicolon.
> Something like:
>
> x <- "foo <- '3;4'; bar <- \"don't ; use semicolons\""
>
> Ideally, that would translate to these two commands in a character vector
> of length 2:
> foo <- '3;4'
> bar <- "don't ; use semicolons"
>
> It's probably regexp magic, but I just can't find it.
>
> Any hint is highly appreciated,
> Adrian
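Editorial sketch of the parse(text = x) suggestion, using the string from the original question. The second half is an addition not discussed in the thread: with `keep.source = TRUE`, the source references attached to the result give back the text as typed, including the original quote characters.

```r
# The example from the original question.
x <- "foo <- '3;4'; bar <- \"don't ; use semicolons\""

# parse() returns one unevaluated expression per command.
exprs <- parse(text = x)

# as.character() deparses each expression; quoting is normalised, so the
# single-quoted '3;4' comes back with double quotes.
cmds <- as.character(exprs)
print(cmds)

# Addition (not from the thread): recover the text exactly as written
# from the "srcref" attribute.
exprs_src <- parse(text = x, keep.source = TRUE)
original <- vapply(attr(exprs_src, "srcref"),
                   function(ref) paste(as.character(ref), collapse = "\n"),
                   character(1))
print(original)
```

Both variants require the whole text to be syntactically valid.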