Re: [R] Looking for simple line-splitting code
Hi,

using the package "stringr" allows this:

> unlist(stringr::str_split(x, "\n"))
[1] "abc" "def" ""    "ghi"

Best,
Kimmo

On Wed, 2025-02-05 at 09:35 -0500, Duncan Murdoch wrote:
> Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of
> the 3 suggestions so far on the little test case, but on the real data
> (where x contains several thousand lines), Rui's seems best.
>
> Duncan
>
> On 2025-02-05 9:13 a.m., peter dalgaard wrote:
> > This also seems to work:
> >
> > > strsplit(paste(x,collapse="\n"),"\n")[[1]]
> > [1] "abc" "def" ""    "ghi"
> >
> > > On 5 Feb 2025, at 14:44 , Duncan Murdoch wrote:
> > >
> > > If I have this object:
> > >
> > >    x <- c("abc\ndef", "", "ghi")
> > >
> > > and I write it to a file using `writeLines(x, "test.txt")`, my
> > > text editor sees a 5 line file:
> > >
> > >    1: abc
> > >    2: def
> > >    3:
> > >    4: ghi
> > >    5:
> > >
> > > which is what I'd expect:  the last line in the editor is empty.
> > > If I use `readLines("test.txt")` on that file, I get the vector
> > >
> > >    c("abc", "def", "", "ghi")
> > >
> > > and all of that is fine.
> > >
> > > What I'm looking for is simple code that modifies x to the
> > > `readLines()` output, without actually writing and reading it.
> > >
> > > My first attempt doesn't work:
> > >
> > >    unlist(strsplit(x, "\n"))
> > >
> > > because it leaves out the blank line 3.  I can fix that with this
> > > ugly code:
> > >
> > >    lines <- strsplit(x, "\n")
> > >    lines[sapply(lines, length) == 0] <- list("")
> > >    lines <- unlist(lines)
> > >
> > > Surely there's a simpler way to do this?  I'd like to use just
> > > base functions, no other packages.
> > >
> > > Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looking for simple line-splitting code
On 2025-02-05 5:20 p.m., Rolf Turner wrote:
> On Wed, 5 Feb 2025 08:44:12 -0500 Duncan Murdoch wrote:
>
> > If I have this object:
> >
> >    x <- c("abc\ndef", "", "ghi")
> >
> > and I write it to a file using `writeLines(x, "test.txt")`, my text
> > editor sees a 5 line file:
> >
> >    1: abc
> >    2: def
> >    3:
> >    4: ghi
> >    5:
> >
> > which is what I'd expect:  the last line in the editor is empty.  If I
> > use `readLines("test.txt")` on that file, I get the vector
> >
> >    c("abc", "def", "", "ghi")
>
> Apologies for muddying the waters with my ignorance, but why would you
> expect a 5 line file?  I would expect a 4 line file, and that is indeed
> what I get when I try the code in question.  If I append an empty line
> at the end of my test.txt and then apply readLines() to that file, I
> get
>
>    [1] "abc" "def" ""    "ghi" ""
>
> again as *I* would expect.
>
> What am I missing?  Sorry for being a thicko.

That fifth line is more a function of the editor (RStudio) than the
file.  The last line ends with a newline.  The editor shows this by
including a blank 5th line.  If the last line just stopped at the last
letter, the editor would display it as a 4 line file.

Duncan Murdoch
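The point about the trailing newline can be checked by looking at the bytes that `writeLines()` actually produces. The following is only a minimal sketch: it uses a temporary file instead of "test.txt", and the variable names (`path`, `bytes`, `last_byte`, `n_newlines`) are illustrative and not from the thread.

```r
x <- c("abc\ndef", "", "ghi")

# Write to a temporary file so nothing is left behind in the working directory.
path <- tempfile(fileext = ".txt")
writeLines(x, path)

# Read the file back as raw bytes rather than as lines.
bytes <- readBin(path, what = "raw", n = file.size(path))
unlink(path)

# The final byte is a line feed: the last line is terminated, not followed
# by a further line of content.
last_byte <- bytes[length(bytes)]
last_byte == as.raw(0x0a)

# One line feed per line of text, four in total.
n_newlines <- sum(bytes == as.raw(0x0a))
n_newlines
```

An editor that shows a "line 5" is displaying the empty position after that final line feed, which matches the explanation above.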
Re: [R] Looking for simple line-splitting code
Hello,

Inline.

On 05/02/2025 at 20:27, peter dalgaard wrote:
> A 3rd option could be
>
> scan(text=x, what="", blank.lines.skip=FALSE)
>
> (all because readLines() doesn't obey the text=x convention, perhaps it
> should? I'm unsure whether the textConnection is left open in Rui's
> method.)

No, it is not left open. Use ?showConnections to check it.


x <- c("abc\ndef", "", "ghi")
x |> textConnection() |> readLines()
showConnections()

# this creates and opens a connection
tc <- textConnection(x)
showConnections()
readLines(tc)
showConnections()
close(tc)
showConnections()


Hope this helps,

Rui Barradas

> -pd
>
> > On 5 Feb 2025, at 15:35 , Duncan Murdoch wrote:
> >
> > Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of
> > the 3 suggestions so far on the little test case, but on the real
> > data (where x contains several thousand lines), Rui's seems best.
> >
> > Duncan
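For anyone who would rather not depend on what happens to an unreferenced connection, one option is to wrap the textConnection approach in a small function that closes the connection explicitly. This is a sketch only; `split_lines` is a hypothetical helper name, not something proposed in the thread.

```r
# Hypothetical helper: split the elements of x into lines the way
# readLines() would, using only base R.
split_lines <- function(x) {
  con <- textConnection(x)
  # Close the connection when the function exits, even on error.
  on.exit(close(con), add = TRUE)
  readLines(con)
}

x <- c("abc\ndef", "", "ghi")
result <- split_lines(x)
result
```

Calling `showConnections()` before and after `split_lines(x)` is a simple way to confirm that no text connection remains open.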
Re: [R] Looking for simple line-splitting code
On Wed, 5 Feb 2025 08:44:12 -0500 Duncan Murdoch wrote:

> If I have this object:
>
>    x <- c("abc\ndef", "", "ghi")
>
> and I write it to a file using `writeLines(x, "test.txt")`, my text
> editor sees a 5 line file:
>
>    1: abc
>    2: def
>    3:
>    4: ghi
>    5:
>
> which is what I'd expect:  the last line in the editor is empty.  If I
> use `readLines("test.txt")` on that file, I get the vector
>
>    c("abc", "def", "", "ghi")

Apologies for muddying the waters with my ignorance, but why would you
expect a 5 line file?  I would expect a 4 line file, and that is indeed
what I get when I try the code in question.  If I append an empty line
at the end of my test.txt and then apply readLines() to that file, I
get

   [1] "abc" "def" ""    "ghi" ""

again as *I* would expect.

What am I missing?  Sorry for being a thicko.

cheers,

Rolf

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619
Re: [R] Looking for simple line-splitting code
A 3rd option could be

scan(text=x, what="", blank.lines.skip=FALSE)

(all because readLines() doesn't obey the text=x convention, perhaps it
should? I'm unsure whether the textConnection is left open in Rui's
method.)

-pd

> On 5 Feb 2025, at 15:35 , Duncan Murdoch wrote:
>
> Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of the 3
> suggestions so far on the little test case, but on the real data (where x
> contains several thousand lines), Rui's seems best.
>
> Duncan
>
> On 2025-02-05 9:13 a.m., peter dalgaard wrote:
>> This also seems to work:
>>> strsplit(paste(x,collapse="\n"),"\n")[[1]]
>> [1] "abc" "def" ""    "ghi"

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
Re: [R] Looking for simple line-splitting code
Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of the
3 suggestions so far on the little test case, but on the real data
(where x contains several thousand lines), Rui's seems best.

Duncan

On 2025-02-05 9:13 a.m., peter dalgaard wrote:
> This also seems to work:
>
> > strsplit(paste(x,collapse="\n"),"\n")[[1]]
> [1] "abc" "def" ""    "ghi"
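A comparison like the one described above could be set up roughly as follows. This is a sketch under stated assumptions: the input is a synthetic repetition of the small test vector (the real data from the thread is not available), the function names `via_paste`, `via_connection` and `via_gsub` are made up for illustration, and timings will differ between machines and inputs, so no result is claimed here.

```r
# Synthetic stand-in for "several thousand lines"; not the real data.
x <- rep(c("abc\ndef", "", "ghi"), times = 5000)

# The three base-R suggestions from the thread, wrapped as functions.
via_paste <- function(x) strsplit(paste(x, collapse = "\n"), "\n")[[1]]

via_connection <- function(x) {
  con <- textConnection(x)
  on.exit(close(con), add = TRUE)
  readLines(con)
}

via_gsub <- function(x) unlist(strsplit(gsub("^$", "\n", x), "\n"))

approaches <- list(paste = via_paste,
                   connection = via_connection,
                   gsub = via_gsub)

# Elapsed seconds for 10 repetitions of each approach.
timings <- vapply(
  approaches,
  function(f) system.time(for (i in seq_len(10)) f(x))[["elapsed"]],
  numeric(1)
)
timings

# Before comparing speed, check that the approaches agree on this input.
all_agree <- identical(via_paste(x), via_connection(x)) &&
  identical(via_gsub(x), via_connection(x))
all_agree
```

Checking agreement first matters because a faster approach is only useful if it returns the same lines on the data at hand.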
Re: [R] Looking for simple line-splitting code
x <- c("abc\ndef", "", "ghi")
unlist(strsplit(gsub("^$", "\n", x), "\n"))

or

library(magrittr)  # provides %>%
x %>% gsub("^$", "\n", .) %>% strsplit("\n") %>% unlist()

Regards.

Tanvir Ahamed
Stockholm, Sweden | mashra...@yahoo.com

On Wednesday, February 5, 2025 at 02:44:37 PM GMT+1, Duncan Murdoch wrote:
> What I'm looking for is simple code that modifies x to the
> `readLines()` output, without actually writing and reading it.
>
> My first attempt doesn't work:
>
>    unlist(strsplit(x, "\n"))
>
> because it leaves out the blank line 3.
>
> Surely there's a simpler way to do this?  I'd like to use just base
> functions, no other packages.
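Since the original question asked for base functions only, the piped variant can also be written with the native pipe, assuming R 4.1 or later. This is just a sketch of the same gsub/strsplit idea; the variable name `lines` is illustrative.

```r
x <- c("abc\ndef", "", "ghi")

# The native pipe passes the left-hand side as the first unnamed argument.
# Naming pattern and replacement means x lands in gsub()'s third formal, x.
lines <- x |>
  gsub(pattern = "^$", replacement = "\n") |>
  strsplit(split = "\n") |>
  unlist()

lines
```

The empty string is first turned into a lone newline, which `strsplit()` then splits into a single empty element, so the blank line survives.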
Re: [R] Looking for simple line-splitting code
This also seems to work:

> strsplit(paste(x,collapse="\n"),"\n")[[1]]
[1] "abc" "def" ""    "ghi"

> On 5 Feb 2025, at 14:44 , Duncan Murdoch wrote:
>
> If I have this object:
>
>    x <- c("abc\ndef", "", "ghi")
>
> [...]
>
> My first attempt doesn't work:
>
>    unlist(strsplit(x, "\n"))
>
> because it leaves out the blank line 3.  I can fix that with this ugly code:
>
>    lines <- strsplit(x, "\n")
>    lines[sapply(lines, length) == 0] <- list("")
>    lines <- unlist(lines)
>
> Surely there's a simpler way to do this?  I'd like to use just base
> functions, no other packages.
>
> Duncan Murdoch

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
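One case not covered by the test vector in the thread is an input whose last element is empty. `strsplit()` does not return a trailing empty string, so the paste/strsplit approach may drop a final blank line there. The sketch below illustrates this and one possible adjustment; `split_like_readLines` is a hypothetical name, and the adjustment is an assumption about what is wanted, not something proposed in the thread.

```r
x_trailing <- c("abc", "")

# As posted: the final empty element is lost, because a match at the very
# end of the string does not produce a trailing "".
as_posted <- strsplit(paste(x_trailing, collapse = "\n"), "\n")[[1]]
as_posted

# Possible adjustment: terminate the last element with a newline as well,
# mirroring what writeLines() does for every element.
split_like_readLines <- function(x) {
  strsplit(paste0(paste(x, collapse = "\n"), "\n"), "\n")[[1]]
}

adjusted <- split_like_readLines(x_trailing)
adjusted

# The original test vector is unaffected by the adjustment.
original_case <- split_like_readLines(c("abc\ndef", "", "ghi"))
original_case
```

Zero-length input is not handled by this sketch and would need a separate check.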
Re: [R] Looking for simple line-splitting code
On 05/02/2025 at 13:44, Duncan Murdoch wrote:
> If I have this object:
>
>    x <- c("abc\ndef", "", "ghi")
>
> and I write it to a file using `writeLines(x, "test.txt")`, my text
> editor sees a 5 line file:
>
>    1: abc
>    2: def
>    3:
>    4: ghi
>    5:
>
> which is what I'd expect:  the last line in the editor is empty.  If I
> use `readLines("test.txt")` on that file, I get the vector
>
>    c("abc", "def", "", "ghi")
>
> and all of that is fine.
>
> What I'm looking for is simple code that modifies x to the
> `readLines()` output, without actually writing and reading it.
>
> [...]
>
> Surely there's a simpler way to do this?  I'd like to use just base
> functions, no other packages.
>
> Duncan Murdoch

Hello,

Use ?textConnection.
The 5th line is left out, just like in your code.


x <- c("abc\ndef", "", "ghi")
x |> textConnection() |> readLines()
# [1] "abc" "def" ""    "ghi"


Hope this helps,

Rui Barradas
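To check that the textConnection result really matches the write-then-read round trip from the original question, the two can be compared directly. This is a minimal sketch that assumes a temporary file in place of "test.txt"; the names `from_file` and `from_connection` are illustrative.

```r
x <- c("abc\ndef", "", "ghi")

# Reference result: actually write the file and read it back.
path <- tempfile(fileext = ".txt")
writeLines(x, path)
from_file <- readLines(path)
unlink(path)

# Same thing without touching the file system.
con <- textConnection(x)
from_connection <- readLines(con)
close(con)

same_result <- identical(from_file, from_connection)
same_result
```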