Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Kimmo Elo
Hi,

The "stringr" package allows this:

> unlist(stringr::str_split(x, "\n"))
[1] "abc" "def" ""    "ghi"

Best,
Kimmo

On Wed, 2025-02-05 at 09:35 -0500, Duncan Murdoch wrote:
> Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of
> the 3 suggestions so far on the little test case, but on the real data
> (where x contains several thousand lines), Rui's seems best.
>
> Duncan
>
> On 2025-02-05 9:13 a.m., peter dalgaard wrote:
> > This also seems to work:
> >
> > > strsplit(paste(x,collapse="\n"),"\n")[[1]]
> > [1] "abc" "def" ""    "ghi"
> >
> >
> > > On 5 Feb 2025, at 14:44, Duncan Murdoch wrote:
> > >
> > > If I have this object:
> > >
> > >   x <- c("abc\ndef", "", "ghi")
> > >
> > > and I write it to a file using `writeLines(x, "test.txt")`, my
> > > text editor sees a 5 line file:
> > >
> > >   1: abc
> > >   2: def
> > >   3:
> > >   4: ghi
> > >   5:
> > >
> > > which is what I'd expect:  the last line in the editor is empty.
> > > If I use `readLines("test.txt")` on that file, I get the vector
> > >
> > >   c("abc", "def", "", "ghi")
> > >
> > > and all of that is fine.
> > >
> > > What I'm looking for is simple code that modifies x to the
> > > `readLines()` output, without actually writing and reading it.
> > >
> > > My first attempt doesn't work:
> > >
> > >   unlist(strsplit(x, "\n"))
> > >
> > > because it leaves out the blank line 3.  I can fix that with this
> > > ugly code:
> > >
> > >   lines <- strsplit(x, "\n")
> > >   lines[sapply(lines, length) == 0] <- list("")
> > >   lines <- unlist(lines)
> > >
> > > Surely there's a simpler way to do this?  I'd like to use just
> > > base functions, no other packages.
> > >
> > > Duncan Murdoch
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > https://www.r-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible
> > > code.


Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Duncan Murdoch

On 2025-02-05 5:20 p.m., Rolf Turner wrote:
> On Wed, 5 Feb 2025 08:44:12 -0500
> Duncan Murdoch wrote:
>
>> If I have this object:
>>
>>   x <- c("abc\ndef", "", "ghi")
>>
>> and I write it to a file using `writeLines(x, "test.txt")`, my text
>> editor sees a 5 line file:
>>
>>   1: abc
>>   2: def
>>   3:
>>   4: ghi
>>   5:
>>
>> which is what I'd expect:  the last line in the editor is empty.  If I
>> use `readLines("test.txt")` on that file, I get the vector
>>
>>   c("abc", "def", "", "ghi")
>
> Apologies for muddying the waters with my ignorance, but why would you
> expect a 5 line file?   I would expect a 4 line file, and that is
> indeed what I get when I try the code in question.
>
> If I append an empty line at the end of my test.txt and then
> apply readLines() to that file, I get [1] "abc" "def" ""    "ghi" "",
> again as *I* would expect.
>
> What am I missing?  Sorry for being a thicko.


That fifth line is more a function of the editor (RStudio) than the 
file.  The last line ends with a newline.  The editor shows this by 
including a blank 5th line.  If the last line just stopped at the last 
letter, the editor would display it as a 4 line file.
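
The trailing newline can be checked without any editor. What follows is only a sketch: the temporary file and the binary-mode connection are my own choices (binary mode so that "\n" is written unchanged on every platform), not something taken from the thread.

```r
x <- c("abc\ndef", "", "ghi")

# Write through a binary-mode connection so "\n" is not translated
# to "\r\n" on Windows (an assumption made for portability).
f <- tempfile(fileext = ".txt")
con <- file(f, open = "wb")
writeLines(x, con)
close(con)

# Read the whole file back as a single string.
contents <- readChar(f, nchars = file.size(f), useBytes = TRUE)

# Count the newline characters and check whether the file ends with one.
n_newlines <- lengths(regmatches(contents,
                                 gregexpr("\n", contents, fixed = TRUE)))
ends_with_newline <- endsWith(contents, "\n")

# readLines() treats each newline as a line terminator, not a separator.
n_lines_read <- length(readLines(f))
unlink(f)
```

Here the file holds four newline characters and ends with one, so readLines() reports four lines, while an editor that treats the newline as a separator shows an empty fifth line.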


Duncan Murdoch



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Rui Barradas

Hello,

Inline.

At 20:27 on 05/02/2025, peter dalgaard wrote:
> A 3rd option could be
>
> scan(text=x, what="", blank.lines.skip=FALSE)
>
> (all because readLines() doesn't obey the text=x convention, perhaps it should?
> I'm unsure whether the textConnection is left open in Rui's method.)


No, it is not left open. Use ?showConnections to check it.


x <- c("abc\ndef", "", "ghi")

x |> textConnection() |> readLines()
showConnections()

# this creates and opens a connection
tc <- textConnection(x)
showConnections()
readLines(tc)
showConnections()
close(tc)
showConnections()
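
For anyone who would rather not depend on when an unreferenced connection gets cleaned up, the connection can be closed explicitly inside a small wrapper. This is a sketch only: `read_text_lines` is a made-up name, and as far as I understand the documentation, textConnection() returns an already open connection that readLines() reads from but does not close.

```r
# Hypothetical helper name; not from the thread.
read_text_lines <- function(x) {
  con <- textConnection(x)          # a text connection is open on creation
  on.exit(close(con), add = TRUE)   # always closed, even if readLines() fails
  readLines(con)
}

x <- c("abc\ndef", "", "ghi")

before <- nrow(showConnections())   # user connections before the call
res    <- read_text_lines(x)
after  <- nrow(showConnections())   # and after it
```

With the explicit close, `after` equals `before`, so nothing is left for the garbage collector to tidy up.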


Hope this helps,

Rui Barradas





Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Rolf Turner


On Wed, 5 Feb 2025 08:44:12 -0500
Duncan Murdoch  wrote:

> If I have this object:
>
>   x <- c("abc\ndef", "", "ghi")
>
> and I write it to a file using `writeLines(x, "test.txt")`, my text
> editor sees a 5 line file:
>
>   1: abc
>   2: def
>   3:
>   4: ghi
>   5:
>
> which is what I'd expect:  the last line in the editor is empty.  If I
> use `readLines("test.txt")` on that file, I get the vector
>
>   c("abc", "def", "", "ghi")




Apologies for muddying the waters with my ignorance, but why would you
expect a 5 line file?   I would expect a 4 line file, and that is
indeed what I get when I try the code in question.

If I append an empty line at the end of my test.txt and then
apply readLines() to that file, I get [1] "abc" "def" ""    "ghi" "",
again as *I* would expect.

What am I missing?  Sorry for being a thicko.

cheers,

Rolf

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone: +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread peter dalgaard
A 3rd option could be

scan(text=x, what="", blank.lines.skip=FALSE)

(all because readLines() doesn't obey the text=x convention, perhaps it should? 
I'm unsure whether the textConnection is left open in Rui's method.)
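
One caveat with the call as written: with the default `sep`, scan() splits on any whitespace, so a line containing a space would come back as more than one element. A possible variation, offered as a sketch and not as part of the original suggestion, is to pass `sep = "\n"`; the vector `y` below is invented test data.

```r
# Invented test data: the first element contains a space.
y <- c("abc def\nghi", "", "jkl")

# As suggested: fields are separated by any whitespace.
by_space <- scan(text = y, what = "", blank.lines.skip = FALSE,
                 quiet = TRUE)

# Assumed variation: split on newlines only, so "abc def" stays whole.
by_line <- scan(text = y, what = "", sep = "\n",
                blank.lines.skip = FALSE, quiet = TRUE)
```

`quiet = TRUE` only suppresses the "Read n items" message.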

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Duncan Murdoch
Thanks to Rui, Peter and Tanvir!  Peter's seems to be the fastest of the 
3 suggestions so far on the little test case, but on the real data 
(where x contains several thousand lines), Rui's seems best.
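
A comparison along these lines could be set up roughly as below. The data is synthetic and the function names are invented for illustration; timings depend on the machine and on the data, so no particular ordering is implied.

```r
# Hypothetical test data: several thousand elements, some with embedded
# newlines and some empty, ending in a non-empty element.
set.seed(1)
pool <- c("abc\ndef", "", "ghi", "one two\nthree\nfour")
x <- c(sample(pool, 5000, replace = TRUE), "end")

# The three suggestions so far, wrapped as functions (names are made up).
via_connection <- function(x) {
  con <- textConnection(x)
  on.exit(close(con))
  readLines(con)
}
via_paste <- function(x) strsplit(paste(x, collapse = "\n"), "\n")[[1]]
via_gsub  <- function(x) unlist(strsplit(gsub("^$", "\n", x), "\n"))

# Check that all three agree before comparing speed.
results  <- list(via_connection(x), via_paste(x), via_gsub(x))
all_same <- all(vapply(results[-1], identical, logical(1), results[[1]]))

# Elapsed seconds for 20 repetitions of each approach.
timings <- sapply(
  list(connection = via_connection, paste = via_paste, gsub = via_gsub),
  function(f) system.time(for (i in 1:20) f(x))[["elapsed"]]
)
```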


Duncan



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Mohammad Tanvir Ahamed via R-help
x <- c("abc\ndef", "", "ghi")

unlist(strsplit(gsub("^$", "\n", x), "\n"))

or, with the magrittr pipe:

library(magrittr)  # provides %>% and the . placeholder
x %>%
  gsub("^$", "\n", .) %>%
  strsplit("\n") %>%
  unlist()
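
The idea in both versions is that an empty element is first turned into a lone "\n", which strsplit() then splits into "" and not into character(0). Since the question asked for base functions only, the same idea can be sketched without the pipe and without a regular expression; the use of nzchar() and `fixed = TRUE` here is my own variation.

```r
x <- c("abc\ndef", "", "ghi")

# Replace each empty element by a lone newline, so that strsplit()
# yields "" for it instead of character(0).
y <- x
y[!nzchar(y)] <- "\n"

lines <- unlist(strsplit(y, "\n", fixed = TRUE))
```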


Regards.
Tanvir Ahamed 
Stockholm, Sweden |  mashra...@yahoo.com 



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread peter dalgaard
This also seems to work:

> strsplit(paste(x,collapse="\n"),"\n")[[1]]
[1] "abc" "def" ""    "ghi"
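
This works because paste() puts a "\n" between elements, so the empty element becomes two consecutive newlines. One edge case not raised in the thread: strsplit() does not return an empty string after a match at the very end of its input, so a final empty element is lost. The sketch below uses an invented vector `x2` to show this, together with one possible adjustment that terminates every element with "\n" the way writeLines() does.

```r
x <- c("abc\ndef", "", "ghi")
joined <- paste(x, collapse = "\n")   # "abc\ndef\n\nghi"
res <- strsplit(joined, "\n")[[1]]

# Assumed edge case: the last element is empty.
x2 <- c("abc", "")
dropped <- strsplit(paste(x2, collapse = "\n"), "\n")[[1]]

# Possible adjustment: end every element with "\n", then split.
kept <- strsplit(paste0(x2, "\n", collapse = ""), "\n")[[1]]
```

For the `x` in the question both forms give the same four elements; they differ only when the input ends in an empty string or a newline.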



-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Looking for simple line-splitting code

2025-02-05 Thread Rui Barradas


Hello,

Use ?textConnection.
The 5th line is left out, just like in your code.


x <- c("abc\ndef", "", "ghi")
x |> textConnection() |> readLines()
# [1] "abc" "def" ""    "ghi"
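
To confirm that this matches the write/read round trip from the question, the two results can be compared directly. A small sketch, assuming a temporary file stands in for "test.txt":

```r
x <- c("abc\ndef", "", "ghi")

# Reference result: the write/read round trip from the question.
f <- tempfile(fileext = ".txt")
writeLines(x, f)
from_file <- readLines(f)
unlink(f)

# In-memory result via a text connection.
con <- textConnection(x)
from_text <- readLines(con)
close(con)

same <- identical(from_file, from_text)
```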


Hope this helps,

Rui Barradas





[R] Looking for simple line-splitting code

2025-02-05 Thread Duncan Murdoch

If I have this object:

  x <- c("abc\ndef", "", "ghi")

and I write it to a file using `writeLines(x, "test.txt")`, my text 
editor sees a 5 line file:


  1: abc
  2: def
  3:
  4: ghi
  5:

which is what I'd expect:  the last line in the editor is empty.  If I 
use `readLines("test.txt")` on that file, I get the vector


  c("abc", "def", "", "ghi")

and all of that is fine.

What I'm looking for is simple code that modifies x to the `readLines()` 
output, without actually writing and reading it.


My first attempt doesn't work:

  unlist(strsplit(x, "\n"))

because it leaves out the blank line 3.  I can fix that with this ugly code:

  lines <- strsplit(x, "\n")
  lines[sapply(lines, length) == 0] <- list("")
  lines <- unlist(lines)

Surely there's a simpler way to do this?  I'd like to use just base 
functions, no other packages.
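
For reference, the reason the first attempt loses the blank line is that strsplit() returns character(0) for an empty input string, and unlist() then has nothing to contribute for that element. A short sketch of this, with the same repair as above written a little more compactly using lengths():

```r
x <- c("abc\ndef", "", "ghi")

pieces <- strsplit(x, "\n")
n_pieces <- lengths(pieces)   # the empty string yields a zero-length piece

# unlist() concatenates the pieces, so the zero-length one disappears.
naive <- unlist(pieces)

# Same repair as above, using lengths() instead of sapply(lines, length).
pieces[lengths(pieces) == 0L] <- list("")
fixed <- unlist(pieces)
```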


Duncan Murdoch
