Handling rows that don't all have the same number of fields seems like a good place to use the "flush" argument to scan() (to skip everything after the first field on each line):

With the following copied to the clipboard:

i1-apple        10$   New_York
i2-banana
i3-strawberry   7$    Japan


> scan("clipboard", "", flush=TRUE)
Read 3 items
[1] "i1-apple"      "i2-banana"     "i3-strawberry"
> sub("^[A-Za-z0-9]*-", "", scan("clipboard", "", flush=TRUE))
Read 3 items
[1] "apple"      "banana"     "strawberry"
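
For anyone who wants to try this without the clipboard, here is a minimal sketch of the same idea. It assumes a version of R whose scan() accepts the text= argument, and the variable names (lines, first_fields, first_words) are made up for illustration only:

```r
# Sample data taken from the thread; the second row has only one field.
lines <- c("i1-apple        10$   New_York",
           "i2-banana",
           "i3-strawberry   7$    Japan")

# flush = TRUE discards everything after the first field on each line,
# so rows with different numbers of fields are not a problem.
first_fields <- scan(text = lines, what = "", flush = TRUE, quiet = TRUE)

# Strip the leading "i1-", "i2-", ... prefix.
first_words <- sub("^[A-Za-z0-9]*-", "", first_fields)
print(first_words)
```

quiet = TRUE only suppresses the "Read 3 items" message; it does not change the result.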

-- Tony Plate

At Monday 01:59 PM 11/1/2004, Spencer Graves wrote:
Uwe's and Andy's solutions are great for many applications but won't work if not all rows have the same number of fields. Consider, for example, the following modification of Lee's example:
i1-apple 10$ New_York
i2-banana
i3-strawberry 7$ Japan

If I copy this to "clipboard" and run Andy's code, I get the following:
> read.table("clipboard", colClasses=c("character", "NULL", "NULL"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 2 did not have 3 elements

We can get around this using "scan", then splitting things apart similar to the way Uwe described:
> dat <- scan("clipboard", character(0), sep="\n")
Read 3 items
> dash <- regexpr("-", dat)
> dat2 <- substring(dat, pmax(0, dash)+1)
> blank <- regexpr(" ", dat2)
> if(any(blank<0))
+ blank[blank<0] <- nchar(dat2[blank<0])
> substring(dat2, 1, blank)
[1] "apple " "banana" "strawberry "
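
Note that the result above keeps the blank that follows each word. A possible variant, sketched here with the data typed in directly instead of read from the clipboard, stops one character before the blank. The names lines, after_dash, end and first_words are illustrative, not from the original code:

```r
# Same three rows as in the modified example; row 2 has a single field.
lines <- c("i1-apple 10$ New_York", "i2-banana", "i3-strawberry 7$ Japan")

# Position of the first "-" in each line (-1 if there is none).
dash <- as.integer(regexpr("-", lines, fixed = TRUE))
after_dash <- substring(lines, pmax(0, dash) + 1)

# Position of the first blank after the dash (-1 if there is none).
blank <- as.integer(regexpr(" ", after_dash, fixed = TRUE))

# End just before the blank, or at the end of the string if no blank exists.
end <- ifelse(blank < 0, nchar(after_dash), blank - 1)
first_words <- substring(after_dash, 1, end)
print(first_words)
```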

     hope this helps.  spencer graves

Uwe Ligges wrote:

Liaw, Andy wrote:

Using R-2.0.0 on WinXPPro, cut-and-pasting the data you have:

read.table("clipboard", colClasses=c("character", "NULL", "NULL"))

1      i1-apple
2     i2-banana
3 i3-strawberry

... and if only the words after "-" are of interest, the statement can be followed by

 sapply(strsplit(...., "-"), "[", 2)
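
As a small illustration of that statement, here is a sketch that uses a hand-typed character vector in place of the elided argument. The vector first_fields is a stand-in I made up; it is not what was actually passed in the original:

```r
# Hypothetical stand-in for the first column returned by read.table().
first_fields <- c("i1-apple", "i2-banana", "i3-strawberry")

# Split each string at the "-" and keep the second piece.
parts <- strsplit(first_fields, "-", fixed = TRUE)
first_words <- sapply(parts, "[", 2)
print(first_words)
```

If a string contained no "-", the second piece would not exist and the corresponding result would be NA.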

Uwe Ligges


From: j lee

Hello All,

I'd like to read the first word of each line into a new file.
If I have a data file like the following, how can I get the
first words: apple, banana, strawberry?

i1-apple        10$   New_York
i2-banana       5$    London
i3-strawberry   7$    Japan

Is there any similar question already posted to the
list? I am a bit new to R, having a few months of
experience now.



