The first line in the body of the function splits the input, s, 
on the separator (the quote character by default) and makes it a list.

The second and third lines define a regexp that matches 
leading and trailing whitespace, and a logical vector that
selects the odd-positioned elements.

Because s was split on the quote character, the odd-positioned
elements are the ones that are not between quotes, so in the fourth
line we remove any leading and trailing whitespace from them and
split them on whitespace.

In the fifth line we convert s from a list back to a vector; any
empty pieces drop out at this point.
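
The five steps can be traced on a short string. This is only an illustrative sketch: the sample input x is made up for the walkthrough, and as.list() and seq_along() are used as assumed equivalents of the lapply(..., c) and seq(along=) calls in the function.

```r
# Hypothetical sample input, same shape as 'line' in the question.
x <- "'a b' 'c' 3 4"

# Step 1: split on the quote character and turn the pieces into a list.
s <- as.list(strsplit(x, "'")[[1]])
# The pieces are: "", "a b", " ", "c", " 3 4"

# Steps 2 and 3: regexp for leading/trailing whitespace, and a
# logical vector flagging the odd positions (text outside quotes).
re  <- "^[[:space:]]+|[[:space:]]+$"
odd <- seq_along(s) %% 2 == 1

# Step 4: trim the unquoted pieces, then split them on whitespace.
# Pieces that are empty after trimming become character(0).
s[odd] <- strsplit(gsub(re, "", unlist(s)[odd]), "[[:space:]]+")

# Step 5: flatten the list; the character(0) entries disappear.
result <- unlist(s)
print(result)
```

The quoted piece "a b" keeps its internal space because it sits at an even position and is never split.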

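test <- function(s, sep = "'") {
   s <- as.list(strsplit(s, sep)[[1]])   # split on the quote character
   re <- "^[[:space:]]+|[[:space:]]+$"   # leading or trailing whitespace
   odd <- seq_along(s) %% 2 == 1         # odd positions lie outside quotes
   s[odd] <- strsplit(gsub(re, "", unlist(s)[odd]), "[[:space:]]+")
   unlist(s)                             # flatten the list to a vector
}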

# 'line' and 'bad.line' are defined in the original message below
test(line)
test(bad.line)
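
As an aside that is not part of the reply above: base R's scan() accepts a quote argument, so it may serve as the kind of existing function the question asks about. The sketch below assumes a version of R in which scan() has the text argument (older versions would need a textConnection instead), and the wrapper name split_quoted is hypothetical.

```r
line     <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"

# split_quoted is a hypothetical helper name, not an existing R function.
# what = "" asks scan() for character fields; the default separator is
# whitespace, and whitespace inside the quote character is preserved.
split_quoted <- function(s, quote.char = "'") {
  scan(text = s, what = "", quote = quote.char, quiet = TRUE)
}

parts     <- split_quoted(line)
bad.parts <- split_quoted(bad.line)
print(parts)
```

Both inputs should yield the same ten fields, since the quotes around thisNot are optional when the field contains no whitespace.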



---
From: Corey Moffet <[EMAIL PROTECTED]>
 
Hello all,

I am trying to solve a problem, and my solution is rather ugly and not very
general. The posts for "[R] help with gsub and grep functions" seemed relevant
and gave me hope for a more refined and more general solution.

The Problem:

line <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"

The desired result of a process on "line" or "bad.line":

> parts <- some.function(line)

> parts
[1] "this text has spaces"
[2] "thisNot"
[3] "3"
[4] "4"
[5] "5"
[6] "6"
[7] "7"
[8] "8"
[9] "9"
[10] "10"

Current function to obtain a solution for "line" but not "bad.line":

"some.function" <- function(line, quote.char = "'") {
   quoted <- unlist(strsplit(line, quote.char))
   quoted <- quoted[quoted != ""]
   first <- quoted[1]
   second <- quoted[3]
   last <- quoted[4]
   last.parts <- unlist(strsplit(last, " "))
   last.parts <- last.parts[last.parts != ""]
   out <- c(first, second, last.parts)
   return(out)
}

This solution is not very good because the text parts of "line" are not 
required to be enclosed in quotes unless they contain a space. All the files
I currently have to process have the first two pieces enclosed in "'", but
it is future files that I worry about. Is there an existing function that
I have overlooked that splits strings, ignoring the delimiter when it is
enclosed in quotes? I know that I can do some testing on the length of
"quoted" in function "some.function", but it seems there should be a more
elegant way of doing this type of thing. Any suggestions?

With best wishes and kind regards I am



______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
