And here is an alternative to the regular expressions (although again I don't think you really need any of this):
> capture.output(dput(strsplit("col1 col2 col3", " ")[[1]]))
[1] "c(\"col1\", \"col2\", \"col3\")"

On 1/30/07, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Both spaces and tabs are whitespace so this
> should be good enough (unless you can
> have empty fields):
>
>   read.table("myfile.dat", header = TRUE)
>
> See the sep= argument in ?read.table .
>
> Although I don't think you really need this, here are
> some regular expressions for processing a header
> into the form you asked for.  The first line places
> quotes around the names, the second one inserts
> commas and the last one adds c( and ).
>
>   s <- gsub('(\\S+)', '"\\1"', 'col1 col2 col3')
>   s <- gsub("(\\S+) ", "\\1, ", s)
>   sub("(.*)", "c(\\1)", s)
>
>
> On 1/30/07, Kimpel, Mark William <[EMAIL PROTECTED]> wrote:
> > The main problem I am trying to solve is this:
> >
> > I am importing a tab delimited file whose first line contains only one
> > column, which is a descriptor of the form "col_1 col_2 col_3", i.e. the
> > colnames are not tab delimited but are separated by whitespace. I would
> > like to parse this first line and make it such that it becomes the colnames
> > of the rest of the file, which I am reading into R using read.delim().
> > The file is so huge that I must do this in R.
> >
> > My first question is this: What is the best way to accomplish what I
> > want to do?
> >
> > My other questions revolve around some failed attempts on my part to
> > solve the problem on my own using regular expressions. I thought that
> > perhaps I could change the first line to "c("col_1", "col_2", "col_3")
> > using gsub. I was having trouble figuring out how R uses the backslash
> > character because I know that sometimes the backslash one would use in
> > Perl needs to be a double backslash in R.
> >
> > Here is a sample of what I tried and what I got:
> >
> > a<-"col_1 col_2 col_3"
> >
> > > gsub("\\s", " " , a)
> >
> > [1] "col_1 col_2 col_3"
> >
> > > gsub("\\s", "\\s" , a)
> >
> > [1] "col_1scol_2scol_3"
> >
> > As you can see, it looks like R is taking a regular expression for
> > "pattern", but not taking it for "replacement". Why is this?
> >
> > Assuming that I did want to solve my original problem with gsub and then
> > turn the string into an R object, how would I get gsub to return
> > "c("col_1", "col_2", "col_3") using my original string?
> >
> > Finally, is there a way to declare a string as a regular expression so
> > that R sees it the same way other languages, such as Perl do, i.e. make
> > the backslash be interpreted the same way? For someone who is just
> > learning regular expressions as I am, it is very frustrating to read
> > about them in references and then have to translate what I've learned
> > into R syntax. I was thinking that instead of enclosing the string in
> > "", one could use THIS.IS.A.REGULAR.EXPRESSION(), similar to the way we
> > use I() in formulae.
> >
> > These are a bunch of questions, but obviously I have a lot to learn!
> >
> > Thanks,
> >
> > Mark
> >
> > Mark W. Kimpel MD
> >
> > (317) 490-5129 Work, & Mobile
> >
> > (317) 663-0513 Home (no voice mail please)
> >
> > 1-(317)-536-2730 FAX
> >
> > ______________________________________________
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
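
For the original problem in this thread (a first line whose names are separated by spaces, followed by tab-delimited data), one possible approach that avoids regular expressions is to read the header line on its own and hand the names to read.delim(). This is only a sketch: the sample file and its contents are invented for illustration, and it assumes the header has exactly one name per data column.

```r
# Hypothetical sample file: one space-separated header line,
# followed by tab-delimited data rows (contents invented here).
path <- tempfile(fileext = ".dat")
writeLines(c("col_1 col_2 col_3",
             "1\t2\t3",
             "4\t5\t6"), path)

# Read only the first line; scan() splits on whitespace by default,
# so this returns a character vector of column names.
nms <- scan(path, what = "", nlines = 1, quiet = TRUE)

# Read the remaining tab-delimited rows, skipping the header line,
# and attach the parsed names as the column names.
dat <- read.delim(path, header = FALSE, skip = 1, col.names = nms)

print(colnames(dat))
print(dim(dat))
```

Because the header is read separately, only one line is parsed by hand and the rest of the (possibly very large) file goes through read.delim() unchanged.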