Use "\\." or "[.]" to denote a literal dot (#1), or use fixed = TRUE to turn off the special meaning of the dot (#2), or use a zero-width lookahead assertion, (?=[.]), which must match but is not consumed and so is not part of the text being replaced (#3). Try ?regexpr. Also, the links on the gsubfn home page (http://code.google.com/p/gsubfn/) point to a number of good resources on regular expressions.
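As for why sub("\\..", ".", str) turned "SP.g..g." into "SP...g.": only the first dot in that pattern is escaped, so it matches a literal dot followed by any character, and sub replaces only the first match. A minimal sketch on that one name (the variable names x, unescaped, half_escaped and literal are just for illustration):

```r
# One column name taken from the question.
x <- "SP.g..g."

# Unescaped: "." matches any character, so ".." matches the first two
# characters ("SP"), and sub() replaces only that first match.
unescaped <- sub("..", ".", x)

# "\\.." is a literal dot followed by ANY character, so the first match
# is ".g" rather than "..".
half_escaped <- sub("\\..", ".", x)

# Both dots literal: "[.]{2}" (equivalently "\\.\\.") matches the first "..".
literal <- sub("[.]{2}", ".", x)

print(c(unescaped, half_escaped, literal))
# [1] "..g..g." "SP...g." "SP.g.g."
```

With gsub in place of sub, the same literal pattern would replace every double dot rather than only the first one.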
Str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
  "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

# 1 - collapse each run of dots to one dot, then strip trailing dots
tmp <- gsub("[.]+", ".", Str)
sub("[.]+$", "", tmp)

# 2 - with fixed = TRUE the pattern ".." is two literal dots
tmp <- gsub("..", ".", Str, fixed = TRUE)
sub("[.]+$", "", tmp)

# 3 - both done at once using zero-width lookahead
gsub("[.]*$|[.]*(?=[.])", "", Str, perl = TRUE)

On 7/26/07, 8rino-Luca Pantani <[EMAIL PROTECTED]> wrote:
> Dear R users,
> I have the following two problems, related to the functions sub, grep,
> regexpr and similar.
>
> The header of the file(s) I have to import is like this.
>
> c("y (m)", "BD (g/cm3)", "PR (Mpa)", "Ks (m/s)", "SP g./g.",
> "P (m3/m3)", "theta1 (g/g)", "theta2 (g/g)", "AWC (g/g)")
>
> To get rid of spaces and symbols in the names of the columns,
> I use read.table(... check.names=TRUE) and I get:
> str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
> "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")
>
> Now, my problem is to remove the trailing dots, as well as the double
> dots, in order to get names like the following
> c("y.m", "BD.g.cm3", "PR.Mpa", "Ks.m.s", "SP.g.g", "P.m3.m3",
> "theta1.g.g", "theta2.g.g", "AWC.g.g")
>
> I've searched the help pages for sub, regexpr and similar, and also
> searched the help archives.
> I understand that the dot is a special character, since
> sub("..", ".", str)
> [1] "..m." "...g.cm3." "...Mpa." "...m.s." "..g..g."
> [6] "..m3.m3." ".eta1..g.g." ".eta2..g.g." ".C..g.g."
>
> Therefore I tried
> sub("\\..", ".", str)
> [1] "y.m." "BD.g.cm3." "PR.Mpa." "Ks.m.s." "SP...g."
> [6] "P.m3.m3." "theta1.g.g." "theta2.g.g." "AWC.g.g."
> and I was surprised by the (to me) strange behaviour on "SP.g..g.",
> which was changed to "SP...g."
> And this is the first problem I cannot solve.
>
> Then there's the problem of trailing dot removal.
> In
> http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8665.html
> I've found a somewhat similar problem, but it does not work in this case
> since:
> gsub("[.].*", "", str)
> [1] "y" "BD" "PR" "Ks" "SP" "P" "theta1" "theta2"
> [9] "AWC"
> And this is the second problem.
>
> Apart from these particular problems, I would like to know more about
> regexp, sub and so on, since each time
> I have strings to manipulate, I must face my ignorance of
> regular expressions and their syntax.
>
> Is there any page with examples, where I can improve my knowledge and
> stop being frustrated each time I have to manipulate strings?
>
> 8rino
>
> --
> Ottorino-Luca Pantani, Università di Firenze
> Dip. Scienza del Suolo e Nutrizione della Pianta
> P.zle Cascine 28 50144 Firenze Italia
> Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273
> [EMAIL PROTECTED]
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.