I have a text processing problem I'm hoping someone can help me solve. The
issue is this.
I have a character string from which I need to delete a variable number of
characters. The string itself contains the number of characters to delete:
each count is preceded by either a "+" or a "-", and the characters to be
deleted are the ones that immediately follow the count.
A toy example:
Suppose I have
x<-c("A-1CB-2GHX", "*+11gAgggTgtgggH")
> x
[1] "A-1CB-2GHX" "*+11gAgggTgtgggH"
What I need as output is
"ABX" "*H"
I know I can use gsub to remove the control character ("+" or "-") and the
number that follows it with
gsub("(\\-|\\+)([0-9]+)", replacement="", x)
However, I can't figure out how to delete the variable number of characters
after the number portion of the string.
Any ideas?
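One direction that might work, as a rough sketch only: loop over the string, find the first "+N" or "-N" marker with regexpr, and cut out the marker plus the N characters after it with substr. The helper name delete_counted is made up for illustration. The sketch assumes markers are processed left to right, so a "+" or "-" that falls inside a deleted stretch is never treated as a marker.

```r
# Hypothetical helper (name is illustrative): repeatedly find the first
# "+N" or "-N" marker and drop it together with the N characters after it.
delete_counted <- function(s) {
  repeat {
    m <- regexpr("[+-][0-9]+", s)
    pos <- as.integer(m)                # start of the marker, -1 if none
    if (pos == -1) break
    len <- attr(m, "match.length")      # length of sign plus digits
    n <- as.integer(substr(s, pos + 1, pos + len - 1))  # count to delete
    s <- paste(substr(s, 1, pos - 1),                   # text before marker
               substr(s, pos + len + n, nchar(s)),      # text after deleted run
               sep = "")
  }
  s
}

x <- c("A-1CB-2GHX", "*+11gAgggTgtgggH")
result <- sapply(x, delete_counted, USE.NAMES = FALSE)
print(result)
```

On the toy example this gives "ABX" "*H". It only uses regexpr, substr, paste and sapply from base R, so it should not depend on anything newer than the version listed below.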
In case this helps
> sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
Brian