Hello all,
Let's say I have a character string
"Race-ethnicity-----coding information"
I want to extract all text before the multiple dashes, including the word
"ethnicity."
I wrote a handy function to extract the first matched text:
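grepcut <- function(pattern, x) {
  # regexpr() returns the start of the first match; the length of that
  # match is stored in the "match.length" attribute
  start.and.length <- regexpr(pattern, x)
  substring(x, start.and.length,
            start.and.length + attr(start.and.length, "match.length") - 1)
}
grepcut("^[^-]+", "Race-ethnicity-----coding information")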
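For reference, here is a small sketch of what regexpr() hands back for this pattern, since grepcut depends on both pieces. The regmatches() call is only there for comparison and assumes a version of R recent enough to include it.

```r
x <- "Race-ethnicity-----coding information"
m <- regexpr("^[^-]+", x)

as.integer(m)            # start position of the first match: 1
attr(m, "match.length")  # length of that match: 4
regmatches(x, m)         # the matched text itself: "Race"
```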
The above grepcut call, of course, returns only the string "Race". What I
really want is to create a class that matches two dashes in a row and then
negate it. Is it possible to create a class of repeated characters? If so,
it might be further complicated by the fact that "-" is a special character
inside brackets and is only taken literally when it comes first or last.
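To pin down the result I am after, here is a rough sketch that gets there by other routes rather than by negating a two-dash class. It is not a definitive solution. It assumes the separator is always a run of two or more dashes, and the variable names are made up for illustration.

```r
x <- "Race-ethnicity-----coding information"

# Route 1: delete everything from the first run of two or more dashes onward.
before.dashes <- sub("-{2,}.*$", "", x)

# Route 2: match from the start, allowing either a non-dash character or a
# single dash that is immediately followed by a non-dash character.
m <- regexpr("^([^-]|-[^-])+", x)
first.part <- regmatches(x, m)

before.dashes  # "Race-ethnicity"
first.part     # "Race-ethnicity"
```

Note that route 1 returns the whole string unchanged when there is no run of two or more dashes, and route 2 will not include a single dash that sits at the very end of the string.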
Can anyone help me out?
Thanks,
Michael Young
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.