Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-31 Thread Jeff Newmiller
There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help page for specific tools. After you have waded through a tutorial, the following

Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-31 Thread Fabrice Tourre
The 0 and 1 mean zero or one match. Want to remove the word Energy? gsub("( Energy){0,1},{0,1} Inc[.]{0,1}", "", DF) On Thu, May 31, 2012 at 11:45 AM, mdvaan mathijsdev...@gmail.com wrote: Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page
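As a minimal sketch of the quantifier described in this reply: the vector is reconstructed from the original question with its quotes restored, and the names `long_form` and `short_form` are illustrative only. `{0,1}` and `?` both mean "zero or one occurrence", so the two patterns should behave the same way.

```r
# Example vector from the original question (quotes restored)
DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

# ( Energy){0,1}  optionally match the word " Energy"
# ,{0,1}          optionally match a comma
# " Inc"          match a space followed by "Inc"
# [.]{0,1}        optionally match a literal period
long_form <- gsub("( Energy){0,1},{0,1} Inc[.]{0,1}", "", DF)

# The same pattern written with the shorter ? quantifier
short_form <- gsub("( Energy)?,? Inc[.]?", "", DF)

print(long_form)
print(identical(long_form, short_form))
```

Note that this variant also strips " Energy" from the third element, which is what the reply proposes but not what the original question asked for.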

Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-31 Thread mdvaan
Thank you very much. This definitely helps me out. Math. Jeff Newmiller wrote: There are many resources for learning regular expressions (e.g. http://gnosis.cx/publish/programming/regular_expressions.html). Once you understand the basics you will probably be able to refer to the ?regex help

[R] gsub/strsplit with multiple patterns/splits

2012-05-30 Thread mdvaan
Hi, I have a vector like this: DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc") For each element in the vector I would like to remove the incorporated info, so that my vector looks like this: DF <- c("Aetna", "Alexander's", "Allegheny Energy") That means that I have to strip: strip <-

Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-30 Thread jim holtman
Try this where you qualify how many characters you might match: gsub(",{0,1} Inc[.]{0,1}", "", DF) [1] "Aetna" "Alexander's" "Allegheny Energy" On Wed, May 30, 2012 at 6:05 PM, mdvaan mathijsdev...@gmail.com wrote: Hi, I have a vector like this: DF <- c("Aetna, Inc.", "Alexander's Inc.",
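A self-contained version of the suggested call, offered as a sketch: the input vector is taken from the original question with its quotes restored, and the name `cleaned` is an illustrative choice, not something from the thread.

```r
# Example vector from the original question (quotes restored)
DF <- c("Aetna, Inc.", "Alexander's Inc.", "Allegheny Energy, Inc")

# Remove an optional comma, then " Inc", then an optional period.
# [.] puts the period in a character class so it is matched literally
# instead of acting as the "any character" wildcard.
cleaned <- gsub(",{0,1} Inc[.]{0,1}", "", DF)

print(cleaned)
```

Because the comma and the period are both optional, a single pattern covers all three spellings (", Inc.", " Inc." and ", Inc"), so no separate strsplit step is needed.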

Re: [R] gsub/strsplit with multiple patterns/splits

2012-05-30 Thread mdvaan
Thanks! That works like a charm, but I am not sure if I fully understand the syntax. I looked at the gsub page but still couldn't figure it out. What does the pattern part (",{0,1} Inc[.]{0,1}") do? What do the 0 and 1 within the curly brackets refer to? Also, what if, for example, I would want to