Re: [R] regex - negate a word

2009-01-20 Thread Wacek Kusnierczyk
Prof Brian Ripley wrote: On Mon, 19 Jan 2009, Rolf Turner wrote: On 19/01/2009, at 10:44 AM, Gabor Grothendieck wrote: Well, that's why it was only provided when you insisted. This is not what regexp's are good at. On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland r...@demogr.mpg.de wrote:

Re: [R] regex - negate a word

2009-01-20 Thread Wacek Kusnierczyk
Wacek Kusnierczyk wrote: attached are patches to character.c, names.c, and grep.R; if you tell me forgot to add: the patches are against the latest r-devel (19.01.2009). compiled and tested on 32b Ubuntu 8.04. vQ __ R-help@r-project.org

Re: [R] regex - negate a word

2009-01-19 Thread Wacek Kusnierczyk
Rolf Turner wrote: On 19/01/2009, at 10:44 AM, Gabor Grothendieck wrote: Well, that's why it was only provided when you insisted. This is not what regexp's are good at. On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland r...@demogr.mpg.de wrote: Thanks! (I have to admit, though, that I expected

Re: [R] regex - negate a word

2009-01-19 Thread Wacek Kusnierczyk
Stavros Macrakis wrote: On Sun, Jan 18, 2009 at 2:22 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote: x[-grep("abc", x)] which unfortunately fails if none of the strings in x matches the pattern, i.e., grep returns integer(0); Yes. arguably, x[integer(0)]
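
The failure mode described in this message can be reproduced with a short sketch. The vector `x` is the one from the original question; the pattern "zzz" and the names `no_hit` and `dropped` are illustrative assumptions, not taken from the thread.

```r
x <- c("abcdef", "defabc", "qwerty")

# Normal case: grep() finds matches and negative indexing drops them.
kept <- x[-grep("abc", x)]

# No-match case: grep() returns integer(0). Negating it still gives
# integer(0), so the subscript selects nothing instead of keeping everything.
no_hit <- grep("zzz", x)
dropped <- x[-no_hit]

print(kept)
print(length(dropped))
```

The second result is an empty character vector even though no element of `x` matched, which is the opposite of what "remove the matches" should mean.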

Re: [R] regex - negate a word

2009-01-19 Thread Prof Brian Ripley
On Mon, 19 Jan 2009, Rolf Turner wrote: On 19/01/2009, at 10:44 AM, Gabor Grothendieck wrote: Well, that's why it was only provided when you insisted. This is not what regexp's are good at. On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland r...@demogr.mpg.de wrote: Thanks! (I have to admit,

[R] regex - negate a word

2009-01-18 Thread Rau, Roland
Dear all, let's assume I have a vector of character strings: x <- c("abcdef", "defabc", "qwerty") What I would like to find is the following: all elements where the word 'abc' does not appear (i.e. 3 in this case of 'x'). Since I am not really experienced with regular expressions, I started slowly

Re: [R] regex - negate a word

2009-01-18 Thread jim holtman
Just remove those elements that match: x <- c("abcdef", "defabc", "qwerty") x[-grep('abc',x)] [1] "qwerty" On Sun, Jan 18, 2009 at 1:35 PM, Rau, Roland r...@demogr.mpg.de wrote: Dear all, let's assume I have a vector of character strings: x <- c("abcdef", "defabc", "qwerty") What I would like to find

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Rau, Roland wrote: Dear all, let's assume I have a vector of character strings: x <- c("abcdef", "defabc", "qwerty") What I would like to find is the following: all elements where the word 'abc' does not appear (i.e. 3 in this case of 'x'). a quick shot is: x[-grep("abc", x)] which

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
Try this: # indexes setdiff(seq_along(x), grep("abc", x)) # values setdiff(x, grep("abc", x, value = TRUE)) Another possibility is: z <- "abc" x0 <- c(x, z) # to handle no match case x0[-grep(z, x0)] # values On Sun, Jan 18, 2009 at 1:35 PM, Rau, Roland r...@demogr.mpg.de wrote: Dear all,
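
The `setdiff` approach in this message also covers the no-match case, because removing an empty set leaves everything in place. A minimal sketch follows; the variable names and the pattern "zzz" are assumptions for illustration.

```r
x <- c("abcdef", "defabc", "qwerty")

# Indexes and values of the elements that do NOT contain "abc".
idx_abc <- setdiff(seq_along(x), grep("abc", x))
val_abc <- setdiff(x, grep("abc", x, value = TRUE))

# No-match case: grep() returns integer(0), so all indexes survive.
idx_none <- setdiff(seq_along(x), grep("zzz", x))

# Caveat: setdiff() treats its arguments as sets, so the value form
# collapses duplicated strings; the index form does not have this problem.
dup <- setdiff(c("qwerty", "qwerty"), character(0))

print(idx_abc)
print(val_abc)
print(idx_none)
print(dup)
```

If duplicates in `x` matter, indexing with `x[idx_abc]` is the safer of the two forms.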

Re: [R] regex - negate a word

2009-01-18 Thread Eric Archer
Roland, I think you were almost there with your first example. How about using: x <- c("abcdef", "defabc", "qwerty") y <- grep(pattern="abc", x=x) z.char <- x[-y] z.index <- (1:length(x))[-y] z.char [1] "qwerty" z.index [1] 3 Cheers, eric Rau, Roland wrote: Dear all, let's assume I have a vector

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Jorge Ivan Velez wrote: Hi Wacek, I think you wanted to say strings instead x in your last line : ) of course, thanks. the correct version is: if(length(matching <- grep(pattern, strings))) strings[-matching] else strings btw., and in relation to a recent post complaining about
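
The guarded expression in this message can be wrapped in a small helper. This is only a sketch: the function name `drop_matching` is hypothetical and does not appear in the thread, and extra arguments are simply forwarded to `grep`.

```r
# Return the elements of `strings` that do not match `pattern`.
# The length() check avoids negative indexing with integer(0).
drop_matching <- function(pattern, strings, ...) {
  matching <- grep(pattern, strings, ...)
  if (length(matching)) strings[-matching] else strings
}

x <- c("abcdef", "defabc", "qwerty")
some_removed <- drop_matching("abc", x)   # matches exist
none_removed <- drop_matching("zzz", x)   # no matches: x comes back unchanged

print(some_removed)
print(none_removed)
```

Unlike appending the pattern to the input, this does not rely on the pattern matching itself, so it behaves the same for arbitrary regular expressions.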

Re: [R] regex - negate a word

2009-01-18 Thread Rau, Roland
[mailto:ggrothendi...@gmail.com] Sent: Sun 1/18/2009 8:28 PM To: Rau, Roland Cc: r-help@r-project.org Subject: Re: [R] regex - negate a word Try this: # indexes setdiff(seq_along(x), grep("abc", x)) # values setdiff(x, grep("abc", x, value = TRUE)) Another possibility is: z <- "abc" x0 <- c(x, z

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
if there is really no regular expression which does the job?!? Thanks again, Roland -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Sun 1/18/2009 8:28 PM To: Rau, Roland Cc: r-help@r-project.org Subject: Re: [R] regex - negate a word Try

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote: Try this: # values setdiff(x, grep("abc", x, value = TRUE)) Another possibility is: z <- "abc" x0 <- c(x, z) # to handle no match case x0[-grep(z, x0)] # values on quick testing, these two and the if-based version have comparable runtime, with a minor win for

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
In that case just add fixed = TRUE On Sun, Jan 18, 2009 at 2:58 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote: Gabor Grothendieck wrote: Try this: # values setdiff(x, grep("abc", x, value = TRUE)) Another possibility is: z <- "abc" x0 <- c(x, z) # to handle no match case

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote: In that case just add fixed = TRUE in general, if you want a complex pattern, you don't use 'fixed', and then again you risk incorrect (well, correct for r, but not for the problem) result in case no input string matches the pattern. vQ

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote: Try this: grep("^([^a]|a[^b]|ab[^c])*.{0,2}$", x, perl = TRUE) ... and see how cumbersome it becomes for a pattern as trivial as 'abc'. in perl, you typically don't invent such negative patterns, but rather don't match positive patterns: instead of the match
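
For comparison, a Perl-compatible negative lookahead expresses "does not contain abc" inside the pattern itself. This alternative is not proposed in the visible part of the thread; the sketch assumes `perl = TRUE` so that PCRE lookahead syntax is available, and the variable names are illustrative.

```r
x <- c("abcdef", "defabc", "qwerty")

# "^(?!.*abc)" succeeds at the start of a string only if "abc" does not
# occur anywhere after that position, i.e. nowhere in the string.
idx <- grep("^(?!.*abc)", x, perl = TRUE)
vals <- grep("^(?!.*abc)", x, perl = TRUE, value = TRUE)

# When nothing contains the word, every element is returned.
all_idx <- grep("^(?!.*zzz)", x, perl = TRUE)

print(idx)
print(vals)
print(all_idx)
```

Because the negation lives in the regular expression, the no-match pitfall of `x[-grep(...)]` does not arise, at the cost of a less readable pattern.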

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Wacek Kusnierczyk wrote: On Sun, Jan 18, 2009 at 2:37 PM, Rau, Roland r...@demogr.mpg.de wrote: Thank you very much to all of you for your fast and excellent help. Since the -grep(...) solution seems to be favored by most of the answers, I just wonder if there is really no regular

Re: [R] regex - negate a word

2009-01-18 Thread Wacek Kusnierczyk
Wacek Kusnierczyk wrote: # r code ungrep = function(pattern, x, ...) grep(paste(pattern, "(*COMMIT)(*FAIL)|(*ACCEPT)", sep=""), x, perl=TRUE, ...) strings = c("abc", "xyz") pattern = "a[a-z]" (filtered = strings[ungrep(pattern, strings)]) # "xyz" this was a toy example, but if you need this

Re: [R] regex - negate a word

2009-01-18 Thread Rau, Roland
Thanks! (I have to admit, though, that I expected something simple) Thanks, Roland -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Sun 1/18/2009 8:54 PM To: Rau, Roland Cc: r-help@r-project.org Subject: Re: [R] regex - negate a word Try this: grep

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Sun 1/18/2009 8:54 PM To: Rau, Roland Cc: r-help@r-project.org Subject: Re: [R] regex - negate a word Try this: grep("^([^a]|a[^b]|ab[^c])*.{0,2}$", x, perl = TRUE) On Sun, Jan 18, 2009 at 2:37 PM, Rau, Roland r...@demogr.mpg.de wrote

Re: [R] regex - negate a word

2009-01-18 Thread Rolf Turner
On 19/01/2009, at 10:44 AM, Gabor Grothendieck wrote: Well, that's why it was only provided when you insisted. This is not what regexp's are good at. On Sun, Jan 18, 2009 at 4:35 PM, Rau, Roland r...@demogr.mpg.de wrote: Thanks! (I have to admit, though, that I expected something simple)

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
That's an entirely different point from whether regular expressions can do it as grep -v is just another way to do it without using a regular expression to specify the entire job. On Sun, Jan 18, 2009 at 5:02 PM, Rolf Turner r.tur...@auckland.ac.nz wrote: On 19/01/2009, at 10:44 AM, Gabor
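
The `grep -v` behaviour mentioned here has direct counterparts in current versions of R: the `invert` argument of `grep` and the logical form `!grepl(...)`. Neither appears in the visible snippets of this thread, so the sketch below assumes an R version that provides both.

```r
x <- c("abcdef", "defabc", "qwerty")

# invert = TRUE returns the indexes (or values) of NON-matching elements.
idx <- grep("abc", x, invert = TRUE)
vals <- grep("abc", x, invert = TRUE, value = TRUE)

# grepl() returns a logical vector, so negation is safe even when
# nothing matches: !c(FALSE, FALSE, FALSE) keeps every element.
vals_logical <- x[!grepl("abc", x)]
all_kept <- x[!grepl("zzz", x)]

print(idx)
print(vals)
print(vals_logical)
print(all_kept)
```

Both forms keep the pattern positive and move the negation outside the regular expression, which is the division of labour argued for in this part of the discussion.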

Re: [R] regex - negate a word

2009-01-18 Thread Stavros Macrakis
On Sun, Jan 18, 2009 at 2:22 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote: x <- c("abcdef", "defabc", "qwerty") ...[find] all elements where the word 'abc' does not appear (i.e. 3 in this case of 'x'). x[-grep("abc", x)] which unfortunately fails if none of the strings in x

Re: [R] regex - negate a word

2009-01-18 Thread Gabor Grothendieck
Note that the variation of this that I posted already handles that case. On Sun, Jan 18, 2009 at 5:32 PM, Stavros Macrakis macra...@alum.mit.edu wrote: On Sun, Jan 18, 2009 at 2:22 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote: x <- c("abcdef", "defabc", "qwerty") ...[find] all