On Mon, 8 Jan 2007, Hin-Tak Leung wrote:

> May I chip in at this point - I agree the bug report was invalid, but
> many of the replies were missing the point, as far as I see. It
> wasn't the backslash escape that "Fan" is *mainly* confused about
> (which he obviously is...), but the uses of the different brackets:
> [] ,() .
>
> He/She was expecting this:
>     egrep '[a|\-|c]' foo.txt
> to work the same as:
>     egrep '(a|\-|c)' foo.txt
>
> which they do not. They are totally different. (and he doesn't know
> the proper use of "|" either... so we basically have established that
> "Fan" doesn't understand how \, |, [] and () are used in
> regular expressions...).

And I think *you* have missed the point of the posting you quoted here,
which was to show how '[a|-|c]' actually worked in 'other software'.
How '[a|\-|c]' (or '(a|\-|c)') works in egrep is explicitly undefined by
POSIX, as I said in my original reply.

>
> HTL
>
> [EMAIL PROTECTED] wrote:
>> Both Solaris 8 grep and GNU grep 2.5.1 give
>>
>> gannet% cat > foo.txt
>> a-a
>> b
>> gannet% egrep '[d|-|c]' foo.txt
>> gannet% egrep '[-|c]' foo.txt
>> a-a
>>
>> agreeing exactly with R (and the POSIX standard) and contradicting 'Fan'.
>>
>>
>> On Thu, 4 Jan 2007, Fan wrote:
>>
>>> Let me detail a bit my bug report:
>>>
>>> the two commands ("expected" vs "strange") should return the
>>> same result, the objective of the commands is to test the presence
>>> of several characters, '-' included.
>>>
>>> The order in which we specify the different characters must not be
>>> an issue, i.e., to test the presence of several characters, including
>>> say char_1, the regular expressions [char_1|char_2|char_3] and
>>> [char_2|char_1|char_3] should play the same role. Other softwares
>>> work just like this.
>>>
>>> What's reported is that R actually returns different result for the
>>> character "-" (\- in a RE) regarding its position in the regular
>>> expression, and the "perl" option would not be relevant.
>>
>> As described in the relevant international standard and R's own
>> documentation.
>>
>>> Prof Brian Ripley wrote:
>>>> Why do you think this is a bug in R? You have not told us what you
>>>> expected, but the character range |-| contains only | . Not agreeing with
>>>> your expectations (unstated or otherwise) is not a bug in R.
>>>>
>>>> \- is the same as -, and - is special in character classes. (If it is
>>>> first or last it is treated literally.) And | is not a metacharacter
>>>> inside a character class.
>>>> Also,
>>>>
>>>>> grep("[d\\-c]", c("a-a","b"))
>>>> [1] 1 2
>>>>
>>>>> grep("[d\\-c]", c("a-a","b"), perl=TRUE)
>>>> [1] 1
>>>>
>>>> shows that escaping - works only in perl (which you will find from the
>>>> background references mentioned, e.g.
>>>>
>>>>   The interpretation of an ordinary character preceded by a backslash
>>>>   ('\') is undefined.
>>>>
>>>> .)
>>>>
>>>> This is all carefully documented in ?regexp, e.g.
>>>>
>>>>   Patterns are described here as they would be printed by 'cat': do
>>>>   remember that backslashes need to be doubled in entering R
>>>>   character strings from the keyboard.
>>>>
>>>>
>>>> This is not the first time you have wasted our resources with false bug
>>>> reports, so please show more respect for the R developers' time.
>>>> You were also explicitly asked not to report on obsolete versions of R.
>>>>
>>>> On Wed, 3 Jan 2007, [EMAIL PROTECTED] wrote:
>>>>
>>>>> Full_Name: FAN
>>>>> Version: 2.4.0
>>>>> OS: Windows
>>>>> Submission from: (NULL) (159.50.101.9)
>>>>>
>>>>>
>>>>> These are expected:
>>>>>
>>>>>> grep("[\-|c]", c("a-a","b"))
>>>>> [1] 1
>>>>>
>>>>>> gsub("[\-|c]", "&", c("a-a","b"))
>>>>> [1] "a&a" "b"
>>>>>
>>>>> but these are strange:
>>>>>
>>>>>> grep("[d|\-|c]", c("a-a","b"))
>>>>> integer(0)
>>>>>
>>>>>> gsub("[d|\-|c]", "&", c("a-a","b"))
>>>>> [1] "a-a" "b"
>>>>>
>>>>> Thanks
>>>>>
>>>>> ______________________________________________
>>>>> R-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
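
For anyone who wants to re-check the bracket-expression behaviour discussed in this thread, here is a minimal shell sketch. It assumes a POSIX-style grep that accepts -E and a system that provides mktemp; the C locale, the temporary file and the variable names r1, r2 and r3 are choices made for this illustration and do not come from the original posts. It deliberately avoids patterns containing '\-', since their meaning is undefined by POSIX.

```shell
#!/bin/sh
# Assumption: pin the C locale so that ranges in bracket
# expressions are interpreted in plain byte order.
LC_ALL=C
export LC_ALL

# Recreate the two-line foo.txt from the thread in a temporary file.
tmp=$(mktemp)
printf 'a-a\nb\n' > "$tmp"

# '|-|' is a range that contains only '|', so the class is {d, |, c}.
# Neither line contains any of those; '|| true' absorbs grep's
# non-zero exit status for "no match".
r1=$(grep -E '[d|-|c]' "$tmp" || true)

# A '-' placed first in the class is literal, so the class is {-, |, c}.
r2=$(grep -E '[-|c]' "$tmp" || true)

# Outside brackets, '|' is alternation: this matches a, - or c.
r3=$(grep -E '(a|-|c)' "$tmp" || true)

rm -f "$tmp"

printf '[d|-|c] gives: "%s"\n' "$r1"
printf '[-|c]   gives: "%s"\n' "$r2"
printf '(a|-|c) gives: "%s"\n' "$r3"
```

Under those assumptions the first pattern should print an empty result and the other two should print a-a, which is the same outcome as the Solaris and GNU egrep session quoted above.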