On Sat, 10 Jun 2006, Prof Brian Ripley wrote:

> ?regex does describe this:
>
>      A range of characters may be specified by giving the first and last
>      characters, separated by a hyphen.  (Character ranges are
>      interpreted in the collation order of the current locale.)
>
> You did not tell us your locale, but based on questions from you in the past
> I would guess en_NZ.utf8.  In that locale the collation order is wWxXyYzZ, so
> your surprise is explained.  (It seems the PCRE code is not using the same
> ordering in that locale.)
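To make the quoted explanation concrete, here is a minimal sketch (in Python, purely as a locale-independent simulation, not R's or PCRE's actual implementation) of how a range such as [w-z] expands under two orderings. The helper `range_members` and the `COLLATION` string are hypothetical names invented for this illustration; the collation string is an assumption modelled on the "wWxXyYzZ" pattern above, not read from a real locale.

```python
import string


def range_members(first, last, alphabet, key):
    """Return the characters of `alphabet` that fall between `first` and
    `last` (inclusive) when characters are ordered by `key`."""
    lo, hi = key(first), key(last)
    return [c for c in sorted(alphabet, key=key) if lo <= key(c) <= hi]


# ASSUMPTION: a collation order in which each lower-case letter is
# immediately followed by its upper-case form ("aAbB...wWxXyYzZ").
COLLATION = "".join(
    lower + upper
    for lower, upper in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def collation_key(c):
    # Position of the character in the assumed collation sequence.
    return COLLATION.index(c)


def codepoint_key(c):
    # Plain numerical order of the character's code point.
    return ord(c)


letters = string.ascii_letters

# Under the assumed collation order, W, X and Y lie inside w..z.
print(range_members("w", "z", letters, collation_key))
# ['w', 'W', 'x', 'X', 'y', 'Y', 'z']

# Under code-point order, only the four lower-case letters do.
print(range_members("w", "z", letters, codepoint_key))
# ['w', 'x', 'y', 'z']
```

Under the assumed collation order the range picks up W, X and Y (but not Z, which sorts after z), which is the kind of surprise described above; under code-point order no upper-case letter is included.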
Some digging shows that Perl does not say explicitly what order it uses (at least in the man pages on my system), but that PCRE uses (see man pcrepattern)

- numerical order of the bytes in a single-byte locale
- numerical order of Unicode code points in a UTF-8 locale,

whereas the basic/extended code uses the order set by the locale category LC_COLLATE and interpreted by the C function wcscoll (and byte order if that is not available).

Gabor Grothendieck <ggrothendieck at gmail.com> wrote:

> I get the same thing on "Version 2.3.1 Patched (2006-06-04 r38279)"
> but on "R version 2.2.1, 2005-12-20" it gives character(0), as
> expected, so there is some change between versions of R.  I am
> on Windows XP.

And a helpful person would have studied the CHANGES file before commenting!  It says:

     Internationalization
     --------------------

     There is no longer a separate 'East Asian' version of R.dll.

In R 2.2.1 the fully internationalized version behaved as 2.3.1 does, but the 8-bit-only version for Windows always used byte-order collation.  The difference is most likely that GG was using the 8-bit-only version, a Windows-specific issue.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
