On Fri, 24 Oct 2003, Kurt Hornik wrote:

> >>>>> Prof Brian Ripley writes:
>
> > A couple of weeks back there was some discussion about documenting the
> > regular expressions as used in R. Several years ago the problem was
> > that this was OS-dependent, and to plug that problem we incorporated
> > regexp code from a version of GNU grep, later updated to grep-2.4.2 in
> > R 1.2.0.
>
> > I have been looking at documenting what grep(perl=TRUE) does, and we
> > have a similar problem in that the current PCRE, 4.4, implements
> > rather more of Perl's regexps than 3.9 (which is in 1.8.0 if the OS
> > does not supply it, and RH8.0 has PCRE 3.9. Whichever version of
> > Debian is on franz has PCRE 3.4).
>
> > I could add a configure check for PCRE >= 4.0, and I think probably
> > should do that. However, my inclination is to always use the version
> > of PCRE in the R sources and thereby ensure that all builds of R have
> > the same version, the one I will document. Comments, please.
>
> I think we should in any case allow maintainers of binary packages on
> platforms with advanced package management systems to force the use of
> shared libraries the system can provide. (So the binary maintainers
> would need to verify that the system package provides the right libs and
> headers.)
>
> Not sure about the default: we typically try to use available system
> resources, unless this is bound to cause problems, and regex was of the
> latter type, afaicr.

With a configure check for >= 4.0 I am reasonably happy to have
--without-pcre as the default and allow --with-pcre at people's peril.

> > For PCRE 4.4 there is a long man page that I will use as a basis for
> > the documentation. I am inclined just to include either a text or PDF
> > version of the man page -- any preferences for which form?
>
> Depends on where you would put the docs, I think. Btw, where can 4.4 be
> found?

At the ftp site mentioned on ?grep, at least earlier this week.

> > For the non-Perl regexps it is harder, as I am unsure exactly what
> > patterns the GNU regex we have accepts. (From a problem which
> > occurred with some Sweave regexps, I think it accepts more than it is
> > intended to.) One fairly good docu source is the GNU grep man page:
> > does anyone know a better one? I had thought of writing a regexp.Rd
> > help page to which grep.Rd could refer.
>
> That would be great. Linux has a regex(7) purported to be "taken from
> Henry Spencer's regex package", which might be used as a start. The old
> GNU regex .tar.gz has a texinfo file, but does not help for what we
> need, I think.

The GNU grep 2.4.2 man page and texinfo file give me enough, except I
don't understand them well enough. (What is said about extended vs basic
expressions is unclear at best). The Solaris 8 man pages are better and
they do document POSIX regexps, so I will use some of their ideas.

> [I recently looked for available regexp docs, but was not too
> successful.]
>
> > None of this is imminent (I am too busy) but is intended for the next
> > minor release (which may be called 1.9.0 or 2.0.0, I gather).
>
> Too bad :-(

I might try to put regexp.Rd (I have a start) in 1.8.1 then. But the
PCRE stuff will need to wait for R-devel's release.

Brian

-- 
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel