As those who've been paying attention to it know, our regular expression library is based on code originally developed by Henry Spencer and contributed by him to the Tcl project. We adopted it out of Tcl in 2003. Henry intended to package the code as a standalone library as well, but that never happened --- AFAICT, Henry dropped off the net around 2002, and I have no idea what happened to him.
Since then, we've been acting as though the Tcl guys are upstream maintainers for the regex code, but in point of fact there does not appear to be anybody there with more than the first clue about that code. This was brought home to me a few days ago when I started talking to them about possible ways to fix the quantified-backrefs problem that depesz recently complained of (which turns out to have been an open bug in their tracker since 2005). As soon as I betrayed any indication of knowing the difference between a DFA and an NFA, they offered me commit privileges :-(. And they haven't fixed any other significant bugs in the engine in years, either.
So I think it's time to face facts and accept that Tcl are not a useful upstream for the regex code. And we can't just let it sit quietly, because we have bugs to fix (at least the one) as well as enhancement requests such as the nearby discussion about recognizing high Unicode code points as letters.
A radical response to this would be to drop the Spencer regex engine and use something else instead --- probably PCRE, because there are not all that many alternatives out there. I do not care much for this idea though. It would be a significant amount of work in itself, and there's no real guarantee that PCRE will continue to be maintained either, and there would be some user-visible compatibility issues because the regex flavor is a bit different. A larger point is that it'd be a real shame for the Spencer regex engine to die off, because it is in fact one of the best pieces of regex technology on the planet. See Jeffrey Friedl's "Mastering Regular Expressions" (O'Reilly) --- at least, that's what he thought in the 2002 edition I have, and it's unlikely that things have changed much.
So I'm feeling that we gotta suck it up and start acting like we are the lead maintainers for this code, not just consumers.
Another possible long-term answer is to finish the work Henry never did, that is, make the code into a standalone library. That would make it available to more projects and perhaps attract other people to help maintain it. However, that looks like a lot of work too, with distant and uncertain payoff.
Comments, other ideas?
regards, tom lane