I've had a new PHP extension ready to be added into PHP for a while now but I've never gotten around to offering it up.
It's a Porter suffix stemmer. We use it at work for a search engine we're building, and since we've been using PHP so much and benefiting from its open source nature, we've decided to try to give back a little.

There's only one function in the extension, porter(), which takes a string and returns its stem after stripping off the suffix (or suffixes). The prototype is:

    string porter(string word)

On success, the function returns word's stem in uppercase. On error, it returns "-1". Errors only arise when word cannot be stemmed, i.e. when it contains nonsense characters (any non-alphabetic character, meaning anything that isn't [a-zA-Z]).

A quick example:

    <?php
    print porter("assassin") . "\n";
    print porter("assassinate") . "\n";
    print porter("assassination") . "\n";
    print porter("assassinations") . "\n";
    print porter("assassinations111") . "\n";
    ?>

gives:

    ASSASSIN
    ASSASSIN
    ASSASSIN
    ASSASSIN
    -1

One problem I can see with the extension: it's partially written in C++, with an interface written in C so it can talk to PHP. It compiles fine on the latest 4.1.0 RCs (both 1 and 2) and seems to compile fine with 4.2.0dev. It does make compiling with Apache a bit weird, though. Apache will spit out errors about the C++ string library if you're using a C compiler (which you're pretty much forced to do with Apache). A quick remedy is to open up $APACHE_HOME/src/Makefile after you get the errors and change the line that reads "CC=gcc" to "CC=g++" (or whatever your C and C++ compilers are called).

Any interest in the extension? If so, it's up for grabs for inclusion into any future versions of PHP. If not, it's still up for grabs for anyone who would like to use it. Just drop me a line.

J

-- 
PHP Development Mailing List <http://www.php.net/>
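
P.S. Since porter() reports failure with the string "-1" rather than false, calling code probably wants a thin wrapper around it. Here is a minimal sketch of one way that might look. The helper names stem_or_null() and is_stemmable() are hypothetical and not part of the extension, and the pre-check only mirrors the [a-zA-Z] rule described above; it is an assumption that this matches the extension's own validation exactly.

```php
<?php
// Hypothetical helper (not part of the extension): pre-check a word against
// the [a-zA-Z] rule described in this post before handing it to porter().
function is_stemmable($word)
{
    return preg_match('/^[a-zA-Z]+$/', $word) === 1;
}

// Hypothetical helper (not part of the extension): returns the stem, or null
// when the word cannot be stemmed or the extension is not loaded.
function stem_or_null($word)
{
    // porter() only exists when the stemmer extension is compiled in.
    if (!function_exists('porter')) {
        return null;
    }

    $stem = porter($word);

    // The extension signals an error with "-1"; map that to null so callers
    // cannot mistake it for a real stem.
    return ((string) $stem === "-1") ? null : $stem;
}

foreach (array("assassinations", "assassinations111") as $word) {
    $stem = is_stemmable($word) ? stem_or_null($word) : null;
    print $word . " => " . ($stem === null ? "(not stemmable)" : $stem) . "\n";
}
?>
```

Returning null keeps the error case out of band, so an indexer can skip such tokens with a simple === null test instead of comparing against a magic string.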