On 20 May 2008, at 08:46, Tomeu Vizoso wrote:

> what about stemming the words? You may be able to use an english
> stemmer from xapian using the python bindings (not sure though).

Thanks Tomeu,

Yes, I was doing some very rudimentary stemming in my text parsing for specific cases, but I was still undecided, as some of the other texts I've mapped have interesting usage patterns with the stems left as is. I've just had a quick google and found the standard Porter Stemming Algorithm written in Python by Vivake Gupta. I'll plug it in and see how it goes. Mailing list and wiki-type texts are probably less 'edited' and noisier than the other texts I've been experimenting with :-)

--Gary

_______________________________________________
Sugar mailing list
[email protected]
http://lists.laptop.org/listinfo/sugar

