On 20 May 2008, at 08:46, Tomeu Vizoso wrote:

> what about stemming the words? You may be able to use an english
> stemmer from xapian using the python bindings (not sure though).

Thanks Tomeu,

Yes, I was doing some very rudimentary stemming in my text parsing for
specific cases, but I'm still undecided, as some of the other texts
I've mapped show interesting usage patterns with the stems left as they are.

I've just had a quick Google and found the standard Porter Stemming
Algorithm written in Python by Vivake Gupta. I'll plug it in and see
how it goes – mailing list and wiki-type texts are probably less 'edited'
and noisier than the other texts I've been experimenting with :-)
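
For illustration, here is a rough sketch of how a stemmer might slot into
a word-count step. The suffix rules are a toy stand-in and not the Porter
algorithm, and the names toy_stem and count_words are hypothetical, made
up for this example only.

```python
from collections import Counter
import re

# Toy suffix rules, for illustration only; this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_words(text, stem=toy_stem):
    """Lowercase the text, split it into alphabetic tokens, stem, then count."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(stem(token) for token in tokens)


counts = count_words("Mapping maps and mapped texts; the map of a text")
for word, n in counts.most_common():
    print(word, n)
```

The stemmer is passed in as a plain callable, so a real implementation
could be swapped in without touching the counting code. The toy rules
leave "mapping" and "mapped" as "mapp" while "maps" becomes "map", which
is the kind of mismatch a proper stemmer is meant to avoid.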

--Gary
_______________________________________________
Sugar mailing list
[email protected]
http://lists.laptop.org/listinfo/sugar
