> The problem is solved - thank you.
>
> Thomas
>
>
> From: Thomas Eckardt <[email protected]>
> To: ASSP development mailing list <[email protected]>
> Date: 31.07.2011 12:04
> Subject: [Assp-test] help wanted for word stemming
>
>
> Hi all,
>
> I'm currently working on a word stemming engine for the assp Bayesian
> check. This engine reduces words to their stem form, for example from
> plural, singular, future, present, past ....
>
> The Perl module 'Lingua::Stem' is used to do this.
>
> The languages currently supported by this module are:
>
> DA - Danish
> DE - German
> EN - English (also EN-US and EN-UK)
> FR - French
> GL - Galician
> IT - Italian
> NO - Norwegian
> PT - Portuguese
> RU - Russian (also RU-RU and RU-RU.KOI8-R)
> SV - Swedish
>
>
> It would be nice if this assp stemming engine could detect the language
> in which the text to convert is written. Currently a default has to be
> set in the code.
>
> - For 'EN', detection so far relies on the occurrence of any of these words:
>   /\b(?:are|your?|she|here|his|he|there|this|these|have|has|the|those)\b/io
> - For 'DE' I'll find something similar - no problem
>
> What I need is a small list of common, language-unique(!!!) words for
> the other languages. Any help is welcome.
>
> Thomas
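
For readers of the archive, the marker-word detection described in the quoted message could look roughly like the sketch below. It is written in Python rather than Perl and is not the ASSP code. The 'EN' pattern is copied from the message. The 'DE' word list, the function name detect_language, the 'EN' default, and the choice to count hits instead of stopping at the first match are all assumptions made for illustration.

```python
import re

# English marker words, copied from the regex in the quoted message.
EN_MARKERS = re.compile(
    r"\b(?:are|your?|she|here|his|he|there|this|these|have|has|the|those)\b",
    re.IGNORECASE,
)

# Hypothetical German marker words; the message leaves the DE list open.
DE_MARKERS = re.compile(
    r"\b(?:und|der|das|ist|nicht|ich|wir|eine|oder|auch)\b",
    re.IGNORECASE,
)

# Lingua::Stem-style locale codes mapped to their marker patterns.
MARKERS = {"EN": EN_MARKERS, "DE": DE_MARKERS}


def detect_language(text, default="EN"):
    """Return the locale whose marker words occur most often in text.

    Falls back to `default` when no marker word matches at all.
    On a tie, the locale listed first in MARKERS wins.
    """
    counts = {lang: len(rx.findall(text)) for lang, rx in MARKERS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else default


if __name__ == "__main__":
    print(detect_language("These are the messages he has sent"))
    print(detect_language("Ich bin nicht sicher, ob das auch geht"))
```

Further languages would be added as extra entries in MARKERS, which is where the language-unique word lists requested in the message would go.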
Hi Thomas,

You do not *explain* how the problem was / is 'solved' :o)

Did you access an Angel gifted with tongues in languages?

Peter

_______________________________________________
Assp-test mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-test
