> The problem is solved - thank you.
>
> Thomas
>
>
>
> From:    Thomas Eckardt <[email protected]>
> To:      ASSP development mailing list <[email protected]>
> Date:    31.07.2011 12:04
> Subject: [Assp-test] help wanted for word stemming
>
>
>
> Hi all,
>
> I'm currently working on a word stemming engine for the assp Bayesian
> check. This engine converts words to their stem form, for example from
> plural, singular, future, present, past ...
>
> The Perl module 'Lingua::Stem' is used to do this.
>
> The languages currently supported by this module are:
>
>      DA          - Danish
>      DE          - German
>      EN          - English (also EN-US and EN-UK)
>      FR          - French
>      GL          - Galician
>      IT          - Italian
>      NO          - Norwegian
>      PT          - Portuguese
>      RU          - Russian (also RU-RU and RU-RU.KOI8-R)
>      SV          - Swedish
>
>
> It would be nice if this assp stemming engine could detect in which
> language the text to convert is written. Currently a default has to be set
> in the code.
>
> - For 'EN', the detection currently relies on the occurrence of any of these words:
> /\b(?:are|your?|she|here|his|he|there|this|these|have|has|the|those)\b/io
> - For 'DE' I'll find something similar - no problem
>
> What I need is a small list of common words that are unique(!!!) to each
> of the other languages. Any help is welcome.
>
> Thomas
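
For readers of the archive, the marker-word detection described in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code used in assp: the EN pattern is taken from the message, while the DE and FR word lists, the `%marker` hash and the `guess_language` helper are hypothetical and have not been checked for uniqueness across languages.

```perl
use strict;
use warnings;

# Marker words per language. The EN pattern comes from the quoted message;
# the DE and FR lists are illustrative assumptions, not vetted word lists.
my %marker = (
    EN => qr/\b(?:are|your?|she|here|his|he|there|this|these|have|has|the|those)\b/i,
    DE => qr/\b(?:und|nicht|ist|oder|auch|wir|eine?|das)\b/i,
    FR => qr/\b(?:les|une|dans|pour|avec|nous|vous|est)\b/i,
);

# Return the language code with the most marker-word hits,
# or $default when no marker word occurs at all.
# Languages are tried in sorted order, so on a tie the first one wins.
sub guess_language {
    my ($text, $default) = @_;
    my ($best, $best_hits) = ($default, 0);
    for my $lang (sort keys %marker) {
        my $hits = () = $text =~ /$marker{$lang}/g;   # count all matches
        if ($hits > $best_hits) {
            ($best, $best_hits) = ($lang, $hits);
        }
    }
    return $best;
}

print guess_language('Have you seen these offers? They are here.', 'EN'), "\n";
print guess_language('Wir haben das nicht bestellt und auch nicht bezahlt.', 'EN'), "\n";
```

Counting hits instead of stopping at the first match makes the guess a little more tolerant of words shared between languages. The resulting code could then presumably be handed to the stemmer, for example via `Lingua::Stem->new(-locale => $lang)`.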


Hi Thomas,

You do not *explain* how the problem was / is 'solved' :o)
Did you access an Angel gifted with tongues in languages?

Peter 


_______________________________________________
Assp-test mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-test
