Hi,

OpenOffice.org test builds with these new features are available for thesaurus/dictionary developers:
Windows: http://hunspell.sourceforge.net/Windows080715/en-US.zip
Fedora 9 x86-64: http://hunspell.sourceforge.net/OOo_3.0.0_080716_unxlngx6_install.tar.gz

Regards,
László

2008/6/27 Németh László <[EMAIL PROTECTED]>:
> Hi,
>
> I guess I forgot to mention, I made a demo version from the standalone
> MyThes thesaurus with stemming and morphological generation half a
> year ago. It doesn't handle multiword expressions or general
> categories before parentheses, like the code in the CWS
> "hunspell4thesaurus", but it may be useful for dictionary developers:
>
> http://downloads.sourceforge.net/hunspell/MyThes-1.1.tar.gz
>
> See README.NEW and README for compiling.
>
> Test example
>
> Make an input.txt file with two lines, "rodents" and "consumed", and
> run MyThes with the test dictionary:
>
> ./example morph.idx morph.dat input.txt morph.aff morph.dic
>
> Thesaurus uses encoding ISO8859-1
>
> stem: rodent
> rodent has 1 meanings
> meaning 0: (n) mouse
> mice
>
> stem: consume
> consume has 1 meanings
> meaning 0: (v) eat
> eaten, ate
> ingested
>
> The example Hunspell dictionary (meanings of the morphological fields:
> po: part of speech category
> ts: terminal suffix
> al: allomorph
> st: stem
> is: inflectional suffix, see
> http://sourceforge.net/docman/display_doc.php?docid=29374&group_id=143754#Morphological%20analysis):
>
> $ cat morph.dic
> 8
> rodent/S po:n ts:nom
> mouse po:n al:mice ts:nom
> mice po:n st:mouse is:plur
> consume/TQD po:v ts:present
> ingest/TQD po:v ts:present
> eat/QT po:v al:ate al:eaten ts:present
> ate po:v st:eat is:past_1
> eaten po:v st:eat is:past_2
>
> $ cat morph.aff
> # example for morphological analysis, stemming and generation
> SFX D Y 4
> SFX D 0 ed [^e] is:past_1
> SFX D 0 d e is:past_1
> SFX D 0 ed [^e] is:past_2
> SFX D 0 d e is:past_2
>
> SFX S Y 1
> SFX S 0 s . is:plur
>
> SFX Q Y 1
> SFX Q 0 s . is:sg_3
>
> SFX T Y 2
> SFX T 0 ing [^e] is:pr_part
> SFX T e ing e is:pr_part
>
> and the thesaurus (without any extra morphological information):
>
> $ cat morph.dat
> ISO8859-1
> mouse|1
> (n)|rodent
> rodent|1
> (n)|mouse
> eat|1
> (v)|consume|ingest
> consume|1
> (v)|eat|ingest
> ingest|1
> (v)|eat|consume
>
> Regards,
> Laci
>
> 2008/6/23 Németh László <[EMAIL PROTECTED]>:
>> Hi Daniel,
>>
>> 2008/6/20 Daniel Naber <[EMAIL PROTECTED]>:
>>> On Friday, 20 June 2008, Németh László wrote:
>>>
>>>> "hunspell4thesaurus" contains Hunspell 1.2.4 and a thesaurus patch to
>>>> use Hunspell for stemming of the selected words and morphological
>>>> generation of the synonyms in OpenOffice.org 3.
>>>
>>> Hi Laci,
>>>
>>> thank you, that's great news! Please keep this list up-to-date about when
>>> this is available in a new build (because it can be quite difficult to
>>> follow the changes in the release notes).
>>
>> The CWS hunspell4thesaurus (and CWS hyphenator3 with the new compound
>> word hyphenation support) are finished and tested on my Linux, but QA
>> needs Linux and Windows test builds, too. I have no Windows build
>> environment, and it seems my recent Linux test builds have some
>> problems
>> (http://eis.services.openoffice.org/EIS2/cws.ShowCWS?Path=DEV300%2Fhunspell4thesaurus),
>> so any help is welcome.
>> I hope that within a few days I will have a newer Linux build environment
>> and can send a link to a working Linux test build to the list.
>> (But the standalone version of Hunspell is suitable for the dictionary
>> development.)
>>
>> Regards,
>> Laci
>>
>>>
>>> Regards
>>> Daniel
>>>
>>> --
>>> http://www.danielnaber.de
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
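
Note for dictionary developers: the stemming and generation steps in the test example (morph.aff, morph.dic, morph.dat) can be sketched in a few lines of Python. This is only a minimal illustration under simplifying assumptions, not the Hunspell or MyThes implementation: the tables are typed in by hand from the three files instead of being parsed, the strip field "0" is written as an empty string, and the function names forms, analyze and suggest are hypothetical.

```python
import re

# Suffix rules copied from morph.aff: flag -> [(strip, add, condition, tag)].
# The "0" strip field of the .aff file is written here as "".
SFX = {
    "D": [("", "ed", "[^e]", "is:past_1"), ("", "d", "e", "is:past_1"),
          ("", "ed", "[^e]", "is:past_2"), ("", "d", "e", "is:past_2")],
    "S": [("", "s", ".", "is:plur")],
    "Q": [("", "s", ".", "is:sg_3")],
    "T": [("", "ing", "[^e]", "is:pr_part"), ("e", "ing", "e", "is:pr_part")],
}

# Entries copied from morph.dic: word -> (flags, morphological fields).
DIC = {
    "rodent": ("S", ["po:n", "ts:nom"]),
    "mouse": ("", ["po:n", "al:mice", "ts:nom"]),
    "mice": ("", ["po:n", "st:mouse", "is:plur"]),
    "consume": ("TQD", ["po:v", "ts:present"]),
    "ingest": ("TQD", ["po:v", "ts:present"]),
    "eat": ("QT", ["po:v", "al:ate", "al:eaten", "ts:present"]),
    "ate": ("", ["po:v", "st:eat", "is:past_1"]),
    "eaten": ("", ["po:v", "st:eat", "is:past_2"]),
}

# Synonyms copied from morph.dat.
THES = {
    "mouse": ["rodent"], "rodent": ["mouse"],
    "eat": ["consume", "ingest"], "consume": ["eat", "ingest"],
    "ingest": ["eat", "consume"],
}


def forms(stem):
    """Yield (surface form, inflection tag) pairs derivable from one entry."""
    flags, fields = DIC[stem]
    # Regular forms: apply every suffix rule whose condition matches the end.
    for flag in flags:
        for strip, add, cond, tag in SFX[flag]:
            if stem.endswith(strip) and re.search("(?:" + cond + ")$", stem):
                yield stem[:len(stem) - len(strip)] + add, tag
    # Irregular forms: each al: field names an entry carrying its own is: tag.
    for field in fields:
        if field.startswith("al:"):
            irregular = field[3:]
            for f in DIC[irregular][1]:
                if f.startswith("is:"):
                    yield irregular, f


def analyze(word):
    """Map each possible stem of word to the set of matching inflection tags."""
    result = {}
    for stem in DIC:
        for form, tag in forms(stem):
            if form == word:
                result.setdefault(stem, set()).add(tag)
    return result


def suggest(word):
    """Look up synonyms of the stem and inflect them like the input word."""
    out = {}
    for stem, tags in analyze(word).items():
        for syn in THES.get(stem, []):
            out[syn] = sorted({f for f, tag in forms(syn) if tag in tags})
    return out


print(suggest("rodents"))   # {'mouse': ['mice']}
print(suggest("consumed"))  # {'eat': ['ate', 'eaten'], 'ingest': ['ingested']}
```

The sketch finds stems by generating every form of every entry and comparing, which is fine for eight entries but not how a real analyzer would work. It does show why the .dat file needs no extra morphological information: the inflection tags come entirely from the .aff and .dic files.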
