Hi,

OpenOffice.org test builds with these new features are available for
thesaurus/dictionary developers:

Windows: http://hunspell.sourceforge.net/Windows080715/en-US.zip
Fedora 9 x86-64:
http://hunspell.sourceforge.net/OOo_3.0.0_080716_unxlngx6_install.tar.gz

Regards,
László

2008/6/27 Németh László <[EMAIL PROTECTED]>:
> Hi,
>
> I guess I forgot to mention that half a year ago I made a demo version
> of the standalone MyThes thesaurus with stemming and morphological
> generation. Unlike the code in the CWS "hunspell4thesaurus", it doesn't
> handle multiword expressions or general categories before parentheses,
> but it may be useful for dictionary developers:
>
> http://downloads.sourceforge.net/hunspell/MyThes-1.1.tar.gz
>
> See README.NEW and README for compiling.
>
> Test example
>
> Make an input.txt file with two lines, "rodents" and "consumed", and
> run MyThes with the test dictionary:
> ./example morph.idx morph.dat input.txt morph.aff morph.dic
>
> Thesaurus uses encoding ISO8859-1
>
> stem: rodent
> rodent has 1 meanings
>   meaning 0: (n) mouse
>       mice
>
> stem: consume
> consume has 1 meanings
>   meaning 0: (v) eat
>       eaten, ate
>       ingested
>
> The example Hunspell dictionary (meanings of the morphological fields:
> po: part of speech category
> ts: terminal suffix
> al: allomorph
> st: stem
> is: inflectional suffix, see
> http://sourceforge.net/docman/display_doc.php?docid=29374&group_id=143754#Morphological%20analysis):
>
> $ cat morph.dic
> 8
> rodent/S        po:n        ts:nom
> mouse   po:n    al:mice ts:nom
> mice    po:n st:mouse        is:plur
> consume/TQD     po:v ts:present
> ingest/TQD      po:v ts:present
> eat/QT  po:v    al:ate  al:eaten        ts:present
> ate     po:v    st:eat  is:past_1
> eaten   po:v    st:eat  is:past_2
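
To make the field layout above concrete, here is a minimal Python sketch that splits one .dic line into the word, its affix flags and its morphological fields. The function name parse_dic_entry is hypothetical, and the sketch assumes single-character flags and whitespace-separated key:value fields, as in the example dictionary; it is not how Hunspell itself reads the file.

```python
def parse_dic_entry(line):
    """Split one .dic line into (word, flags, fields).

    Assumption: flags follow the first "/" and every later token is a
    key:value morphological field such as po:v or al:ate.
    """
    tokens = line.split()
    word, _, flags = tokens[0].partition("/")
    fields = {}
    for token in tokens[1:]:
        key, _, value = token.partition(":")
        # A key may repeat (eat has two al: fields), so collect lists.
        fields.setdefault(key, []).append(value)
    return word, flags, fields


if __name__ == "__main__":
    print(parse_dic_entry("eat/QT  po:v    al:ate  al:eaten        ts:present"))
    print(parse_dic_entry("mice    po:n st:mouse        is:plur"))
```

Collecting repeated keys into lists keeps both allomorphs of "eat" instead of letting the second overwrite the first.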
>
> $ cat morph.aff
> # example for morphological analysis, stemming and generation
> SFX D Y 4
> SFX D   0 ed [^e] is:past_1
> SFX D   0 d e     is:past_1
> SFX D   0 ed [^e] is:past_2
> SFX D   0 d e     is:past_2
>
> SFX S Y 1
> SFX S   0 s . is:plur
>
> SFX Q Y 1
> SFX Q   0 s . is:sg_3
>
> SFX T Y 2
> SFX T   0 ing [^e] is:pr_part
> SFX T   e ing e    is:pr_part
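
As an illustration of how the suffix rules above generate word forms, here is a small Python sketch. It assumes each rule line reads "SFX flag strip add condition fields", that "0" means nothing is stripped, and that the condition can be treated as a regular expression anchored at the end of the word. The helper names parse_suffix_rules and generate_forms are hypothetical; this is a toy model, not the Hunspell implementation.

```python
import re

# The affix rules quoted above, copied verbatim.
AFF_TEXT = """\
# example for morphological analysis, stemming and generation
SFX D Y 4
SFX D   0 ed [^e] is:past_1
SFX D   0 d e     is:past_1
SFX D   0 ed [^e] is:past_2
SFX D   0 d e     is:past_2

SFX S Y 1
SFX S   0 s . is:plur

SFX Q Y 1
SFX Q   0 s . is:sg_3

SFX T Y 2
SFX T   0 ing [^e] is:pr_part
SFX T   e ing e    is:pr_part
"""


def parse_suffix_rules(aff_text):
    """Return {flag: [(strip, add, condition, morph_fields), ...]}."""
    rules = {}
    for line in aff_text.splitlines():
        parts = line.split()
        # Header lines such as "SFX D Y 4" have only four fields; skip them.
        if len(parts) < 5 or parts[0] != "SFX":
            continue
        flag, strip, add, condition = parts[1:5]
        strip = "" if strip == "0" else strip
        add = "" if add == "0" else add
        rules.setdefault(flag, []).append((strip, add, condition, parts[5:]))
    return rules


def generate_forms(word, flag, rules):
    """Apply every rule of the given flag whose condition matches the word."""
    forms = []
    for strip, add, condition, morph in rules.get(flag, []):
        if not word.endswith(strip):
            continue
        if not re.search("(?:" + condition + ")$", word):
            continue
        base = word[: len(word) - len(strip)]
        forms.append((base + add, morph))
    return forms


RULES = parse_suffix_rules(AFF_TEXT)

if __name__ == "__main__":
    for stem, flag in [("rodent", "S"), ("consume", "T"), ("ingest", "D")]:
        print(stem, flag, generate_forms(stem, flag, RULES))
```

Note that flag D yields the same surface form twice, once tagged is:past_1 and once is:past_2, because the example defines separate rules for the two past forms.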
>
> and the thesaurus (without any extra morphological information):
>
> $ cat morph.dat
> ISO8859-1
> mouse|1
> (n)|rodent
> rodent|1
> (n)|mouse
> eat|1
> (v)|consume|ingest
> consume|1
> (v)|eat|ingest
> ingest|1
> (v)|eat|consume
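
For readers unfamiliar with the thesaurus data layout above, the following Python sketch reads it into a dictionary. It assumes the first line names the encoding, each entry starts with "word|count", and the next count lines each hold a category followed by "|"-separated synonyms. The function name parse_thesaurus is hypothetical, and the sketch ignores the separate .idx index file that MyThes uses for lookups.

```python
# The thesaurus data quoted above, copied verbatim.
DAT_TEXT = """\
ISO8859-1
mouse|1
(n)|rodent
rodent|1
(n)|mouse
eat|1
(v)|consume|ingest
consume|1
(v)|eat|ingest
ingest|1
(v)|eat|consume
"""


def parse_thesaurus(dat_text):
    """Return (encoding, {word: [(category, [synonyms]), ...]})."""
    lines = dat_text.splitlines()
    encoding = lines[0].strip()
    entries = {}
    i = 1
    while i < len(lines):
        if not lines[i].strip():
            i += 1
            continue
        # Entry header: the word, then the number of meaning lines.
        word, count = lines[i].rsplit("|", 1)
        count = int(count)
        meanings = []
        for line in lines[i + 1 : i + 1 + count]:
            category, *synonyms = line.split("|")
            meanings.append((category, synonyms))
        entries[word] = meanings
        i += 1 + count
    return encoding, entries


ENCODING, THESAURUS = parse_thesaurus(DAT_TEXT)

if __name__ == "__main__":
    print(ENCODING)
    print(THESAURUS["consume"])
```

Looking up the stem "consume" returns the synonyms "eat" and "ingest", which are the words the demo then inflects to match the input form.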
>
> Regards,
> Laci
>
> 2008/6/23 Németh László <[EMAIL PROTECTED]>:
>> Hi Daniel,
>>
>> 2008/6/20 Daniel Naber <[EMAIL PROTECTED]>:
>>> On Friday, 20 June 2008, Németh László wrote:
>>>
>>>> "hunspell4thesaurus" contains Hunspell 1.2.4 and a thesaurus patch to
>>>> use Hunspell for stemming of the selected words and morphological
>>>> generation of the synonyms in OpenOffice.org 3.
>>>
>>> Hi Laci,
>>>
>>> thank you, that's great news! Please keep this list up-to-date about when
>>> this is available in a new build (because it can be quite difficult to
>>> follow the changes in the release notes).
>>
>> The CWS hunspell4thesaurus (and the CWS hyphenator3, with the new
>> compound word hyphenation support) are finished and tested on my Linux
>> system, but QA needs Linux and Windows test builds, too. I have no
>> Windows build environment, and it seems my recent Linux test builds
>> have some problems
>> (http://eis.services.openoffice.org/EIS2/cws.ShowCWS?Path=DEV300%2Fhunspell4thesaurus),
>> so any help is welcome.
>> I hope to have a newer Linux build environment within a few days, so
>> that I can send the list a link to a working Linux test build.
>> (But the standalone version of Hunspell is suitable for dictionary
>> development.)
>>
>> Regards,
>> Laci
>>
>>
>>
>>>
>>> Regards
>>>  Daniel
>>>
>>> --
>>> http://www.danielnaber.de
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>
>>>
>>
>
