To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=66939
User nemeth changed the following:
What            |Old value |New value
================================================================================
Status          |STARTED   |RESOLVED
--------------------------------------------------------------------------------
Resolution      |          |FIXED
--------------------------------------------------------------------------------
Target milestone|OOo 3.x   |OOo 3.1
--------------------------------------------------------------------------------
------- Additional comments from [email protected] Thu Feb 26 07:44:00 +0000 2009 -------
I have attached a compressed he_IL dictionary. I posted a letter to the
Lingucomponent development list about this improvement:
...
Languages with complex morphology can use the second-level affixation
of Hunspell. There is a new tool "doubleaffixcompress"
(http://downloads.sourceforge.net/hunspell/doubleaffixcompress) to
compress the output dictionary of the affixcompress script or other
Hunspell dictionaries using second-level affixes. For example, on the
old en_US dictionary of OpenOffice.org we got a 50% compression rate:
$ doubleaffixcompress en_US
$ wc -l en_US.dic new_en_US.dic
62157 en_US.dic
30442 new_en_US.dic
$ grep abolish en_US.dic
abolisher/M
abolish/LZRSDG
abolishment/MS
$ grep abolish new_en_US.dic
abolish/5193,6535,64991,64993,64995,64996,64997,65001
$ grep '\(5193\|6535\)' new_en_US.aff
SFX 5193 Y 1
SFX 5193 0 er/64999 .
SFX 6535 Y 1
SFX 6535 0 ment/64997,64999 .
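
The way a single compressed entry stands in for several old ones can be sketched in a few lines of Python. This is only a toy model of second-level suffixation, not Hunspell's implementation: it ignores the strip and condition fields of the SFX rules, and the meanings given to flags 64997 ("s") and 64999 ("'s") are assumptions made for illustration, since their rules are not shown in the grep output above. The names expand and SUFFIXES are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# flag -> (suffix text, continuation flags allowed after that suffix)
Suffixes = Dict[str, Tuple[str, List[str]]]


def expand(stem: str, flags: List[str], suffixes: Suffixes, depth: int = 2) -> Set[str]:
    """Return the stem plus every form reachable with at most `depth` suffixes."""
    forms = {stem}
    if depth == 0:
        return forms
    for flag in flags:
        if flag not in suffixes:
            continue  # flag has no rule in this toy table
        text, continuation = suffixes[flag]
        # The suffixed form may itself take the continuation flags (second level).
        forms |= expand(stem + text, continuation, suffixes, depth - 1)
    return forms


SUFFIXES: Suffixes = {
    # Taken from the new_en_US.aff lines shown above.
    "5193": ("er", ["64999"]),
    "6535": ("ment", ["64997", "64999"]),
    # ASSUMPTION: not shown in the excerpt; chosen to mirror the old S and M flags.
    "64997": ("s", []),
    "64999": ("'s", []),
}

forms = expand("abolish", ["5193", "6535"], SUFFIXES)
for form in sorted(forms):
    print(form)
```

Under these assumptions the single stem "abolish" yields the forms that the old dictionary needed the separate entries abolisher/M and abolishment/MS to cover.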
A more important result is the one for the (too big) he_IL dictionary,
which recognizes more than 100 million Hebrew word forms:
$ LC_ALL=C doubleaffixcompress he_IL
$ wc he_IL.dic new_he_IL.dic
329237 328996 3212113 he_IL.dic
37913 37879 1940612 new_he_IL.dic
$ LC_ALL=C ~/hunspell-1.2.8/src/tools/makealias new_he_IL.{dic,aff}
output: new_he_IL_alias.dic, new_he_IL_alias.aff
Memory usage has been reduced from 19 MB to 5.5 MB by
doubleaffixcompress and makealias.
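
For reference, the reductions implied by the figures quoted above can be recomputed with a small Python sketch. The input numbers are copied from the wc output and the memory figures in this comment; the helper name reduction_percent is hypothetical.

```python
def reduction_percent(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (1.0 - after / before) * 100.0


# Figures copied from the transcripts above.
en_us_lines = reduction_percent(62157, 30442)      # en_US.dic line count
he_il_lines = reduction_percent(329237, 37913)     # he_IL.dic line count
he_il_bytes = reduction_percent(3212113, 1940612)  # he_IL.dic size in bytes
he_il_memory = reduction_percent(19.0, 5.5)        # memory usage in MB

print(f"en_US lines:  {en_us_lines:.1f}% smaller")
print(f"he_IL lines:  {he_il_lines:.1f}% smaller")
print(f"he_IL bytes:  {he_il_bytes:.1f}% smaller")
print(f"he_IL memory: {he_il_memory:.1f}% smaller")
```

The line count of he_IL drops far more than its byte size, because each remaining line carries a longer flag list.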
...
Also, the long loading time has been reduced to a few tenths of a second, so
this issue has been fixed.
---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]