To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=66939


User nemeth changed the following:

                What    |Old value                 |New value
================================================================================
                  Status|STARTED                   |RESOLVED
--------------------------------------------------------------------------------
              Resolution|                          |FIXED
--------------------------------------------------------------------------------
        Target milestone|OOo 3.x                   |OOo 3.1
--------------------------------------------------------------------------------




------- Additional comments from [email protected] Thu Feb 26 07:44:00 +0000 2009 -------
I have attached a compressed he_IL dictionary. I posted a letter to the
Lingucomponent development list about this improvement:

...
Languages with complex morphology can use Hunspell's second-level
affixation. A new tool, "doubleaffixcompress"
(http://downloads.sourceforge.net/hunspell/doubleaffixcompress),
compresses the output dictionary of the affixcompress script, or any
other Hunspell dictionary, using second-level affixes. For example, on
the old en_US dictionary of OpenOffice.org we got a compression rate
of about 50%:

$ doubleaffixcompress en_US
$ wc -l en_US.dic new_en_US.dic
 62157 en_US.dic
 30442 new_en_US.dic
$ grep abolish en_US.dic
abolisher/M
abolish/LZRSDG
abolishment/MS
$ grep abolish new_en_US.dic
abolish/5193,6535,64991,64993,64995,64996,64997,65001
$ grep '\(5193\|6535\)' new_en_US.aff
SFX  5193 Y 1
SFX  5193 0 er/64999 .
SFX  6535 Y 1
SFX  6535 0 ment/64997,64999 .
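
To make the second-level idea concrete, here is a minimal Python sketch
of how one compressed entry could expand into several word forms. It is
a toy model, not Hunspell's actual algorithm. The rules for flags 5193
and 6535 are copied from the SFX lines above; the rules for 64997 and
64999 are not shown in the excerpt, so the suffixes used for them here
("s" and "'s") are assumptions made only for illustration. The sketch
also ignores stripping and conditions, which are "0" and "." in the
rules above.

```python
# Toy model of two-level suffixation (NOT Hunspell's real implementation).
# Each rule maps a flag to (suffix to append, continuation flags).
SUFFIX_RULES = {
    "5193": ("er", ["64999"]),             # from: SFX 5193 0 er/64999 .
    "6535": ("ment", ["64997", "64999"]),  # from: SFX 6535 0 ment/64997,64999 .
    "64997": ("s", []),    # ASSUMED for illustration; not shown in the excerpt
    "64999": ("'s", []),   # ASSUMED for illustration; not shown in the excerpt
}


def expand(stem, flags, depth=2):
    """Return the stem plus every form reachable with at most `depth` suffixes."""
    forms = {stem}
    if depth == 0:
        return forms
    for flag in flags:
        suffix, continuation = SUFFIX_RULES[flag]
        # The continuation flags of a first-level suffix select the
        # second-level suffixes that may follow it.
        forms |= expand(stem + suffix, continuation, depth - 1)
    return forms


# Only two of the eight flags on the "abolish" entry are expanded here.
forms = expand("abolish", ["5193", "6535"])
print(sorted(forms))
```

Under these assumptions, the single stem "abolish" also yields the
"abolisher" and "abolishment" forms, which is why those two entries no
longer need their own lines in the .dic file.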

The result is more significant for the (too big) he_IL dictionary,
which recognizes more than 100 million Hebrew word forms:

$ LC_ALL=C doubleaffixcompress he_IL
$ wc he_IL.dic new_he_IL.dic
 329237  328996 3212113 he_IL.dic
 37913   37879 1940612 new_he_IL.dic
$ LC_ALL=C ~/hunspell-1.2.8/src/tools/makealias new_he_IL.{dic,aff}
output: new_he_IL_alias.dic, new_he_IL_alias.aff

Memory usage has been reduced from 19 MB to 5.5 MB by
doubleaffixcompress and makealias.
...
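
As a quick check of the figures quoted above, the following Python
sketch computes the percentage reductions from the wc output and the
memory numbers. The helper name `reduction` is hypothetical; the inputs
are copied from the transcripts.

```python
def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Line counts and byte counts come from the wc output quoted above.
en_us_lines = reduction(62157, 30442)     # en_US.dic entries
he_il_lines = reduction(329237, 37913)    # he_IL.dic entries
he_il_bytes = reduction(3212113, 1940612) # he_IL.dic size in bytes
memory = reduction(19.0, 5.5)             # reported memory usage in MB

for label, value in [
    ("en_US lines", en_us_lines),
    ("he_IL lines", he_il_lines),
    ("he_IL bytes", he_il_bytes),
    ("memory", memory),
]:
    print(f"{label}: {value:.1f}% smaller")
```

The en_US line count drops by roughly half, matching the 50% figure,
while he_IL loses a much larger share of its lines than of its bytes
because each remaining entry carries more flags.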

The long loading time has also been reduced to a few tenths of a second, so this
issue has been fixed.


---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

