[lingucomponent-issues] [Issue 61459] New - Hunspell 1.1.3 with some new fixes

nemeth Tue, 31 Jan 2006 16:47:28 -0800

To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=61459
                  Issue #:|61459
                  Summary:|Hunspell 1.1.3 with some new fixes
                Component:|lingucomponent
                  Version:|OOo 2.0.1
                 Platform:|All
                      URL:|
               OS/Version:|All
                   Status:|NEW
        Status whiteboard:|
                 Keywords:|
               Resolution:|
               Issue type:|PATCH
                 Priority:|P3
             Subcomponent:|spell checking
              Assigned to:|nemeth
              Reported by:|nemeth


------- Additional comments from [EMAIL PROTECTED] Tue Jan 31 16:47:27 -0800 2006 -------
In the CWS "hunspell01".

Improvements: tokenisation and COMPOUNDRULE fixes, improved suggestions,
and German ß handling. Optional alias compression in data files (useful for
the Arabic dictionary and other affix-rich languages).

Changelog:

2006-01-30 Németh László <[EMAIL PROTECTED]>:
        * src/parsers/textparser.cxx: fix Unicode tokenization in is_wordchar()
          (extra word characters (WORDCHARS) didn't work on big-endian
          platforms).
          
        * src/hunspell/{csutil,affixmgr}.cxx: inline isSubset(), isRevSubset():
          small speed optimization for languages with rich morphology.

        * src/tools/hunspell.cxx: fix broken compilation with --with-ui and
          --with-readline when (N)curses is missing. Reported by Daniel Naber.

2006-01-19 Tor Lillqvist <[EMAIL PROTECTED]>
        * src/hunspell/csutil.cxx: mystrsep(): fix locale-dependent isspace()
          tokenisation

2006-01-06 András Tímár <[EMAIL PROTECTED]>
        * src/hunspell/{hashmgr.hxx,hunspell.cxx}: fix Visual C++ compilation
          errors

2006-01-05 Németh László <[EMAIL PROTECTED]>:
        * COPYING: set GPL/LGPL/MPL tri-license for Mozilla integration.
          Rationale: Mozilla source code contains an old MySpell version
          with GPL/LGPL/MPL tri-license. (The MPL is a copyleft license
          similar to the LGPL, but it applies at the file level.)
        * COPYING.LGPL: GNU Lesser General Public License 2.1 (LGPL)
        * COPYING.MPL: Mozilla Public License 1.1 (MPL)
        * license.hunspell, src/hunspell/license.hunspell: GPL/LGPL/MPL
          tri-license

        * src/hunspell/{affixmgr,hashmgr}.*: AF, AM alias definitions in the
          affix file: compression of flag sets and morphological descriptions
          (see manual, and tests/alias* test files).
          Rationale: alias compression also improves loading time and memory
          efficiency, not only resource size.
        * src/tools/makealias: alias compression utility
          (usage: ./makealias file.dic file.aff)
        * tests/alias{,2,3}: AF, AM tests
        * man/hunspell.4: add AF, AM documentation
        * src/hunspell/affentry.cxx, atypes.hxx: add new opts bits (aeALIASM,
          aeALIASF)
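
The idea behind AF flag aliases can be illustrated with a short Python sketch. It is not the makealias utility and does not reproduce its algorithm; the function name `compress_flags` is hypothetical, and the sketch assumes one entry per line in `word/FLAGS` form with no escaped slashes. Each distinct flag string is replaced by a 1-based number that points into an `AF` table.

```python
def compress_flags(entries):
    """Replace each distinct flag string with a 1-based numeric alias.

    Hypothetical helper for illustration only. Flag strings are compared
    as written, so "AB" and "BA" would get different aliases.
    """
    aliases = []      # alias table, in order of first appearance
    index = {}        # flag string -> alias number
    compressed = []   # rewritten dictionary entries
    for line in entries:
        word, sep, flags = line.partition("/")
        if not sep:
            # Entry without flags stays unchanged.
            compressed.append(word)
            continue
        if flags not in index:
            aliases.append(flags)
            index[flags] = len(aliases)
        compressed.append(f"{word}/{index[flags]}")
    # First AF line gives the number of aliases, then one alias per line.
    aff_lines = [f"AF {len(aliases)}"] + [f"AF {f}" for f in aliases]
    return aff_lines, compressed


aff, dic = compress_flags(["hello", "try/B", "work/AB", "walk/AB"])
print("\n".join(aff))
print("\n".join(dic))
```

Entries that share a flag set end up sharing one alias number, which is where the savings in size and loading time would come from.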

        * tools/hunspell, src/parser/*, src/hunspell/*: the Hunspell program
          tokenizes Unicode texts (only with UTF-8 encoded dictionaries).
          Missing Unicode tokenization reported by Björn Jacke, Egmont
          Koblinger, Jess Body and others.
          Note: the Curses interactive interface does not work perfectly yet.
        * tests/*.tests: remove -1 parameters of Hunspell
        * tests/*.{good,wrong}: remove tab characters

        * src/hunspell/{hunspell,affixmgr}.cxx: BREAK option: break words at
          specified break points and check the word parts separately (see
          manual).
          Note: COMPOUNDRULE is better (or will be better) for handling dashes
          and other compound joining characters or character strings. Use
          BREAK if you want to check words with dashes or other joining
          characters and there is no time or possibility to describe precise
          compound rules with COMPOUNDRULE.
        * tests/break.*: BREAK example.
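
The BREAK behaviour described above (split at break points, check the parts separately) can be sketched roughly as follows. This is a simplified Python illustration, not Hunspell's code: the function `check_with_breaks`, the plain set used as a dictionary, and the default break point `-` are all assumptions made for the example.

```python
def check_with_breaks(word, dictionary, break_points=("-",)):
    """Accept a word if it is known, or if it can be split at a break
    point into parts that are each accepted (checked recursively).

    Hypothetical helper for illustration only.
    """
    if word in dictionary:
        return True
    for bp in break_points:
        pos = word.find(bp)
        while pos != -1:
            left, right = word[:pos], word[pos + len(bp):]
            # Both parts must be non-empty and acceptable on their own.
            if (left and right
                    and check_with_breaks(left, dictionary, break_points)
                    and check_with_breaks(right, dictionary, break_points)):
                return True
            pos = word.find(bp, pos + 1)
    return False


words = {"foo", "bar", "e-mail"}
print(check_with_breaks("foo-bar", words))
print(check_with_breaks("foo-baz", words))
```

Because the whole word is looked up first, an entry that itself contains a break character (such as `e-mail` here) is still accepted as a unit.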

        * src/hunspell/{affixmgr,hunspell}.cxx: add CHECKSHARPS declaration
          instead of LANG de_DE definitions to handle German sharp s in both
          spelling and suggestion.
        * src/hunspell/hunspell.cxx: With CHECKSHARPS, uppercase words are valid
          with both lower sharp s (it is optional for names in German legal
          texts) and SS (MÜßIG, MÜSSIG). Missing lower sharp s form reported
          by Björn Jacke.
        * src/hunspell/hunspell.cxx: KEEPCASE flag on a sharp s word has a
          special meaning with the CHECKSHARPS declaration: KEEPCASE permits
          capitalisation and SS upper casing of a sharp s word (Müßig and
          MÜSSIG), but forbids the upper cased form with lower sharp s
          character(s): *MÜßIG.
        * tests/germancompounding*: add CHECKSHARPS, remove LANG
        * tests/checksharps*: add CHECKSHARPS and KEEPCASE, remove LANG
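
The CHECKSHARPS and KEEPCASE rules above can be summarised in a small Python sketch. It only enumerates the case forms named in the entries for a lowercase dictionary word containing ß; the function `accepted_forms` is hypothetical and says nothing about how hunspell.cxx implements the check.

```python
def accepted_forms(word, keepcase=False):
    """Return the case variants accepted for a lowercase sharp s word,
    following the CHECKSHARPS / KEEPCASE rules quoted in the changelog.

    Hypothetical helper for illustration only.
    """
    # Python's str.upper() maps "ß" to "SS", giving the SS upper-cased form.
    forms = {word, word.capitalize(), word.upper()}
    if not keepcase:
        # Upper-cased form that keeps the lower sharp s, e.g. MÜßIG.
        forms.add("".join(c if c == "ß" else c.upper() for c in word))
    return forms


print(sorted(accepted_forms("müßig")))
print(sorted(accepted_forms("müßig", keepcase=True)))
```

With `keepcase=True` the form MÜßIG is left out, matching the `*MÜßIG` example in the entry.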

        * src/hunspell/hunspell.cxx: improved suggestions:
        - suggestions for pressed Caps Lock problems: macARONI -> macaroni
        - suggestions for long shift problems: MAcaroni -> Macaroni, macaroni
        - suggestions for KEEPCASE words: KG -> kg
        * src/hunspell/csutil.cxx: fix mystrrep() function:
        - suggestions for lower sharp s in uppercased words: MÜßIG -> MÜSSIG
        * tests/checksharps{,utf}.sug: add tests for mystrrep() fix
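
One possible reading of the case-related suggestions listed above, written as a Python sketch. The heuristic (offer the capitalised form only when the misspelling starts with an uppercase letter and the word is not KEEPCASE) is an assumption chosen to reproduce the three examples; the function `case_suggestions` and its parameters are hypothetical and are not taken from hunspell.cxx.

```python
def case_suggestions(word, words, keepcase=frozenset()):
    """Suggest case-corrected forms of a wrongly cased word.

    `words` holds lowercase dictionary words; `keepcase` holds those that
    must keep their dictionary casing. Hypothetical helper for illustration.
    """
    lower = word.lower()
    if lower not in words or word == lower:
        return []
    suggestions = []
    if word[0].isupper() and lower not in keepcase:
        capitalised = lower.capitalize()
        if capitalised != word:
            suggestions.append(capitalised)
    suggestions.append(lower)
    return suggestions


words = {"macaroni", "kg"}
print(case_suggestions("macARONI", words))            # Caps Lock left on
print(case_suggestions("MAcaroni", words))            # Shift held too long
print(case_suggestions("KG", words, keepcase={"kg"})) # KEEPCASE word
```

The function is meant to be called only for words that already failed the spell check.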

        * src/hunspell/hashmgr.cxx: Now dictionary words can contain slashes 
          with the "\/" syntax. Problem reported by Frederik Fouvry.
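
A minimal Python sketch of how a dictionary line with the "\/" syntax might be split into word and flags. The function `split_dic_entry` is hypothetical and only illustrates the escaping rule; it is not the parser in hashmgr.cxx.

```python
def split_dic_entry(line):
    """Split a dictionary line at the first unescaped slash.

    Returns (word, flags); an escaped slash "\\/" becomes a literal "/"
    in the word. Hypothetical helper for illustration only.
    """
    i = 0
    while i < len(line):
        if line[i] == "\\" and i + 1 < len(line) and line[i + 1] == "/":
            i += 2          # skip the escaped slash
            continue
        if line[i] == "/":
            return line[:i].replace("\\/", "/"), line[i + 1:]
        i += 1
    return line.replace("\\/", "/"), ""


print(split_dic_entry(r"km\/h/A"))
print(split_dic_entry("work/AB"))
```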

        * src/hunspell/hunspell.cxx: fix bad duplicate filter in suggest().
          (Suggesting some capitalised compound words caused a program crash
          with the Hungarian dictionary, OOo Issue 59055).

        * src/hunspell/affixmgr.cxx: fix bad defcpd_check() call in
          compound_check().
          (Overlapping new COMPOUNDRULE and old compounding methods caused a
          program crash at suggestion.)

        * src/hunspell/affixmgr.{cxx,hxx}: check for affix flag duplication in
          affix classes.
          Suggested by Daniel Naber.

        * src/hunspell/affentry.cxx: remove unused variable declarations (OOo
          i58338).
          Compiler warnings reported by András Tímár and Martin Hollmichel.

        * src/hunspell/hunspell.cxx: morph(): do not analyse bad mixed
          uppercased forms (fixes Arabic morphological analysis with
          Buckwalter's Arabic transliteration)

        * src/hunspell/affentry.{cxx,hxx}, atypes.hxx: small memory
          optimization in affentry:
          - use unsigned char fields instead of short (stripl, appndl,
            numconds)
          - rename xpflg field to opts
          - remove utf8 field, use aeUTF8 bit of opts field

        * configure.ac: set tests/maputf.test to XFAILED on ARM platform.
          Failure reported by Rene Engelhard.

        * configure.ac: link the Ncursesw library, if it exists.

        * BUGS: add BUGS file

        * tests/complexprefixes2.*: test for morphological analysis with
          COMPLEXPREFIXES
        
        * src/hunspell/affixmgr.cxx: use "COMPOUNDRULE" instead of
          "COMPOUND". New name suggested by Bram Moolenaar.
        * tests/compoundrule*: modified and renamed compound.* test files

        * man/hunspell.4: AF, AM, BREAK, CHECKSHARPS, COMPOUNDRULE, KEEPCASE.
          - also new additions to the documentation:
          The header of the dictionary file defines the approximate dictionary
          size:
          ``A dictionary file (*.dic) contains a list of words, one per line.
          The first line of the dictionaries (except personal dictionaries)
          contains the _approximate_ word count (for optimal hash memory
          size).''
          Asked by Frederik Fouvry.

          One-character replacements in REP definitions: ``It's very useful to
          define replacements for the most typical one-character mistakes, too:
          with REP you can add higher priority to a subset of the TRY
          suggestions (suggestion list begins with the REP suggestions).''
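
The ordering effect described in the REP note can be illustrated with a toy Python suggester. It is a deliberate simplification and not Hunspell's suggestion code: the function `suggest`, the word set, and the treatment of TRY as single-character substitutions only are assumptions made for the example.

```python
def suggest(word, words, rep, try_chars):
    """Collect suggestions: REP replacements first, then TRY substitutions.

    `rep` is a list of (what, replacement) pairs and `try_chars` a string of
    characters to try at every position. Hypothetical helper for illustration.
    """
    found = []

    def consider(candidate):
        # Keep only dictionary words, without duplicates, in discovery order.
        if candidate in words and candidate not in found:
            found.append(candidate)

    for what, replacement in rep:
        start = word.find(what)
        while start != -1:
            consider(word[:start] + replacement + word[start + len(what):])
            start = word.find(what, start + 1)

    for i in range(len(word)):
        for ch in try_chars:
            consider(word[:i] + ch + word[i + 1:])
    return found


words = {"cat", "cot", "cut"}
print(suggest("cet", words, rep=[("e", "a")], try_chars="oua"))
print(suggest("cet", words, rep=[], try_chars="oua"))
```

With the one-character REP rule, `cat` moves to the front of the list even though the TRY order alone would have produced it last.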

---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

