Bram Moolenaar wrote:

> Dominique Pelle wrote:
>
>> Vim-7.3 currently creates spelling dictionaries from Myspell dictionaries.
>> I am wondering whether there is any plan to support Hunspell dictionaries.
>>
>> The French dictionary at http://www.dicollecte.org/download.php?prj=fr
>> states:
>>
>> === [ fr] ===
>> Ces dictionnaires pour Myspell ne seront plus mis à jour, Myspell ayant
>> été remplacé par Hunspell dans la plupart des applications.
>> =========
>>
>> Which means in English:
>> =========
>> These dictionaries for Myspell won't be kept up-to-date, Myspell
>> being replaced by Hunspell in most applications.
>> =========
>>
>> It's a pity if we can't use the latest dictionaries in Vim anymore.
>> I have no idea how much work is involved in supporting Hunspell.
>>
>> When trying to run :mkspell on the Hunspell French dictionary,
>> available at...
>>
>> http://www.dicollecte.org/download/fr/hunspell-fr-moderne-v3.8.zip
>>
>> ... Vim reports the following messages:
>>
>> Unrecognized or duplicate item in fr-moderne.aff line 10: WORDCHARS
>> Unrecognized or duplicate item in fr-moderne.aff line 98: KEY
>> Unrecognized or duplicate item in fr-moderne.aff line 100: ICONV
>> ...snip...
>> Unrecognized or duplicate item in fr-moderne.aff line 135: OCONV
>> Unrecognized or duplicate item in fr-moderne.aff line 154: BREAK
>> Unrecognized or duplicate item in fr-moderne.aff line 155: BREAK
>> Reading dictionary file fr-moderne.dic ...
>> First duplicate word in fr-moderne.dic line 3815: V
>> 392 duplicate word(s) in fr-moderne.dic
>> Compressing word tree...
>> Compressed 4390813 of 4735831 nodes; 345018 (7%) remaining
>> Compressed 313845 of 391932 nodes; 78087 (19%) remaining
>> Writing spell file fr.utf-8.spl ...
>> Done!
>> Estimated runtime memory use: 2116435 bytes
>>
>> It creates a dictionary for Vim, but when doing :spelldump to see
>> words in the created dictionary, I see a lot of junk (words beginning
>> with 0, words with /= at the end for example) so Vim does not
>> understand Hunspell files.
>>
>> =====================
>> # file: /home/pel/.vim/spell/fr-moderne.utf-8.spl
>> 0ampère
>> 0becquerel
>> 0calorie
>> ...snip...
>> µm/=
>> µmol/=
>> µs/=
>> µvar/=
>> µΩ/=
>> Å/=
>> Épinay-sur-Seine
>> États-Unis
>> Île-de-France
>> Île-du-Prince-Édouard
>> Îles-de-la-Madeleine
>> Ω/=
>> =====================
>>
>> The help file spell.txt has notes about WORDCHARS, KEY, BREAK
>> which don't seem essential but there is no note about ICONV
>> and OCONV in Vim's help. I see some doc here:
>> http://manpages.ubuntu.com/manpages/lucid/man4/hunspell.4.html
>
> Hunspell uses the same kind of files, but adds more options. Vim should
> be able to use most of the Hunspell files, with some modifications.
>
> I don't know what the ICONV and OCONV items mean.
> The page you refer to simply says input and output conversion, without
> explaining what that means. It's a common problem for Hunspell that
> it's largely undefined how it works. You may need to look at the source
> code...
>
> For the dictionaries, it's usually best to get them from the OpenOffice
> site, as that's what is downloaded automatically, thus should be kept
> up-to-date.

Warning: this message uses Unicode characters.

Yes, the Hunspell documentation is not very clear.

From looking at the dictionary "fr-moderne.aff", it looks like
ICONV and OCONV define aliases for Unicode characters that are
equivalent or similar enough to be equivalent.

File "fr-moderne.aff" contains:

ICONV 32
ICONV à à
ICONV â â
ICONV ä ä
ICONV é é
ICONV è è
ICONV ê ê
ICONV ë ë
ICONV î î
ICONV ï ï
ICONV ô ô
ICONV ö ö
ICONV ù ù
ICONV û û
ICONV ü ü
ICONV ÿ ÿ
ICONV ç ç
ICONV À À
ICONV Â Â
ICONV Ä Ä
ICONV É É
ICONV È È
ICONV Ê Ê
ICONV Ë Ë
ICONV Î Î
ICONV Ï Ï
ICONV Ô Ô
ICONV Ö Ö
ICONV Ù Ù
ICONV Û Û
ICONV Ü Ü
ICONV Ÿ Ÿ
ICONV Ç Ç

OCONV 1
OCONV ' ’

The first ICONV (resp. OCONV) line gives the number of
ICONV (resp. OCONV) entries that follow.

Not sure how essential it is to support. I don't think it explains
the odd words I see with ":spelldump".

The first few incorrect words given by ":spelldump" are units:

0ampère
0becquerel
0calorie
(etc)

They appear like this in the "fr-moderne.dic" file
(http://www.dicollecte.org/download.php?prj=fr):

ampère/Um()
becquerel/Um()
calorie/Um()

And in the "fr-moderne.aff" file, I see:

NEEDAFFIX ()

PFX Um Y 29
PFX Um 0 0/S. .
PFX Um 0 l' [aàâeèéêiîoôuyœæ]
PFX Um 0 d'/S. [aàâeèéêiîoôuyœæ]
PFX Um 0 yotta/S. .
PFX Um 0 zetta/S. .
PFX Um 0 exa/S. .
PFX Um 0 l'exa .
PFX Um 0 d'exa/S. .
PFX Um 0 peta/S. .
PFX Um 0 téra/S. .
PFX Um 0 giga/S. .
PFX Um 0 méga/S. .
PFX Um 0 kilo/S. .
PFX Um 0 hecto/S. .
PFX Um 0 l'hecto .
PFX Um 0 d'hecto/S. .
PFX Um 0 déca/S. .
PFX Um 0 déci/S. .
PFX Um 0 centi/S. .
PFX Um 0 milli/S. .
PFX Um 0 micro/S. .
PFX Um 0 nano/S. .
PFX Um 0 pico/S. .
PFX Um 0 femto/S. .
PFX Um 0 atto/S. .
PFX Um 0 l'atto .
PFX Um 0 d'atto/S. .
PFX Um 0 zepto/S. .
PFX Um 0 yocto/S. .

I wonder why there is an entry "PFX Um 0 0/S. .". This is what
causes the weird words "0ampère", "0becquerel", "0calorie" (etc.
for many other units). I see that the word "ampère" does not
appear in the ":spelldump" output without a prefix (it should be there).
The entry "PFX Um 0 0/S. ." must have a special meaning (such as:
empty prefix) which is misinterpreted by Vim. But the doc is
certainly quite unclear to me:

http://sourceforge.net/projects/hunspell/files/Hunspell/Documentation/

Regards
-- Dominique
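
To make the "empty prefix" guess concrete, here is a minimal Python sketch of how the "Um" rules might expand a stem. It rests on one assumption, not confirmed in this thread: a literal "0" in the strip or affix field means "nothing", and "." as condition means "always applies". The helper names (parse_pfx, apply_prefixes) are made up for the illustration, and only three of the 29 rules are copied in. This is not how Vim or Hunspell actually implement it.

```python
import re

# Three of the "Um" prefix rules quoted in the message.
# Field layout: PFX <flag> <strip> <affix>[/<continuation flags>] <condition>
RULES = """\
PFX Um 0 0/S. .
PFX Um 0 l' [aàâeèéêiîoôuyœæ]
PFX Um 0 kilo/S. .
"""

def parse_pfx(line):
    """Split one PFX rule line into (strip, affix, condition).

    ASSUMPTION: "0" in the strip or affix field stands for the empty
    string; this is the guess being illustrated, not a verified fact.
    """
    _, _flag, strip, affix, cond = line.split()
    affix = affix.split("/", 1)[0]        # drop continuation flags such as "/S."
    strip = "" if strip == "0" else strip
    affix = "" if affix == "0" else affix
    return strip, affix, cond

def apply_prefixes(word, rules_text):
    """Return the forms produced by every rule whose condition matches word."""
    forms = []
    for line in rules_text.splitlines():
        strip, affix, cond = parse_pfx(line)
        if not word.startswith(strip):
            continue
        # "." means no condition; otherwise test the start of the stem.
        if cond != "." and not re.match(cond, word):
            continue
        forms.append(affix + word[len(strip):])
    return forms

print(apply_prefixes("ampère", RULES))
print(apply_prefixes("becquerel", RULES))
```

Under that assumption the first rule yields the bare word "ampère" rather than "0ampère", which would match what one expects to find in the dictionary. Treating the affix field "0" as a literal character gives the junk forms seen in the ":spelldump" output.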
