On Thu, 22 Sept 2022 at 21:30, Soren Stoutner (<so...@stoutner.com>) wrote:
>
> On Thursday, September 22, 2022 9:20:46 AM MST Agustin Martin wrote:
>
> > First of all, I am curious about the reasons behind this new format,
> > the problems it deals with and its advantages. I assume they are valid
> > enough, but they imply yet another spellchecking engine/format. We
> > currently have good old ispell, aspell and hunspell. vim has its own
> > spellchecker engine using its own format, with dicts that can be
> > created from old myspell2 dicts. We did not add vim format dicts (from
> > aspell dicts sources) since there seems to be some work to make vim
> > use hunspell directly. And now these bdic dicts.
>
> The .bdic format is specified by the upstream Chromium project, and is
> required by anything that is based off of Chromium's code, like Qt WebEngine.
> I do not know why they went with a proprietary binary format, but I would
> assume that if they went to so much trouble to not use the standard Hunspell
> format there must have been something to make it worthwhile, like some
> performance improvement. Perhaps I am giving Google too much credit for
> having logical reasons instead of making arbitrary decisions.

Hi, Soren

It's a pity not to have more info about the reasons for this new format.
Even if using it is more efficient in terms of plain performance, I do
not think that is noticeable in stuff like chromium. Another question is
whether the bdic format is expected to change or whether that is very
unlikely.

Thinking about this, I have done some tests with these bdic files being
generated at postinst, like emacs byte-compiled files (although my tests
were cruder), delegating everything to the qtwebengine packages.

.bdic generation is not very slow, but IMHO it is not fast enough to go
this way (which would require moving qwebengine_convert_dict to the
Qt WebEngine runtime package and controlling everything from it).

One noticeable thing is that bdic generation failed for some hunspell
dicts I have installed:

++ processing an_ES.aff
[1003/125813.760330:FATAL:aff_reader.cc(305)] Did not find a space in 'y i'.
Trace/breakpoint trap

++ processing ar.aff
[1003/125813.796753:FATAL:aff_reader.cc(123)] We don't support the IGNORE command yet. This would change how we would insert things in our lookup table.

++ processing gl_ES.aff
gl_ES.dic_delta not found.
Reading gl_ES.aff
Reading gl_ES.dic
Serializing...
Verifying...
Word does not match!
  Index: 2126
  Expected: Abū po:antropónimo is:ngrama_Abū_ʿAbdullāh_Muḥammad_ibn_Jābir_ibn_Sinān_ar_Raqqī_al_Ḥarrani_aṣ_Ṣabiʾ_al_Battānī
  Actual: Abū po:antropónimo is:ngrama_Abū_ʿAbdullāh_Muḥammad_ibn_Jābir_ibn_Sinān_ar_Raqqī_al_Ḥarrani_aṣ_Ṣabiʾ_al_Battā
ERROR converting, the dictionary does not check out OK.

Regards,

--
Agustin