Vlad,

Thanks for the detective work!

At 02:02 PM 3/16/01 +0400, Vlad Harchev wrote:
> I remember a lot of people complained that AW can't use some hash files (i.e.
>dictionaries for ispell) - that ispell module spits out some message about
>incorrect header..
> While helping other people to select a russian dictionary, I discovered that
>'file' utility knows ispell format (at least on my RH6.0) and that we can
>judge whether the hash file will be loadable by ispell module or not basing on
>the output of 'file' command. For example, here is an output for the
>russian.hash file that can be used by AW's ispell:
>
>[hvv@h dictionary]$ file russian.hash
>russian.hash: little endian ispell 3.1 hash file, 8-bit, capitalization, 26
>flags and 100 string characters
>[hvv@h dictionary]$
>
> It seems that hash files for which '7-bit' is mentioned in the output of
>'file' command can't be used by AW.

Bingo.  That's it.

If you grep the sources for NO8BIT, you'll see that one of the few things it
affects is SET_SIZE, which in turn controls the size of various ispell
structs, including the main hashtable.

  http://www.abisource.com/lxr/source/abi/src/other/spell/ispell.h#495

The error message we usually get comes from a sanity check that makes sure
ispell isn't reading a hashtable of the wrong length.  For example, see:

  http://bugzilla.abisource.com/show_bug.cgi?id=902
  http://bugzilla.abisource.com/show_bug.cgi?id=824

Note that the hashtable loader currently just reads the entire struct from
disk to memory here:

  http://www.abisource.com/lxr/source/abi/src/other/spell/lookup.c#159

Gag.  Methinks it would be prudent to just rewrite the loader to do the math
to detect this situation and do the extra work needed to try and load 7-bit
content into the 8-bit structs we currently use.

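In the meantime, the 'file' check Vlad describes could be scripted so folks
can test a dictionary before installing it.  A rough sketch in Python, not
anything that exists in the AbiWord tree: the function names are made up for
illustration, and it assumes 'file' reports "7-bit" or "8-bit" in the same
wording as the russian.hash example above (other versions of 'file' may
phrase it differently).

```python
import subprocess


def hash_is_8bit(file_output: str) -> bool:
    """Guess from `file` output whether an ispell hash is an 8-bit one.

    Hypothetical helper: it only looks for the wording seen in the
    russian.hash example ("ispell ... hash file, 8-bit, ...").
    """
    text = file_output.lower()
    if "ispell" not in text or "hash file" not in text:
        return False  # not recognised as an ispell hash at all
    return "8-bit" in text and "7-bit" not in text


def check_hash_file(path: str) -> bool:
    """Run `file -b` on path and apply the check above.

    -b (brief) omits the filename, so a name that happens to contain
    "7-bit" cannot confuse the match.
    """
    result = subprocess.run(
        ["file", "-b", path], capture_output=True, text=True, check=True
    )
    return hash_is_8bit(result.stdout)
```

For example, check_hash_file("russian.hash") should come back True for the
dictionary Vlad mentions, assuming 'file' prints the same description there.
This only tells you whether the hash looks 8-bit; it does not replace the
length sanity check in the loader.
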
>Also it turns out that (at least for
>russian dictionary) it's possible to specify whether to use 7-bit or 8-bit
>format of hash files by altering Makefile for dictionary (there are makefile
>variables that control that). So, it seems we have a hope of knowing the way of
>building ispell dictionaries that will be understood by our ispell. At least
>we may try to build .hash files for languages for which only unreadable by our
>iconv compiled dictionaries are available..

Exactly.  Until someone's willing to write the code mentioned above to also
load 7-bit dictionaries, we now have a couple of simple workarounds:

  - update the FAQ to tell folks not to use 7-bit dictionaries
  - ideally, point them to 8-bit alternatives

Any volunteers?  ;-)

Paul
