Hi Vlad and Paul,

This is some good news on the ispell front. I had all but given up on ispell
working for us long-term. Now it seems that it might be at least feasible
again.

I have an RFP that's really simple to implement for whoever wants it:

We need to abandon "american.hash" - we need something more robust. What I
think we want is en_US.hash, de_DE.hash, etc... If we do this, we can
dynamically load dictionaries based on our current locale or even with the
"lang" attribute like my hack last night.

So I guess my suggested plan of action is this:

1) Rename the dictionaries (and start housing (*not necessarily shipping*)
   some known working ones on the website)
2) And either:
   a) Change ispell's SpellCheckInit() function to take a string of the form
      'en_US' and have *it* create the proper .hash name so we can share 100%
      code with Pspell
   b) Keep passing the full path to the dictionary, and have that one ifdef
      in our code for ispell/pspell

Whaddya think?

Dom

>From: Paul Rohr <[EMAIL PROTECTED]>
>To: Vlad Harchev <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
>Subject: Re: problem seems to be solved: unreadable .hash files (dictionaries)
>Date: Fri, 16 Mar 2001 14:13:08 -0800
>
>Vlad,
>
>Thanks for the detective work!
>
>At 02:02 PM 3/16/01 +0400, Vlad Harchev wrote:
> > I remember a lot of people complained that AW can't use some hash files
> > (i.e. dictionaries for ispell) - that the ispell module spits out some
> > message about an incorrect header..
> > While helping other people to select a Russian dictionary, I discovered
> > that the 'file' utility knows the ispell format (at least on my RH6.0) and
> > that we can judge whether the hash file will be loadable by the ispell
> > module or not based on the output of the 'file' command.
> > For example, here is the output for the russian.hash file that can be used
> > by AW's ispell:
> >
> >[hvv@h dictionary]$ file russian.hash
> >russian.hash: little endian ispell 3.1 hash file, 8-bit, capitalization, 26 flags and 100 string characters
> >[hvv@h dictionary]$
> >
> > It seems that hash files for which '7-bit' is mentioned in the output of
> > the 'file' command can't be used by AW.
>
>Bingo. That's it. If you grep the sources for NO8BIT, you'll see that one of
>the few things it affects is SET_SIZE, which in turn controls the size of
>various ispell structs, including the main hashtable.
>
>  http://www.abisource.com/lxr/source/abi/src/other/spell/ispell.h#495
>
>The error message we usually get is a sanity check to make sure that ispell's
>not reading a hashtable of the wrong length. For example, see:
>
>  http://bugzilla.abisource.com/show_bug.cgi?id=902
>  http://bugzilla.abisource.com/show_bug.cgi?id=824
>
>Note that the hashtable loader currently just reads the entire struct from
>disk to memory here:
>
>  http://www.abisource.com/lxr/source/abi/src/other/spell/lookup.c#159
>
>Gag. Methinks it would be prudent to just rewrite the loader to do the math
>to detect this situation and do the extra work needed to try and load 7-bit
>content into the 8-bit structs we currently use.
>
> >Also it turns out that (at least for the Russian dictionary) it's possible
> >to specify whether to use the 7-bit or 8-bit format of hash files by
> >altering the Makefile for the dictionary (there are makefile variables that
> >control that). So, it seems we have a hope of knowing the way of building
> >ispell dictionaries that will be understood by our ispell. At least we may
> >try to build .hash files for languages for which only unreadable by our
> >iconv compiled dictionaries are available..
>
>Exactly.
>Until someone's willing to write the code mentioned above to also load 7-bit
>dictionaries, we now have a few simple workarounds:
>
>  - update the FAQ to tell folks not to use 7-bit dictionaries
>  - ideally, point them to 8-bit alternatives
>
>Any volunteers?  ;-)
>
>Paul
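
A note on option 2a in Dom's plan: the name-building step could look roughly
like the C sketch below. This is only an illustration under assumptions. The
helper build_hash_path() and its dict_dir argument are hypothetical and are
not existing ispell or AbiWord code, and the real signature of
SpellCheckInit() is not shown anywhere in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of ispell or AbiWord): build
 * "<dict_dir>/<locale>.hash" from a locale tag such as "en_US".
 * Returns 0 on success, -1 if the locale is empty, contains a path
 * separator, or the result would not fit in the caller's buffer.
 */
static int build_hash_path(char *out, size_t out_size,
                           const char *dict_dir, const char *locale)
{
    int n;

    /* Passing NULL pointers or a zero-sized buffer is a caller bug. */
    assert(out != NULL && out_size > 0);
    assert(dict_dir != NULL && locale != NULL);

    if (locale[0] == '\0')
        return -1;

    /* Refuse path separators so a "lang" value cannot escape dict_dir. */
    if (strchr(locale, '/') != NULL || strchr(locale, '\\') != NULL)
        return -1;

    n = snprintf(out, out_size, "%s/%s.hash", dict_dir, locale);
    if (n < 0 || (size_t)n >= out_size)
        return -1;   /* formatting error or truncated path */

    return 0;
}
```

With something like this inside the ispell wrapper, callers would pass only
'en_US' or 'de_DE', and the ispell/pspell ifdef from option 2b would not be
needed at the call site. Whether the locale or the "lang" attribute supplies
the string is left to the caller.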

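
Vlad's 'file' heuristic quoted above can also be written down as a small
check, which might be handy for the FAQ workaround. The C sketch below only
classifies a description string that 'file' has already printed. The enum and
the function classify_file_output() are hypothetical names, and the matching
relies solely on the wording in the sample output quoted in this thread, so
other versions of 'file' may word it differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for the check below. */
enum hash_kind {
    HASH_NOT_ISPELL,  /* description does not look like an ispell hash */
    HASH_7BIT,        /* reported as unloadable by AW in this thread */
    HASH_8BIT,        /* the format AW's ispell can read */
    HASH_UNKNOWN      /* ispell hash, but neither 7-bit nor 8-bit found */
};

/*
 * Classify one line of 'file' output, e.g.
 * "russian.hash: little endian ispell 3.1 hash file, 8-bit, ...".
 * Plain substring matching on the wording seen in the quoted sample.
 */
static enum hash_kind classify_file_output(const char *desc)
{
    assert(desc != NULL);

    if (strstr(desc, "ispell") == NULL || strstr(desc, "hash file") == NULL)
        return HASH_NOT_ISPELL;

    if (strstr(desc, "7-bit") != NULL)
        return HASH_7BIT;

    if (strstr(desc, "8-bit") != NULL)
        return HASH_8BIT;

    return HASH_UNKNOWN;
}
```

This does not replace the loader rewrite Paul describes. It only lets a script
or an installer warn about a 7-bit dictionary before the ispell module trips
its sanity check.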