Hi all!

> OS: Ubuntu 7.04, OOo: 2.2.0-1ubuntu-3
>
> This happened with a Project Gutenberg Distributed Proofreaders file.
> The text file opened with 'Western Europe (ISO-8859-1)' Character set is
> giving proper accent marks and diacritical marks. While spell-checking I
> added those words to a special personal dictionary 'dd.dic'. Now, when I
> am opening this 'dd.dic' from '/home/dd/.openoffice.org2/user/wordbook/'
> I am getting a mixed up and mangled text: some of the accents and marks
> are showing, that too in a mixed up way, and some are not showing at
> all. Some of the words cannot be even recognized. I tried all the
> Character Sets, obviously starting from the 'Western Europe
> (ISO-8859-1)'.
>
> If anyone needs I can send the relevant files or the screen-shots.
>
> Can anyone help about the file format of the '.dic' files?

If you actually mean the files that you can find in user/wordbook AND edit
via "Tools/Options/Language Settings/Writing Aids", it is somewhat
complicated, since they are actually in a binary format. The strings for
the words themselves, though, are UTF-8 encoded. That is why opening the
file as ISO-8859-1 text shows mangled accents.

Since the format also needs to take care of several different file format
versions from the past 12+ years, the most complete documentation is the
code. There are more than 5 different file format versions supported by
now.

The latest change was a patch provided by Michael Meeks to allow a tagged
version of the file format to be read. It is probably the best one for you
to use. Have a look at the respective issue:
http://www.openoffice.org/issues/show_bug.cgi?id=60698
There are also sample dictionaries attached. Download them and see what
the tagged file format looks like.

An existing dictionary is usually written back in the same file format
version it had when it was loaded. Thus you should be fine with creating a
tagged version and later on editing it via the UI.

If you need to know more details, you have to look at the source code:
http://sw.openoffice.org/source/browse/*checkout*/sw/linguistic/source/dicimp.cxx?rev=1.22
Look for the DictionaryNeo::loadEntries function. The code for the tagged
file format is the one where "nDicVersion" is set to 7.

Please note, though, that these .dic files are completely different from
the files the HunSpell implementation uses!

Regards,
Thomas
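
Note on the mangled accents: because the word strings are stored as UTF-8,
reading those bytes as ISO-8859-1 turns every accented letter into two
wrong characters. Below is a minimal Python sketch of that effect only.
The sample word is made up for illustration, and the code does not parse
the actual binary .dic format.

```python
# Illustration only: a made-up sample word, not data from a real .dic file.
word = "café"

# The bytes as they would be stored when the string is UTF-8 encoded.
raw = word.encode("utf-8")

# Reading the same bytes with the wrong character set splits the
# two-byte UTF-8 sequence for "é" into two separate Latin-1 characters.
mangled = raw.decode("iso-8859-1")

# Reading them as UTF-8 restores the original word.
correct = raw.decode("utf-8")

print(repr(raw))
print(mangled)
print(correct)
```

The same applies to any text editor: only the UTF-8 setting will show the
words correctly, and the binary parts of the file will still look like
garbage around them.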
