On 19.06.2012 05:07, Ariel Constenla-Haile wrote:
Hi there,

there have been some reports of users complaining that the Thesaurus does not work. The root of the issue is in the dictionary extensions we are shipping: two of them collide because their configuration node names are not unique, namely dict-en.oxt (the generic EN dictionary) and dict-en-nz-2008-12-03.oxt. The conflict happens on the Thesaurus node:

* dict-en.oxt:

    <node oor:name="ThesDic_en-US" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
            <value>%origin%/th_en_US_v2.dat</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
            <value>DICT_THES</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
            <value>en-GB en-US en-ZA en-AU en-CA</value>
        </prop>
    </node>

* dict-en-nz-2008-12-03.oxt:

    <node oor:name="ThesDic_en-US" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
            <value>%origin%/th_en_US_v2.dat</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
            <value>DICT_THES</value>
        </prop>
        <prop oor:name="Locales" oor:type="oor:string-list">
            <value>en-NZ</value>
        </prop>
    </node>

As you can see, they have the same name, "ThesDic_en-US", even though the official documentation clearly states that dictionary extension developers should use a unique node name. See http://wiki.services.openoffice.org/wiki/Extension_Dictionaries#Dictionary_entries_.28must_be_provided.29, especially "About node names for the dictionaries".
The dict-en-au-2008-12-15 extension did rename its thesaurus file to th_en_AU_v2.dat. That avoids the conflict but still wastes 18MB of disk space.
I didn't research what the fuse operation is *supposed* to do there (it's applied to the node, not to the properties), but the documentation is clear that the node name must be unique. The result is that the properties are not fused but replaced, so installing the en-NZ dictionary disables the thesaurus for en-US. As this bug has its root in the dictionary extensions, the only fix available to us is to ship just one extension, in this case dict-en.oxt.
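A collision of this kind could be detected mechanically before shipping. The following is only a rough sketch, not an existing tool: it assumes each extension's dictionary configuration is already available as XML text, and it treats any node that directly carries a "Format" property as a dictionary entry. The function names, the sample_xcu helper, and the dict-xx.oxt entry are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace bound to the oor: prefix in the configuration files.
OOR = "http://openoffice.org/2001/registry"
NAME = f"{{{OOR}}}name"


def dictionary_node_names(xcu_text):
    """Return the oor:name of every node that directly has a "Format" prop.

    Assumption: only dictionary entries carry a "Format" property.
    """
    root = ET.fromstring(xcu_text)
    names = []
    for node in root.iter("node"):
        if any(p.get(NAME) == "Format" for p in node.findall("prop")):
            names.append(node.get(NAME))
    return names


def find_collisions(xcu_by_extension):
    """Map each node name used by more than one extension to those extensions."""
    owners = defaultdict(list)
    for extension, text in xcu_by_extension.items():
        for name in dictionary_node_names(text):
            owners[name].append(extension)
    return {name: exts for name, exts in owners.items() if len(exts) > 1}


def sample_xcu(node_name, locales):
    """Build a minimal configuration document for the demo (hypothetical helper)."""
    return f'''<oor:component-data xmlns:oor="{OOR}"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="ServiceManager"><node oor:name="Dictionaries">
    <node oor:name="{node_name}" oor:op="fuse">
      <prop oor:name="Format" oor:type="xs:string"><value>DICT_THES</value></prop>
      <prop oor:name="Locales" oor:type="oor:string-list"><value>{locales}</value></prop>
    </node>
  </node></node>
</oor:component-data>'''


if __name__ == "__main__":
    extensions = {
        "dict-en.oxt": sample_xcu("ThesDic_en-US", "en-GB en-US en-ZA en-AU en-CA"),
        "dict-en-nz-2008-12-03.oxt": sample_xcu("ThesDic_en-US", "en-NZ"),
        "dict-xx.oxt": sample_xcu("ThesDic_xx-XX", "xx-XX"),  # made-up entry
    }
    for name, exts in find_collisions(extensions).items():
        print(name, "is used by", ", ".join(exts))
```

Reading the configuration file out of each .oxt archive is left out on purpose, since the file name inside the archive may differ between extensions.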
Dropping the other English dictionaries is a good idea for other reasons, too. Issue 119272 (https://issues.apache.org/ooo/show_bug.cgi?id=119272) describes the problem that the dictionaries together use more than 160MB, most of it taken up by the large thesaurus files. Including only one English dictionary would reduce this number considerably. Besides, dict-en.oxt contains support for most variants of English anyway.
-Andre
Note that I only discovered this bug in the English dictionary extensions. I didn't check other languages, but we should do so wherever we provide more than one dictionary extension.

Regards
