Thank you for your rant. :-) While I don't enjoy learning that
something must be fixed, I do appreciate a well-thought-out complaint.
I am too often dismayed to learn that someone has long been suffering
in silence a problem that is easily corrected.

There is much to think about here, and I probably won't touch every
point.  I'm going to try to explain how things are done, as a starting
point only.  I do think we ought to be able to make localization easier.

First:  DSpace is using localization facilities provided by the
underlying software:

o  JSPUI uses Java's native PropertyResourceBundle class and is bound
   by its behavior.  See
   http://download.oracle.com/javase/6/docs/api/java/util/ResourceBundle.html#getBundle(java.lang.String,%20java.util.Locale,%20java.lang.ClassLoader)
   for the gory details of how a particular localization is selected
   from those available.

o  XMLUI uses the Cocoon I18nTransformer class, which follows a search
   pattern similar to that of ResourceBundle, except that it does not
   search for .class files.  It's not immediately clear to me why
   someone invented an XML profile which duplicates property files,
   but that's the way I18nTransformer was made.  Cocoon documentation
   is in a sorry state, and I don't know of a good link for this.
   _Cocoon Developer's Handbook_ (Moczar, Aston 2003) pp. 303-4,
   "Configuring Message Catalogs", describes it a bit.

o  The commandline tools use PropertyResourceBundle, but have a
   different classpath than JSPUI and so may have access to a
   different set of resources.

o  I suppose that JSPUI's messages are in dspace-api.jar due to
   historical reasons:  when there was only one UI, it made sense to
   put the messages all in one place.

Anyway:  the behavior you are seeing comes from the supporting
software.  Every message catalog has a "parent" catalog which is the
next less-specific locale -- fr_CA has fr as its parent, for example,
and fr has "" (i.e. messages.xml) as its parent.  If a given key is
not found in the most-specific existing catalog, it is searched for by
going up the chain of parent catalogs.  So, if key X is sought, and it
is found in "messages_de", "messages" will not be consulted.  To make
an alteration in the smallest number of places, you need to find all
of the most-specific catalogs which define that key within the set of
locales for which you wish to modify the text.
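
To make the parent chain concrete, here is a minimal sketch using
plain java.util.ResourceBundle.  It is not DSpace code:  the catalog
contents, the in-memory Control and the lookup() helper are all
invented for illustration, and the catalogs live in a Map only so that
the example runs without any files on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class ParentChainDemo {
    // Hypothetical catalogs, keyed by the bundle name that ResourceBundle derives.
    static final Map<String, String> CATALOGS = Map.of(
        "messages",       "title=Repository\nhelp=Help\n",
        "messages_fr",    "title=Archive ouverte\n",
        "messages_fr_CA", "help=Aide\n");

    // Assumption for the demo: load catalogs from the map, not the classpath.
    static final ResourceBundle.Control IN_MEMORY = new ResourceBundle.Control() {
        @Override
        public List<String> getFormats(String baseName) {
            return FORMAT_PROPERTIES;
        }

        @Override
        public Locale getFallbackLocale(String baseName, Locale locale) {
            return null; // do not retry with the JVM's default locale
        }

        @Override
        public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                        ClassLoader loader, boolean reload)
                throws IOException {
            String text = CATALOGS.get(toBundleName(baseName, locale));
            // Returning null tells ResourceBundle that this catalog does not exist.
            return text == null ? null : new PropertyResourceBundle(new StringReader(text));
        }
    };

    // Returns the message text, or the key itself when no catalog defines it.
    static String lookup(String key, Locale locale) {
        ResourceBundle bundle = ResourceBundle.getBundle("messages", locale, IN_MEMORY);
        try {
            return bundle.getString(key); // walks up the parent chain as needed
        } catch (MissingResourceException e) {
            return key;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("help", Locale.CANADA_FRENCH));  // defined in fr_CA
        System.out.println(lookup("title", Locale.CANADA_FRENCH)); // inherited from fr
        System.out.println(lookup("title", Locale.GERMAN));        // no de catalog: root
        System.out.println(lookup("nope", Locale.FRENCH));         // nowhere: the key
    }
}
```

Changing the "title" text for every French-speaking user here means
editing messages_fr only, because messages_fr_CA does not define that
key.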

Your example of keys in messages_en.xml being preferred over those in
messages.xml (if the user's request is in an English locale)
demonstrates this.  Assume that the user's request is in the
en_GB_Cockney locale.  "en" is more specific than "", so if the key
exists in "en" then it will be used.  If there were an "en_GB"
containing the key, it would use that text, and if there were an
"en_GB_Cockney" containing the key then it would prefer that text.
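
On the JRE side you can ask ResourceBundle.Control for the search
order directly.  This is a small sketch, assuming the default Control
(which is what applies when no custom Control is supplied); the class
name, the searchPath() helper and the "messages" base name are only
illustrative.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

public class CandidateLocalesDemo {
    // Lists the locales that would be tried, most specific first.
    static List<Locale> searchPath(Locale requested) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        return control.getCandidateLocales("messages", requested);
    }

    public static void main(String[] args) {
        // The three-argument constructor is deprecated in recent JDKs but
        // still available; it is used here so the sketch runs on older ones.
        Locale cockney = new Locale("en", "GB", "Cockney");
        for (Locale candidate : searchPath(cockney)) {
            // The root locale prints as an empty string, so label it.
            System.out.println(candidate.equals(Locale.ROOT) ? "(root)" : candidate.toString());
        }
    }
}
```

The list runs from en_GB_Cockney through en_GB and en to the root
locale.  I have not checked whether Cocoon exposes an equivalent call.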

DSpace is not doing any of this; it's done by the JRE or by Cocoon.
(That doesn't excuse us from trying to avoid making things even more
complex and difficult, or documenting well the complexities required
by our choices.)

A number of DSpace's components have their own catalogs.  Expect to
see more of this -- there is activity to loosen the coupling among
components to the point that they can be released on separate
schedules, and this will be facilitated by providing for a separate
catalog for each component.

I seem to recall that there is a way to configure XMLUI's default
request locale, but it's different from JSPUI's way.  I don't know the
details.  Apparently we could do a better job of documenting it.

Defaulting the request locale is yet another dimension of the
complexity of localization.  What this defaulting does is to prevent
ever *starting* at the "" locale.  Any user who does not specify a
locale gets the default, so if your site is set to insert a default
"de" locale then that is where such requests start.  The lookup will
still fall back to "" if the key isn't in "de".  "messages.xml"
isn't a default catalog so much as it is a catch-all to try before
giving up and presenting the key itself instead of a message text.
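
A rough sketch of what that defaulting amounts to, in plain Java.
SITE_DEFAULT and effectiveLocale() are hypothetical stand-ins; I am
not reproducing how either UI actually reads its configured default.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

public class DefaultLocaleDemo {
    // Hypothetical site setting; the real configuration mechanism is not shown.
    static final Locale SITE_DEFAULT = Locale.GERMAN;

    // A request that names no locale is given the site default instead.
    static Locale effectiveLocale(Locale requested) {
        boolean unspecified = requested == null || requested.equals(Locale.ROOT);
        return unspecified ? SITE_DEFAULT : requested;
    }

    // The locales searched for a request, most specific first.
    static List<Locale> searchPath(Locale requested) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
        return control.getCandidateLocales("messages", effectiveLocale(requested));
    }

    public static void main(String[] args) {
        System.out.println(searchPath(null).size());                 // de, then root
        System.out.println(searchPath(Locale.CANADA_FRENCH).size()); // fr_CA, fr, root
    }
}
```

Either way the root catalog is the last stop, never the first.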

It's important to recall that the thing being looked up is a specific
key.  A given request is associated with a locale which is tried and
then repeatedly broadened *for each message key presented*.  The
localization mechanism will search the whole path each time a message
text is wanted, until it finds one or runs out of places to look.

[a rant of my own]

It's my thought that we need to ensure that there is a place, or a
well-defined and well-documented sequence of places, which appear
early in the classpath for *every* application within DSpace, into
which one may put overriding versions of message catalogs for *any or
all* DSpace components.  Message texts have no DSpace-defined
behavior, so it should not be necessary to rebuild or even reassemble
any part of DSpace in order to provide additional localizations or
site-specific rewording of any message.  One should be able to simply
drop jspui-messages_en_GB.properties or discovery-messages_de.xml
into a directory and have it preferred when "en_GB" resp. "de" is the
best match to the requested locale.  This would require some rework to
ensure that each component which has its own catalog will always ask
for it by a name unique to that component.
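
Here is a sketch of the idea for the property-file case, to show that
the JRE already does most of the work once the classpath is arranged.
Nothing below exists in DSpace today:  the two temporary directories
stand in for a site override directory and for the catalogs shipped
with the application, and "jspui-messages" is the hypothetical
per-component base name from the paragraph above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.ResourceBundle;

public class OverrideDirDemo {
    // Looks up one key, searching overrideDir before shippedDir.
    static String lookup(Path overrideDir, Path shippedDir, Locale locale, String key) {
        try {
            URL[] searchOrder = { overrideDir.toUri().toURL(), shippedDir.toUri().toURL() };
            try (URLClassLoader loader = new URLClassLoader(searchOrder, null)) {
                // No fallback to the JVM default locale, so results are repeatable.
                ResourceBundle.Control noFallback = ResourceBundle.Control
                    .getNoFallbackControl(ResourceBundle.Control.FORMAT_PROPERTIES);
                return ResourceBundle
                    .getBundle("jspui-messages", locale, loader, noFallback)
                    .getString(key);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates the two directories and runs three lookups against them.
    // The temporary files are left behind; this is only a demonstration.
    static String[] demo() {
        try {
            Path shipped = Files.createTempDirectory("shipped");
            Path override = Files.createTempDirectory("override");
            Files.writeString(shipped.resolve("jspui-messages.properties"),
                "title=Home\nbrowse=Browse\n");
            // The site "drops in" one file; nothing is rebuilt or repackaged.
            Files.writeString(override.resolve("jspui-messages_en_GB.properties"),
                "title=Home (site wording)\n");
            return new String[] {
                lookup(override, shipped, Locale.UK, "title"),
                lookup(override, shipped, Locale.UK, "browse"),
                lookup(override, shipped, Locale.FRENCH, "title")
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String text : demo()) {
            System.out.println(text);
        }
    }
}
```

The en_GB request gets the site's wording for "title" and still
inherits "browse" from the shipped root catalog, while a request in
another locale is unaffected.  The XML catalogs read by Cocoon would
need their own equivalent; I have not sketched that.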

Come to think of it, we don't really need to pack the catalogs into
JARs at all if we arrange the classpath well.  There are data which
are mainly for developers to tweak, which can (and should?) be
squirreled away where only developers can tinker with them, but it
seems to me that message texts should be exposed for easy
customization by each site.  Shouldn't they all just go into:

 [DSpace]
   /config
      /messages
         /jspui.properties
         /xmlui.xml
         /discovery.xml
         /api.properties
            .
            .
            .

-- 
Mark H. Wood, Lead System Programmer   [email protected]
Asking whether markets are efficient is like asking whether people are smart.
