Hi Craig,
> How about, as a compromise, we emulate the java.util.ResourceBundle family and
> have it accept input information either way (properties files or XML files)?
> This would be pretty simple to implement.
>
> The reason for this is that I've heard just as many complaints about the
> property file format, especially in having to remember to run native2ascii on
> them -- to say nothing of some translators that screw up the format with their
> text editors.
It seems to me that your proposed solution just replaces "native2ascii" with
a "property2xml" processor (and don't forget to write the corresponding
"xml2property", to match the already existing "ascii2native").
I don't know which platform the translators you work with are using, but
finding a text editor that uses a local character set is still easier than
finding a Unicode or an XML one... And remember that the majority of translators
don't use high-end workstations.
Do you think translators will use XML editors? They will just continue using
their word processor and screw up your XML file. And hunting for a forgotten /
character to close a tag can have much more impact, and waste more time, than
forgetting to run native2ascii.
You suggested in another post to run a script to "concatenate" (or include)
different translated XML files. Don't you think that this solution is better
applied to the native2ascii problem? And I can tell you that concatenating
property files is a lot easier than concatenating XML fragment files ;-)
What are the present benefits of Message Bundles (this is not
Struts-specific)?
- They use a simple syntax, and you can explain it in 2 minutes to a new
translator.
- This file format/syntax is used by many other software packages, even
Tomcat! ;-) (they have been in place for many years...). So generally the
2-minute briefing of the translator is not even necessary...
- Information is not context sensitive. You can decide how to order the
elements in the file, either lexicographically by key (like a dictionary),
or by module/class access, or first-in-first-out to track changes.
- They must contain ISO-8859-1 text, so you can manage them with the
majority of tools presently available, even in old databases with 7-bit
ASCII support.
- The "native2ascii" and "ascii2native" tools allow lossless conversion
between the native format and ISO-8859-1.
- They can handle *big* blocks of text, and you don't have to manage special
characters (like ", <, > or />). Due to their simple syntax, breaking a line
rarely introduces unwanted blank space.
- They are small (i.e. they don't introduce redundant, useless
information).
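
To illustrate the lossless conversion point, here is a minimal Java sketch.
It is not the native2ascii tool itself: escapeNonAscii and loadValue are
hypothetical helpers written for this example, and the escaping only
approximates what native2ascii produces. It also assumes a JDK where
Properties.load(Reader) is available (Java 6 or later).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class EscapeDemo {

    // Hypothetical helper: rewrites every character outside 7-bit ASCII
    // as a backslash-u escape with four hex digits.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // Hypothetical helper: parses property-file text and returns one value.
    // Properties decodes the escapes back to the original characters.
    static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // A Russian greeting, spelled with escapes so that this source file
        // compiles regardless of its own encoding.
        String russian = "\u041f\u0440\u0438\u0432\u0435\u0442";
        String escapedLine = escapeNonAscii("greeting=" + russian);
        System.out.println(escapedLine);
        // The value read back is identical to the original text.
        System.out.println(russian.equals(loadValue(escapedLine, "greeting")));
    }
}
```

The escaped line contains only ASCII characters, which is why such a file
survives tools that know nothing about the translator's character set.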
Apart from the serialization problem on *certain* servlet containers, I can
add:
- They support a well-defined naming convention, so you don't need to open
the file to know the language used.
- Based on that file naming, there's a well-defined search path for offering
the user a language that matches their request
(http://java.sun.com/j2se/1.3/docs/api/java/util/ResourceBundle.html).
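
The search path based on file naming can be sketched as follows. This is
only an illustration: it relies on ResourceBundle.Control, which was added
in Java 6 and so is not part of the J2SE 1.3 API linked above, and
BundleLookupDemo, candidateBundleNames and the "MyApp" base name are
assumptions made for the example. The real lookup may also fall back to the
JVM default locale, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

public class BundleLookupDemo {

    // Lists the property file names that would be considered for a locale,
    // from the most specific one down to the base bundle.
    static List<String> candidateBundleNames(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            names.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        // For Canadian French this lists MyApp_fr_CA.properties,
        // MyApp_fr.properties and MyApp.properties, in that order.
        System.out.println(candidateBundleNames("MyApp", Locale.CANADA_FRENCH));
    }
}
```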
At the present time, localization data for applications doesn't need any
other information attached to each message. At the speed applications and
pages are changing on the Web, translators don't need to keep comments or
references attached to each message translation (and in case they do, they
can add #comments).
A problem with property files in "exotic" languages is that you can't tell
Russian content from Korean, whether you look at the file in its local
character encoding or at the "native2ascii"-processed file. But your
proposal doesn't solve that problem either. And with the property file
naming, you at least get some hints about the language used...
That's the reason I think your proposal of XML for Struts localized strings
is useless. I would have a different opinion if we needed to exchange
translated documents (not data!) between translators' research centers or
offices, and manage bibliographical references and annotations, where XML
(and of course SGML) would have a role to play. But not here...
What are the benefits of your suggested syntax?
Wait a few hours until people in non-US time zones, in countries where locale
management is a real need, have a look at it, and I think my vehement
reaction will not be the only one.
Now, to be constructive on the subject.
I don't think Struts needs 2 message bundle file formats. What Struts does
need is
(1) to solve the problem with serialization (and I don't think it's
ResourceBundle-specific, so it can appear with other classes, in other
applications, on other servers...)
(2) to offer better support for internationalization.
Ways to solve (1):
- (General) How does BEA suggest solving it when you need to share an object
whose source you don't control?
- (Future general) Amend the servlet specification document to completely
specify the accepted behavior. And what about Java 2 1.4 for ResourceBundle?
- (Local to Struts) Write a MessageBundle class that implements the
Serializable interface or, less cleanly (but this is a problem that doesn't
concern everybody), don't cache messages, so you don't need an
application-scope object. We can add an option attribute to activate this
behavior for the BEA users meanwhile ;-)
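
As a rough sketch of the "Local to Struts" option, a serializable wrapper
could store only the bundle base name and the locale, and reload the
underlying ResourceBundle on demand. This is not existing Struts code:
SerializableMessages and its methods are hypothetical names, "MyApp" is an
assumed base name, and roundTrip exists only to simulate a container
serializing the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;
import java.util.ResourceBundle;

public class SerializableMessages implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String baseName;
    private final Locale locale;
    // Not serialized: it is null after deserialization and reloaded lazily.
    private transient ResourceBundle bundle;

    public SerializableMessages(String baseName, Locale locale) {
        this.baseName = baseName;
        this.locale = locale;
    }

    public String getBaseName() { return baseName; }

    public Locale getLocale() { return locale; }

    // Loads the bundle on first use; throws MissingResourceException if no
    // matching bundle or key can be found on the classpath.
    public synchronized String getMessage(String key) {
        if (bundle == null) {
            bundle = ResourceBundle.getBundle(baseName, locale);
        }
        return bundle.getString(key);
    }

    // Simulates what a container does when it serializes the object.
    static SerializableMessages roundTrip(SerializableMessages messages) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(messages);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableMessages) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableMessages copy = roundTrip(new SerializableMessages("MyApp", Locale.FRENCH));
        System.out.println(copy.getBaseName() + " / " + copy.getLocale());
    }
}
```

Only the two small fields travel with the object, so the non-serializable
ResourceBundle never has to be written out.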
Ways to solve (2):
- Add localization information in the struts-config.xml file. If one wants
to override the ResourceBundle file search, we could have your proposed
syntax with adapted rules.
  <struts-messages default="en_US">
    <locale locale="en_US"
            xml:lang="en-US"
            file="MyApp.en" />
    <locale locale="es"
            xml:lang="es"
            file="MyApp.es" />
  </struts-messages>
- That way, an application can ask the configuration which languages are
supported. You can write a <selectLanguage /> tag that lists the languages
and changes the default locale for the user.
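
To give an idea of how an application could ask the configuration which
languages are supported, here is a minimal Java sketch that reads the
<struts-messages> fragment suggested above with the standard DOM parser. The
element and attribute names come from that suggestion only, not from any
actual struts-config.xml DTD, and SupportedLocalesDemo and supportedLocales
are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SupportedLocalesDemo {

    // Returns locale name -> message file, in the order declared in the XML.
    static Map<String, String> supportedLocales(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("locale");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element element = (Element) nodes.item(i);
                result.put(element.getAttribute("locale"), element.getAttribute("file"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read the locale configuration", e);
        }
    }

    public static void main(String[] args) {
        // Same content as the fragment above, with single-quoted attributes
        // to keep the Java string readable.
        String xml =
                "<struts-messages default='en_US'>"
              + "<locale locale='en_US' xml:lang='en-US' file='MyApp.en'/>"
              + "<locale locale='es' xml:lang='es' file='MyApp.es'/>"
              + "</struts-messages>";
        // A <selectLanguage /> tag could iterate over these keys.
        System.out.println(supportedLocales(xml).keySet());
    }
}
```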
I'm in a hurry. More comments tonight.
Pierre Métras