Re: L10n / I18n

2018-11-10 Thread Woonsan Ko
On Thu, Nov 8, 2018 at 5:30 PM Emmanuel Bourg  wrote:
>
> Le 08/11/2018 à 15:47, Mark Thomas a écrit :
>
> > is that a reason to drop attempts to provide l10n or
> > is it an indication we aren't doing nearly enough?
>
> We can always do more, but considering our limited resources I think
> it's wise to focus first on the most important areas (ie. messages
> displayed in web pages vs internal log messages).

I think we should focus on helping non-English-speaking volunteers so
that they can easily help the community and themselves with
translations. We don't have to do more by ourselves.
I used to contribute translated Java resource bundle files to some
projects such as Wicket, Jetspeed-2, ... I had to manually escape
Unicode strings for my Korean translation resource files until I found
a nice GUI tool which escapes those automatically and shows an
integrated view of the different languages, missing keys, etc.
If there is good tool support (perhaps like what Mark is trying out)
and we say we're willing to take good translation contributions, it
could be a big help. I have not yet seen projects doing that.
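
For anyone who has not had to do that escaping by hand, here is a
minimal Java sketch of what such a tool automates. The names
PropertiesEscaper and escape are made up for illustration and are not
taken from any of the projects mentioned. The sketch only converts
characters outside printable ASCII into the six-character escape form
that Java properties files accept; it does not handle the other special
characters of the properties format.

```java
// Hypothetical helper for illustration; not taken from any project.
final class PropertiesEscaper {

    private PropertiesEscaper() {
    }

    /**
     * Returns the value with every UTF-16 code unit outside printable
     * ASCII replaced by a backslash, the letter u and four hex digits.
     */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c >= 0x20 && c <= 0x7E) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The first two syllables of the Korean greeting, written with
        // escapes so that this source file itself stays plain ASCII.
        String greeting = "\uC548\uB155";
        System.out.println("greeting=" + escape(greeting));
    }
}
```

Run as-is, this should print the key followed by two escapes, one per
Hangul syllable, which is the form I used to type by hand.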

>
>
> > Seriously, we (well, those in the community that speak French
> > fluently - not me) could look to improve those.
>
> FTR I did start working on the French translation for jasper this summer
> but I got dragged to other duties. When looking around the other
> resource bundles I really wondered if it made sense to translate very
> technical messages though.

I don't personally think there is a big difference in general between
non-technical translations and tech-oriented translations once
something is translated.
For example, I translated the HelloWorldPortlet for Apache Portals
before. Even "Hello, World!" was not easy to translate. ;-) "안녕,
세계여!"
I see the point though: when very specific tech terms are translated
without understanding the architecture/design/tech context, the result
can be misleading. I admit I did that many times before.
But I think it's a quality issue we can probably improve
collectively somehow -- like Wikipedia does -- with easy tooling and
support.
So, I like Mark's idea of an option for admins to drop the translated
messages somehow, switching to the English version for example. If it is
easily doable, it could be a good option for both ends.

>
>
> > But that brings us back to your original question of whether the
> > translations are worth it. If (and it is a fairly big if) the
> > translations were mostly complete and mostly of good quality, would that
> > change your view? I'm thinking try and improve the translations as a
> > first step and, if things don't improve, then decide what to do next.
>
> If the translations were nearly complete and of good quality I would
> definitely not suggest dropping them. Yet, I think I'd still configure
> Tomcat to run in English instead of French because I'm used to the
> English terminology.

I've just searched a Korean portal site (e.g., www.daum.net) for
Tomcat errors in Korean. I got many hits. Some people help others
interpret the English messages. All of them are developers. So, I see
value in providing translated internal messages as well.

>
>
> > Removing the translations (apart from the UI) feels to be too big a step
> > to me. That said, I can see how they would be a hindrance rather than a
> > help to some. Perhaps separating the l10n JARs into user facing and
> > external would give more options. Admins could remove the translated JAR
> > for the internal messages and get those messages in English if they
> > prefer. Or we could ship less or even no translations for internal
> > messages by default and provide them as a separate download.
>
> I don't think rearranging the JARs is necessary, it isn't difficult to
> run Tomcat with a different locale. I'm more concerned about the
> maintenance burden and the actual value vs the time invested.

Changing the system locale sometimes affects an application's behavior,
whether that's right or not.
If it's easy enough, the ability to rearrange the JARs might be more
flexible and safer, IMHO.
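
A well-known Java example of this kind of locale sensitivity, given only
as an illustration and not as something observed in Tomcat: case
conversion and number formatting both depend on the locale in use. The
sketch below passes locales explicitly instead of changing the JVM
default, so it behaves the same on any machine. The class name
LocaleSensitivity is invented for this message.

```java
import java.util.Locale;

// Hypothetical demo class; not part of Tomcat.
final class LocaleSensitivity {

    private LocaleSensitivity() {
    }

    /** Lower-cases text using the rules of the given locale. */
    static String lower(String text, Locale locale) {
        return text.toLowerCase(locale);
    }

    /** Formats a number with two decimals using the given locale. */
    static String twoDecimals(double value, Locale locale) {
        return String.format(locale, "%.2f", value);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Turkish maps capital I to a dotless lowercase i, so a
        // comparison against "title" fails under that locale.
        System.out.println(lower("TITLE", Locale.ROOT).equals("title")); // true
        System.out.println(lower("TITLE", turkish).equals("title"));     // false

        // French uses a comma as the decimal separator.
        System.out.println(twoDecimals(1.5, Locale.ROOT));   // 1.50
        System.out.println(twoDecimals(1.5, Locale.FRANCE)); // 1,50
    }
}
```

Code that calls these methods without a locale argument picks up the
JVM default, which is how a locale change made only to get English
messages can alter unrelated behavior.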

Kind regards,

Woonsan

>
> Emmanuel Bourg
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: dev-h...@tomcat.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: [OT] L10n / I18n

2018-11-10 Thread Christopher Schultz

All,

I've tried to post to this thread a few times, but I never see the
messages arriving on the list. Is it moderated? The messages do include
two HTTP URLs, but that hasn't seemed to stop messages in the past...

Thanks,
-chris

On 11/8/18 17:30, Emmanuel Bourg wrote:
> Le 08/11/2018 à 15:47, Mark Thomas a écrit :
> 
>> is that a reason to drop attempts to provide l10n or is it an
>> indication we aren't doing nearly enough?
> 
> We can always do more, but considering our limited resources I
> think it's wise to focus first on the most important areas (ie.
> messages displayed in web pages vs internal log messages).
> 
> 
>> Seriously, we (well, those in the community that speak French 
>> fluently - not me) could look to improve those.
> 
> FTR I did start working on the French translation for jasper this
> summer but I got dragged to other duties. When looking around the
> other resource bundles I really wondered if it made sense to
> translate very technical messages though.
> 
> 
>> But that brings us back to your original question of whether the 
>> translations are worth it. If (and it is a fairly big if) the 
>> translations were mostly complete and mostly of good quality,
>> would that change your view? I'm thinking try and improve the
>> translations as a first step and, if things don't improve, then
>> decide what to do next.
> 
> If the translations were nearly complete and of good quality I
> would definitely not suggest dropping them. Yet, I think I'd still
> configure Tomcat to run in English instead of French because I'm
> used to the English terminology.
> 
> 
>> Removing the translations (apart from the UI) feels to be too big
>> a step to me. That said, I can see how they would be a hindrance
>> rather than a help to some. Perhaps separating the l10n JARs into
>> user facing and external would give more options. Admins could
>> remove the translated JAR for the internal messages and get those
>> messages in English if they prefer. Or we could ship less or even
>> no translations for internal messages by default and provide them
>> as a separate download.
> 
> I don't think rearranging the JARs is necessary, it isn't difficult
> to run Tomcat with a different locale. I'm more concerned about
> the maintenance burden and the actual value vs the time invested.
> 
> Emmanuel Bourg
> 



Re: L10n / I18n

2018-11-08 Thread Emmanuel Bourg
Le 08/11/2018 à 15:47, Mark Thomas a écrit :

> is that a reason to drop attempts to provide l10n or
> is it an indication we aren't doing nearly enough?

We can always do more, but considering our limited resources I think
it's wise to focus first on the most important areas (i.e. messages
displayed in web pages rather than internal log messages).


> Seriously, we (well, those in the community that speak French
> fluently - not me) could look to improve those.

FTR I did start working on the French translation for jasper this summer
but I got dragged to other duties. When looking around the other
resource bundles I really wondered if it made sense to translate very
technical messages though.


> But that brings us back to your original question of whether the
> translations are worth it. If (and it is a fairly big if) the
> translations were mostly complete and mostly of good quality, would that
> change your view? I'm thinking try and improve the translations as a
> first step and, if things don't improve, then decide what to do next.

If the translations were nearly complete and of good quality I would
definitely not suggest dropping them. Yet, I think I'd still configure
Tomcat to run in English instead of French because I'm used to the
English terminology.


> Removing the translations (apart from the UI) feels to be too big a step
> to me. That said, I can see how they would be a hindrance rather than a
> help to some. Perhaps separating the l10n JARs into user facing and
> external would give more options. Admins could remove the translated JAR
> for the internal messages and get those messages in English if they
> prefer. Or we could ship less or even no translations for internal
> messages by default and provide them as a separate download.

I don't think rearranging the JARs is necessary, it isn't difficult to
run Tomcat with a different locale. I'm more concerned about the
maintenance burden and the actual value vs the time invested.

Emmanuel Bourg




Re: L10n / I18n

2018-11-08 Thread Rémy Maucherat
On Thu, Nov 8, 2018 at 11:00 PM Marek Czernek  wrote:

> On 11/8/18 1:16 AM, Emmanuel Bourg wrote:
> > Le 07/11/2018 à 23:36, Mark Thomas a écrit :
> >
> >> WDYT?
> > What about simplifying the issue by dropping the translations of the
> > internal messages and retaining only the user facing messages (things
> > like HTTP error messages that can appear in a normal request) ?
>
> +1 to this. Not sure if I'm the target audience, but I personally get
> quite annoyed at translated error messages into my native language
> (which is not English). When I get a non-English error message, I have to:
>
>   * Think about what it means (because this is so non-standard in things
> like web-browsers)
>   * Translate it into English to the best of my abilities
>   * Search google using my sub-optimal translation, hoping I hit the
> nail on the head, and land in the correct SFO/forum answer
>
> Granted, this is a bit easier with Tomcat - I can grep the source for
> the non-English message, get the key, and grep the key for the full
> English original. Still, that's pretty annoying to me.
>

If you really dislike it, you can also delete your resource bundle from
lib.

>
> Maybe this is different for widely-spoken languages, like Chinese, or
> Spanish, where people might have their own language variant of Stack
> Overflow, and similar forums...
>

Rémy


Re: L10n / I18n

2018-11-08 Thread Marek Czernek

On 11/8/18 1:16 AM, Emmanuel Bourg wrote:
> Le 07/11/2018 à 23:36, Mark Thomas a écrit :
>
>> WDYT?
>
> What about simplifying the issue by dropping the translations of the
> internal messages and retaining only the user facing messages (things
> like HTTP error messages that can appear in a normal request) ?


+1 to this. Not sure if I'm the target audience, but I personally get
quite annoyed by error messages translated into my native language
(which is not English). When I get a non-English error message, I have to:


 * Think about what it means (because this is so non-standard in things
   like web browsers)
 * Translate it into English to the best of my abilities
 * Search Google using my sub-optimal translation, hoping I hit the
   nail on the head, and land on the correct SFO/forum answer

Granted, this is a bit easier with Tomcat - I can grep the source for 
the non-English message, get the key, and grep the key for the full 
English original. Still, that's pretty annoying to me.
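
The grep workflow above amounts to a reverse lookup through two
properties files. A rough Java sketch of that lookup follows, under
loudly stated assumptions: the key demo.start and both message texts are
invented, the two files are inlined as strings instead of being read
from Tomcat's source tree, and the match is exact, so a message whose
placeholders have already been filled in would need a looser comparison.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper; the key and messages below are invented.
final class MessageReverseLookup {

    private MessageReverseLookup() {
    }

    /** Parses properties-file text held in a string. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /**
     * Finds the key whose translated value equals the message exactly,
     * then returns the English value stored under the same key.
     */
    static Optional<String> englishFor(String message, Properties translated, Properties english) {
        for (String key : translated.stringPropertyNames()) {
            if (message.equals(translated.getProperty(key))) {
                return Optional.ofNullable(english.getProperty(key));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Properties english = parse("demo.start=Starting protocol handler\n");
        Properties french = parse("demo.start=Lancement du gestionnaire de protocole\n");
        String seenInLog = "Lancement du gestionnaire de protocole";
        System.out.println(englishFor(seenInLog, french, english).orElse("(no match)"));
    }
}
```

It saves the two greps, but the extra step of getting from the log line
back to something searchable is still there.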


Maybe this is different for widely-spoken languages, like Chinese, or 
Spanish, where people might have their own language variant of Stack 
Overflow, and similar forums...




> I think it's worth considering because:
> - The target audience of Tomcat is mainly developers and administrators
> which are used to read English text.
> - The coverage of the translations is rather low.
> - Maintaining the translations, the quality and the consistency is
> difficult and time consuming.
> - Sometimes the translation of the technical terms are a bit unusual and
> not as clear as the English counterpart. For example in French it isn't
> obvious that "gestionnaire de protocole" relates to ProtocolHandler
> which is an internal Tomcat concept. Other translations are even funny
> like "enrobeur de conteneur" for "wrapper container" (a pastry concept
> applied to a freight container?). This issue is so common with the
> French translation that many messages carry the English terms in
> parentheses to clarify the meaning.
>
> Emmanuel Bourg



--

Marek Czernek

JWS/JBCS Associate Quality Engineer, RHCA



Re: L10n / I18n

2018-11-08 Thread Christopher Schultz

Mark,

On 11/7/18 5:36 PM, Mark Thomas wrote:
> Hi,
> 
> After looking at bug 62843, I got thinking about tools to help
> manage translations from contributors. Something that would show
> the key, the original value and the translated value side by side.
> 
> I looked at Pootle but that was more run it yourself. My preference
> was for something that was hosted. I then looked at POEditor
> (poeditor.com) and that seemed to fit the bill.
> 
> It seems that these tools all expect a single file per language (at
> least the two I have looked at so far do) so I wrote some code to
> merge the LocalStrings.properties files into a single file per
> language (I prefixed the keys with the package name to ensure they
> remained unique).
> 
> Having uploaded these, the tool identified ~20 keys that existed
> in translations but not in the original. Hence the handful of
> commits this afternoon cleaning those up.
> 
> What we are left with is the following:
> 
> French      18%
> German       2%
> Japanese    21%
> Portuguese   1%
> Russian      8%
> Spanish     42%
> 
> (% is the number of keys translated into that language)
> 
> POEditor offer free unlimited plans for OSI approved licensed
> software (that includes Tomcat).
> 
> What I would like to do is announce this on the users list and
> invite contributors to start adding translations - potentially for
> new languages.
> 
> However, there is a catch. How to get the translations back into
> Tomcat? I'll need to write some more code to do this - that isn't
> an issue. The issue is that retaining the current comments and
> ordering in the translated files would be a LOT of work. It would
> be a lot easier if I could just write the keys out in alphabetical
> order with each block (determined by the key value up to the first
> period) separated by a blank line. Would that be acceptable?
> 
> If it is, I'll clean the English files up by hand so that the
> comments are retained for those files.
> 
> WDYT?

It's an offline tool, but I've always used Attesoro[1] for this
purpose. Completely free with full source code.

It handles comments, key-ordering, etc.
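
For reference, the merge step Mark describes above (one file per
language, keys prefixed with the package name) could look roughly like
the Java sketch below. It is not Mark's code: the class name
BundleMerger, the dot used as separator and the org.example package
names are all assumptions, and the per-package bundles are passed in as
maps rather than read from LocalStrings files.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; not the code used for Tomcat.
final class BundleMerger {

    private BundleMerger() {
    }

    /**
     * Merges per-package bundles for one language into a single map.
     * Each key is prefixed with its package name and a dot (an assumed
     * separator) so that equal keys from different packages stay unique.
     */
    static Map<String, String> merge(Map<String, Map<String, String>> bundlesByPackage) {
        Map<String, String> merged = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> bundle : bundlesByPackage.entrySet()) {
            for (Map.Entry<String, String> entry : bundle.getValue().entrySet()) {
                merged.put(bundle.getKey() + "." + entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two invented packages that both define the key "start".
        Map<String, Map<String, String>> french = new LinkedHashMap<>();
        french.put("org.example.a", Map.of("start", "Lancement"));
        french.put("org.example.b", Map.of("start", "Initialisation"));
        merge(french).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Going the other way, back to one file per package, means stripping the
same prefix again, so the separator has to be chosen with that in mind.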

-chris

[1] https://attesoro.org/ and
https://github.com/stephenostermiller/attesoro



Re: L10n / I18n

2018-11-08 Thread Mark Thomas

On 08/11/2018 15:32, Rémy Maucherat wrote:
> On Thu, Nov 8, 2018 at 3:47 PM Mark Thomas wrote:
>
>> On 08/11/2018 00:16, Emmanuel Bourg wrote:
>>> Le 07/11/2018 à 23:36, Mark Thomas a écrit :
>>>
>>>> WDYT?
>>>
>>> What about simplifying the issue by dropping the translations of the
>>> internal messages and retaining only the user facing messages (things
>>> like HTTP error messages that can appear in a normal request) ?
>>
>> That was completely unexpected. Some serious food for thought here.
>
> Same :D
>
>>> I think it's worth considering because:
>>> - The target audience of Tomcat is mainly developers and administrators
>>> which are used to read English text.
>>
>> My primary concern is that it could make Tomcat less accessible to
>> non-native English speakers. Granted all of our documentation, nearly
>> every message on the mailing list, issue tracker, code comments etc are
>> all in English but is that a reason to drop attempts to provide l10n or
>> is it an indication we aren't doing nearly enough?
>>
>> The irony of asking that question on a mailing list where messages are
>> in English hasn't escaped me.
>
> Yes, the question is: is it ever going to be possible to use Tomcat for
> someone who doesn't understand English at all (since the required level
> is pretty low)?
>
> However, whatever happens, using a string bundle should remain mandatory.
>
>>> - The coverage of the translations is rather low.
>>
>> They tend to get done once at a point in time where they are close to
>> 100% and then tail off as the code is refactored and new features added.
>>
>>> - Maintaining the translations, the quality and the consistency is
>>> difficult and time consuming.
>>
>> There has been very little of this. I recall a big donation of Spanish
>> translations, the recent Russian additions and then, apart from those,
>> the odd typo fix here and there. My hope was that, with a tool like
>> POEditor, it would be easier for contributors to improve and/or add to
>> the translations.
>>
>>> - Sometimes the translation of the technical terms are a bit unusual and
>>> not as clear as the English counterpart. For example in French it isn't
>>> obvious that "gestionnaire de protocole" relates to ProtocolHandler
>>> which is an internal Tomcat concept. Other translations are even funny
>>> like "enrobeur de conteneur" for "wrapper container" (a pastry concept
>>> applied to a freight container?). This issue is so common with the
>>> French translation that many messages carry the English terms in
>>> parentheses to clarify the meaning.
>>
>> If my French was a lot better, I might just start reading the
>> translations to enjoy the humour. Seriously, we (well, those in the
>> community that speak French fluently - not me) could look to improve
>> those. I think it might be better to not translate class/interface names
>> like "ProtocolHandler", "Realm" or maybe put the translations in
>> brackets. As for the shipping container wrapped in pastry, I assume a
>> better translation is possible.
>
> It might sound funny, but the pastry thing is correct.
>
>> But that brings us back to your original question of whether the
>> translations are worth it. If (and it is a fairly big if) the
>> translations were mostly complete and mostly of good quality, would that
>> change your view? I'm thinking try and improve the translations as a
>> first step and, if things don't improve, then decide what to do next.
>
> Ok for trying !


Just a reminder. My current plan for importing translated strings means 
that all the LocalStrings_xx.properties files (i.e. the translated files 
but not the English versions) will get re-written meaning:

- all comments will be lost
- all entries will be in alphabetical order
- groups (defined by the key value up to the first period) will be
  separated by a blank line
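
To make the proposed layout concrete, here is a minimal Java sketch of
such a writer. It is an illustration only, not the import code itself:
SortedBundleWriter and its methods are hypothetical names, the sample
keys are invented, and values are written verbatim, so the escaping a
real properties writer needs is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the proposed output layout; not Tomcat code.
final class SortedBundleWriter {

    private SortedBundleWriter() {
    }

    /** Returns the part of the key before the first period, or the whole key. */
    static String group(String key) {
        int dot = key.indexOf('.');
        return dot < 0 ? key : key.substring(0, dot);
    }

    /**
     * Renders entries in natural String order of their keys and starts
     * a new block, separated by a blank line, whenever the group changes
     * between consecutive keys. No comments are written.
     */
    static String render(Map<String, String> entries) {
        StringBuilder out = new StringBuilder();
        String previousGroup = null;
        for (Map.Entry<String, String> entry : new TreeMap<>(entries).entrySet()) {
            String currentGroup = group(entry.getKey());
            if (previousGroup != null && !previousGroup.equals(currentGroup)) {
                out.append('\n');
            }
            out.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
            previousGroup = currentGroup;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented keys, given out of order on purpose.
        System.out.print(render(Map.of(
                "beta.stop", "Stopping",
                "alpha.start", "Starting",
                "alpha.init", "Initializing")));
    }
}
```

With the sample input, the two alpha keys come out first as one block,
followed by a blank line and the beta key.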

Mark



>> Removing the translations (apart from the UI) feels to be too big a step
>> to me. That said, I can see how they would be a hindrance rather than a
>> help to some. Perhaps separating the l10n JARs into user facing and
>> external would give more options. Admins could remove the translated JAR
>> for the internal messages and get those messages in English if they
>> prefer. Or we could ship less or even no translations for internal
>> messages by default and provide them as a separate download.
>
> "apart from the UI", like the manager webapp? It might sound obvious, but
> if Tomcat is not usable by someone who doesn't understand English, then it
> will not add any value but only add confusing translations depending on the
> language configured in the user browser.
>
> Rémy






Re: L10n / I18n

2018-11-08 Thread Rémy Maucherat
On Thu, Nov 8, 2018 at 3:47 PM Mark Thomas  wrote:

> On 08/11/2018 00:16, Emmanuel Bourg wrote:
> > Le 07/11/2018 à 23:36, Mark Thomas a écrit :
> >
> >> WDYT?
> >
> > What about simplifying the issue by dropping the translations of the
> > internal messages and retaining only the user facing messages (things
> > like HTTP error messages that can appear in a normal request) ?
>
> That was completely unexpected. Some serious food for thought here.
>

Same :D

>
> > I think it's worth considering because:
> > - The target audience of Tomcat is mainly developers and administrators
> > which are used to read English text.
>
> My primary concern is that it could make Tomcat less accessible to
> non-native English speakers. Granted all of our documentation, nearly
> every message on the mailing list, issue tracker, code comments etc are
> all in English but is that a reason to drop attempts to provide l10n or
> is it an indication we aren't doing nearly enough?
>
> The irony of asking that question on a mailing list where messages are
> in English hasn't escaped me.
>

Yes, the question is: is it ever going to be possible to use Tomcat for
someone who doesn't understand English at all (since the required level
is pretty low)?

However, whatever happens, using a string bundle should remain mandatory.

>
> > - The coverage of the translations is rather low.
>
> They tend to get done once at a point in time where they are close to
> 100% and then tail off as the code is refactored and new features added.
>
> > - Maintaining the translations, the quality and the consistency is
> > difficult and time consuming.
>
> There has been very little of this. I recall a big donation of Spanish
> translations, the recent Russian additions and then, apart from those,
> the odd typo fix here and there. My hope was that, with a tool like
> POEditor, it would be easier for contributors to improve and/or add to
> the translations.
>
> > - Sometimes the translation of the technical terms are a bit unusual and
> > not as clear as the English counterpart. For example in French it isn't
> > obvious that "gestionnaire de protocole" relates to ProtocolHandler
> > which is an internal Tomcat concept. Other translations are even funny
> > like "enrobeur de conteneur" for "wrapper container" (a pastry concept
> > applied to a freight container?). This issue is so common with the
> > French translation that many messages carry the English terms in
> > parentheses to clarify the meaning.
>
> If my French was a lot better, I might just start reading the
> translations to enjoy the humour. Seriously, we (well, those in the
> community that speak French fluently - not me) could look to improve
> those. I think it might be better to not translate class/interface names
> like "ProtocolHandler", "Realm" or maybe put the translations in
> brackets. As for the shipping container wrapped in pastry, I assume a
> better translation is possible.
>

It might sound funny, but the pastry thing is correct.

>
> But that brings us back to your original question of whether the
> translations are worth it. If (and it is a fairly big if) the
> translations were mostly complete and mostly of good quality, would that
> change your view? I'm thinking try and improve the translations as a
> first step and, if things don't improve, then decide what to do next.
>

Ok for trying !

>
> Removing the translations (apart from the UI) feels to be too big a step
> to me. That said, I can see how they would be a hindrance rather than a
> help to some. Perhaps separating the l10n JARs into user facing and
> external would give more options. Admins could remove the translated JAR
> for the internal messages and get those messages in English if they
> prefer. Or we could ship less or even no translations for internal
> messages by default and provide them as a separate download.
>

"apart from the UI", like the manager webapp? It might sound obvious, but
if Tomcat is not usable by someone who doesn't understand English, then it
will not add any value but only add confusing translations depending on the
language configured in the user browser.

Rémy


Re: L10n / I18n

2018-11-08 Thread Mark Thomas
On 08/11/2018 00:16, Emmanuel Bourg wrote:
> Le 07/11/2018 à 23:36, Mark Thomas a écrit :
> 
>> WDYT?
> 
> What about simplifying the issue by dropping the translations of the
> internal messages and retaining only the user facing messages (things
> like HTTP error messages that can appear in a normal request) ?

That was completely unexpected. Some serious food for thought here.

> I think it's worth considering because:
> - The target audience of Tomcat is mainly developers and administrators
> which are used to read English text.

My primary concern is that it could make Tomcat less accessible to
non-native English speakers. Granted, all of our documentation, nearly
every message on the mailing list, issue tracker, code comments, etc. are
all in English, but is that a reason to drop attempts to provide l10n or
is it an indication we aren't doing nearly enough?

The irony of asking that question on a mailing list where messages are
in English hasn't escaped me.

> - The coverage of the translations is rather low.

They tend to get done once, at a point in time where they are close to
100%, and then tail off as the code is refactored and new features are
added.

> - Maintaining the translations, the quality and the consistency is
> difficult and time consuming.

There has been very little of this. I recall a big donation of Spanish
translations, the recent Russian additions and then, apart from those,
the odd typo fix here and there. My hope was that, with a tool like
POEditor, it would be easier for contributors to improve and/or add to
the translations.

> - Sometimes the translation of the technical terms are a bit unusual and
> not as clear as the English counterpart. For example in French it isn't
> obvious that "gestionnaire de protocole" relates to ProtocolHandler
> which is an internal Tomcat concept. Other translations are even funny
> like "enrobeur de conteneur" for "wrapper container" (a pastry concept
> applied to a freight container?). This issue is so common with the
> French translation that many messages carry the English terms in
> parentheses to clarify the meaning.

If my French was a lot better, I might just start reading the
translations to enjoy the humour. Seriously, we (well, those in the
community that speak French fluently - not me) could look to improve
those. I think it might be better to not translate class/interface names
like "ProtocolHandler", "Realm" or maybe put the translations in
brackets. As for the shipping container wrapped in pastry, I assume a
better translation is possible.

But that brings us back to your original question of whether the
translations are worth it. If (and it is a fairly big if) the
translations were mostly complete and mostly of good quality, would that
change your view? I'm thinking try and improve the translations as a
first step and, if things don't improve, then decide what to do next.

Removing the translations (apart from the UI) feels like too big a step
to me. That said, I can see how they would be a hindrance rather than a
help to some. Perhaps separating the l10n JARs into user facing and
internal would give more options. Admins could remove the translated JAR
for the internal messages and get those messages in English if they
prefer. Or we could ship fewer or even no translations for internal
messages by default and provide them as a separate download.

Mark

-
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org



Re: L10n / I18n

2018-11-07 Thread Emmanuel Bourg
Le 07/11/2018 à 23:36, Mark Thomas a écrit :

> WDYT?

What about simplifying the issue by dropping the translations of the
internal messages and retaining only the user facing messages (things
like HTTP error messages that can appear in a normal request)?

I think it's worth considering because:
- The target audience of Tomcat is mainly developers and administrators
who are used to reading English text.
- The coverage of the translations is rather low.
- Maintaining the translations, the quality and the consistency is
difficult and time consuming.
- Sometimes the translations of the technical terms are a bit unusual and
not as clear as the English counterpart. For example in French it isn't
obvious that "gestionnaire de protocole" relates to ProtocolHandler
which is an internal Tomcat concept. Other translations are even funny
like "enrobeur de conteneur" for "wrapper container" (a pastry concept
applied to a freight container?). This issue is so common with the
French translation that many messages carry the English terms in
parentheses to clarify the meaning.

Emmanuel Bourg




L10n / I18n

2018-11-07 Thread Mark Thomas
Hi,

After looking at bug 62843, I got thinking about tools to help manage
translations from contributors. Something that would show the key, the
original value and the translated value side by side.

I looked at Pootle but that was more of a run-it-yourself option. My
preference was for something that was hosted. I then looked at POEditor
(poeditor.com) and that seemed to fit the bill.

It seems that these tools all expect a single file per language (at least
the two I have looked at so far do) so I wrote some code to merge the
LocalStrings.properties files into a single file per language (I prefixed
the keys with the package name to ensure they remained unique).
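As an illustration only (this is not the code referred to above), the merge
step could be sketched roughly as follows. The class name, the "." separator
between package name and key, and the in-memory map input are all
assumptions made for the example; reading the files from disk is left out.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: not part of Tomcat.
final class BundleMerger {

    /**
     * Merges the bundles of one language into a single map.
     *
     * @param bundlesByPackage package name -> (message key -> message text)
     * @return one map whose keys are "packageName.messageKey"
     */
    static Map<String, String> merge(Map<String, Map<String, String>> bundlesByPackage) {
        Map<String, String> merged = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> bundle : bundlesByPackage.entrySet()) {
            String packageName = bundle.getKey();
            for (Map.Entry<String, String> message : bundle.getValue().entrySet()) {
                // Assumption: "." joins prefix and key. The same key may appear in
                // two packages, so the prefix is what keeps merged keys unique.
                merged.put(packageName + "." + message.getKey(), message.getValue());
            }
        }
        return merged;
    }
}
```

Because message keys themselves contain periods, splitting a merged key back
into package and key would need the list of known package names (or a
different separator).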

Having uploaded these, the tool identified ~20 keys that existed in
translations but not in the original. Hence the handful of commits this
afternoon cleaning those up.
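The check the tool performed amounts to a set difference. A minimal sketch,
assuming the key sets have already been loaded (the class and method names
are hypothetical):

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: not part of Tomcat or POEditor.
final class OrphanKeyFinder {

    /**
     * Returns the keys that appear in a translation but no longer exist
     * in the original (English) bundle, in sorted order.
     */
    static Set<String> orphanKeys(Set<String> originalKeys, Set<String> translatedKeys) {
        Set<String> orphans = new TreeSet<>(translatedKeys);
        orphans.removeAll(originalKeys);
        return orphans;
    }
}
```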

What we are left with is the following:

French      18%
German       2%
Japanese    21%
Portuguese   1%
Russian      8%
Spanish     42%

(% is the percentage of keys translated into that language)
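For reference, a figure of this kind can be computed as below. This is a
sketch under the assumption that only keys still present in the original
count as translated; how POEditor actually computes its percentage is not
stated here, and the names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: not part of Tomcat or POEditor.
final class TranslationCoverage {

    /**
     * Percentage (rounded to the nearest integer) of original keys
     * that have a translation. Returns 0 when there are no original keys.
     */
    static int coveragePercent(Set<String> originalKeys, Set<String> translatedKeys) {
        if (originalKeys.isEmpty()) {
            return 0;
        }
        // Ignore translated keys that no longer exist in the original.
        Set<String> translated = new HashSet<>(translatedKeys);
        translated.retainAll(originalKeys);
        return Math.round(100f * translated.size() / originalKeys.size());
    }
}
```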

POEditor offers free unlimited plans for OSI-approved licensed software
(that includes Tomcat).

What I would like to do is announce this on the users list and invite
contributors to start adding translations - potentially for new languages.

However, there is a catch. How to get the translations back into Tomcat?
I'll need to write some more code to do this - that isn't an issue. The
issue is that retaining the current comments and ordering in the
translated files would be a LOT of work. It would be a lot easier if I
could just write the keys out in alphabetical order with each block
(determined by the key value up to the first period) separated by a
blank line. Would that be acceptable?
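To make the proposed layout concrete, here is a rough sketch of such a
writer. It is an illustration, not the export code itself: the class name is
hypothetical, it returns a string instead of writing a file, and it skips
the escaping a real .properties file needs (non-ASCII characters, leading
spaces, and so on).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: not part of Tomcat.
final class SortedBundleWriter {

    /**
     * Formats messages as key=value lines in sorted key order, with a blank
     * line between blocks. A block is every key sharing the same text up to
     * the first period.
     */
    static String format(Map<String, String> messages) {
        StringBuilder out = new StringBuilder();
        String previousBlock = null;
        // TreeMap sorts by String's natural order (code unit order), so
        // upper case letters sort before lower case ones.
        for (Map.Entry<String, String> entry : new TreeMap<>(messages).entrySet()) {
            String key = entry.getKey();
            int dot = key.indexOf('.');
            String block = dot < 0 ? key : key.substring(0, dot);
            if (previousBlock != null && !block.equals(previousBlock)) {
                out.append('\n'); // blank line between blocks
            }
            // Simplification: no escaping of the key or the value.
            out.append(key).append('=').append(entry.getValue()).append('\n');
            previousBlock = block;
        }
        return out.toString();
    }
}
```

Sorting first guarantees that all keys of a block end up next to each other,
so a single pass comparing each key's block with the previous one is enough
to place the blank lines.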

If it is, I'll clean the English files up by hand so that the comments
are retained for those files.

WDYT?

Mark
