[openstack-dev] [nova] translations gone wild

2015-02-26 Thread Sean Dague
This morning in the nova channel we were trying to get to the bottom of
the unit tests failing for lxsi and gillard in en_GB on some string
comparisons. Something is breaking down in our i18n null fixture for the
tests.

However, in trying to track down the route of their messages I ran into
things like this:

https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411


https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485

https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793


https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282



So, correct me if I'm wrong, but I think that means that when running in
en_US those log messages are going to get overridden. And in many of
these cases they are getting overridden to completely unrelated messages.

That seems quite dangerous. Is there a reason that the en_US locale tree
exists at all (given that we've treated it as the base locale historically)?
It seems like its existence can only cause issues.

What's the right way to test / checkpoint on this on a regular basis?

-Sean

-- 
Sean Dague
http://dague.net

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Ihar Hrachyshka
I have some experience with i18n from another project (GNOME), so I have
some answers to your questions.

On 02/26/2015 02:18 PM, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the
 bottom of the unit tests failing lxsi and gillard in en_GB on some
 string comparisons. Something is breaking down in our i18n null
 fixture for the tests.
 
 However, in trying to track down the route of their messages I ran
 into things like this:
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411

 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485

  
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793

 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282

 
 
 
 So, correct me if I'm wrong, but I think that means that when
 running in en_US those log messages are going to get overridden.
 And in many of these cases they are getting overridden to
 completely unrelated messages.

No, those messages are marked as 'fuzzy'. They were filled in by
gettext heuristics, but until a translator goes through them and
unmarks them as 'fuzzy', the translations are not applied.

https://www.gnu.org/software/gettext/manual/html_node/Fuzzy-Entries.html
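
As a rough illustration of the fuzzy flag described above, here is a minimal
Python sketch (standard library only) that lists the entries a .po catalog
marks as fuzzy. The sample catalog text, the nova/example.py source
references, and the function name fuzzy_msgids are made up for illustration
and are not taken from nova. The parsing is deliberately simplistic: it only
handles single-line msgid strings and is not a full .po parser.

```python
import re

# Hypothetical catalog content, invented for this example.
SAMPLE_PO = '''\
#: nova/example.py:10
#, fuzzy, python-format
msgid "Instance %s not found"
msgstr "Image %s not found"

#: nova/example.py:20
msgid "Volume attached"
msgstr "Volume attached"
'''


def fuzzy_msgids(po_text):
    """Return msgids of entries whose flag comment ("#,") includes 'fuzzy'."""
    result = []
    # Entries in a .po file are separated by blank lines.
    for entry in re.split(r"\n\s*\n", po_text.strip()):
        lines = entry.splitlines()
        flags = set()
        for line in lines:
            if line.startswith("#,"):
                flags.update(flag.strip() for flag in line[2:].split(","))
        if "fuzzy" not in flags:
            continue
        # Simplification: only single-line msgid strings are recognised.
        for line in lines:
            match = re.match(r'msgid "(.*)"$', line)
            if match:
                result.append(match.group(1))
                break
    return result


print(fuzzy_msgids(SAMPLE_PO))
```

With GNU msgfmt, entries flagged this way are left out of the compiled
catalog unless --use-fuzzy is passed, which is why the mismatched msgstr in
the sample would not be shown to users.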

 
 That seems quite dangerous. Is there a reason that en_US locale
 tree exists at all (given that we've treated it as base locale
 historically)? It seems like its existence can only cause issues.

In theory, an en_US locale may be used to fix typos without touching the
translatable message (to avoid invalidating the existing translations in
other languages). This approach is usually applied during a message
freeze, when developers are forbidden to touch the messages, but
translators still want to remove typos for en_US users.

I don't know how this applies to openstack though, since I'm not sure
whether we enforce string freeze here, and whether translators care
about occasional typos in en_US UX.

 
 What's the right way to test / checkpoint on this on a regular
 basis?
 
 -Sean
 


Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Akihiro Motoki
2015-02-26 22:27 GMT+09:00 Tom Fifield t...@openstack.org:
 On 26/02/15 21:18, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the bottom of
 the unit tests failing lxsi and gillard in en_GB on some string
 comparisons. Something is breaking down in our i18n null fixture for the
 tests.

 However, in trying to track down the route of their messages I ran into
 things like this:

 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411


 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485

 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793


 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282



 So, correct me if I'm wrong, but I think that means that when running in
 en_US those log messages are going to get overridden. And in many of
 these cases they are getting overridden to completely unrelated messages.

 That seems quite dangerous. Is there a reason that en_US locale tree
 exists at all (given that we've treated it as base locale historically).
 It seems like its existence can only cause issues.

 What's the right way to test / checkpoint on this on a regular basis?

   -Sean


 en_US does not exist on transifex. It existed once by mistake, but was
 later removed. This is probably why it's in a weird state. I think that
 file should be deleted.

en (or en_US) is the source language for all translations,
so it should not be imported into our git repositories.
Although it is the source language, if a translation catalog exists for it,
gettext will use it.

The same thing happened in Horizon one or two years ago.
It was due to our misconfiguration in Transifex, and after that en and en_US
were removed from Transifex.

If en and en_US locales exist in our git repositories, we can safely delete them
and the infra translation script can ignore them.
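
One possible way to check for this on a regular basis is a small script
that looks for catalogs under source-language locale directories. The sketch
below is only an assumption about how such a check could look; it is not the
infra translation script, and the names SOURCE_LOCALES and
source_locale_catalogs are hypothetical. The demo builds a throwaway locale
tree in a temporary directory rather than touching a real checkout.

```python
from pathlib import Path
import tempfile

# Assumption: these locale names are treated as the source language.
SOURCE_LOCALES = {"en", "en_US"}


def source_locale_catalogs(locale_root):
    """Return sorted .po paths found under source-language locale dirs."""
    found = []
    for locale_dir in Path(locale_root).iterdir():
        if locale_dir.is_dir() and locale_dir.name in SOURCE_LOCALES:
            found.extend(locale_dir.rglob("*.po"))
    return sorted(found)


if __name__ == "__main__":
    # Build a throwaway locale tree shaped like <root>/<locale>/LC_MESSAGES/.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("en_US", "en_GB", "ja"):
            messages_dir = Path(tmp, name, "LC_MESSAGES")
            messages_dir.mkdir(parents=True)
            (messages_dir / "nova.po").write_text('msgid ""\nmsgstr ""\n')
        for po_path in source_locale_catalogs(tmp):
            print(po_path.relative_to(tmp))
```

A check like this could fail a gate job whenever the returned list is not
empty, so that an en or en_US catalog does not reappear unnoticed.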


Thanks,
Akihiro



 Regards,


 Tom




Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Andreas Jaeger
On 02/26/2015 02:27 PM, Tom Fifield wrote:
 On 26/02/15 21:18, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the bottom of
 the unit tests failing lxsi and gillard in en_GB on some string
 comparisons. Something is breaking down in our i18n null fixture for the
 tests.

 However, in trying to track down the route of their messages I ran into
 things like this:

 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411


 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485

 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793


 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282



 So, correct me if I'm wrong, but I think that means that when running in
 en_US those log messages are going to get overridden. And in many of
 these cases they are getting overridden to completely unrelated messages.

 That seems quite dangerous. Is there a reason that en_US locale tree
 exists at all (given that we've treated it as base locale historically).
 It seems like its existence can only cause issues.

 What's the right way to test / checkpoint on this on a regular basis?

  -Sean
 
 
 en_US does not exist on transifex. It existed once by mistake, but was
 later removed. This is probably why it's in a weird state. I think that
 file should be deleted.

Changes sent in to delete it from Glance and Nova; I hope those were all,

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
   Graham Norton, HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126




Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Tom Fifield
On 26/02/15 21:18, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the bottom of
 the unit tests failing lxsi and gillard in en_GB on some string
 comparisons. Something is breaking down in our i18n null fixture for the
 tests.
 
 However, in trying to track down the route of their messages I ran into
 things like this:
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
 
 
 
 So, correct me if I'm wrong, but I think that means that when running in
 en_US those log messages are going to get overridden. And in many of
 these cases they are getting overridden to completely unrelated messages.
 
 That seems quite dangerous. Is there a reason that en_US locale tree
 exists at all (given that we've treated it as base locale historically).
 It seems like its existence can only cause issues.
 
 What's the right way to test / checkpoint on this on a regular basis?
 
   -Sean


en_US does not exist on transifex. It existed once by mistake, but was
later removed. This is probably why it's in a weird state. I think that
file should be deleted.


Regards,


Tom




Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Andreas Jaeger
On 02/26/2015 02:18 PM, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the bottom of
 the unit tests failing lxsi and gillard in en_GB on some string
 comparisons. Something is breaking down in our i18n null fixture for the
 tests.
 
 However, in trying to track down the route of their messages I ran into
 things like this:
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
 
 
 
 So, correct me if I'm wrong, but I think that means that when running in
 en_US those log messages are going to get overridden. And in many of
 these cases they are getting overridden to completely unrelated messages.
 
 That seems quite dangerous. Is there a reason that en_US locale tree
 exists at all (given that we've treated it as base locale historically).
 It seems like its existence can only cause issues.
 
 What's the right way to test / checkpoint on this on a regular basis?

Oh, en_US is not in Transifex (our translation tool) at all, so it's
safe to remove it,

Andreas
-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
   Graham Norton, HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126




Re: [openstack-dev] [nova] translations gone wild

2015-02-26 Thread Sean Dague
On 02/26/2015 08:28 AM, Ihar Hrachyshka wrote:
 I have some experience with i18n from another project (GNOME), so I have
 some answers to your questions.
 
 On 02/26/2015 02:18 PM, Sean Dague wrote:
 This morning in the nova channel we were trying to get to the
 bottom of the unit tests failing lxsi and gillard in en_GB on some
 string comparisons. Something is breaking down in our i18n null
 fixture for the tests.
 
 However, in trying to track down the route of their messages I ran
 into things like this:
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
 
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
 
 
 
 https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
 
 
 
 
 So, correct me if I'm wrong, but I think that means that when
 running in en_US those log messages are going to get overridden.
 And in many of these cases they are getting overridden to
 completely unrelated messages.
 
 No, those messages are marked as 'fuzzy'. They were filled in by
 gettext heuristics, but without a translator to go thru them and
 unmark them as 'fuzzy', the translations are not applied.
 
 https://www.gnu.org/software/gettext/manual/html_node/Fuzzy-Entries.html

Ok, awesome. Good to learn new things.

 That seems quite dangerous. Is there a reason that en_US locale
 tree exists at all (given that we've treated it as base locale
 historically)? It seems like its existence can only cause issues.
 
 In theory, en_US locale may be used to fix typos without touching the
 translatable message (to avoid getting the existing messages in other
 languages untranslated). This approach is usually applied when in
 message freeze mode, when developers are forbidden to touch the
 messages, but translators still want to remove typos for en_US users.
 
 I don't know how this applies to openstack though, since I'm not sure
 whether we enforce string freeze here, and whether translators care
 about occasional typos in en_US UX.

We do string freeze in theory; in practice I'm not sure how carefully
it ends up being followed.

-Sean

-- 
Sean Dague
http://dague.net


