[openstack-dev] [nova] translations gone wild
This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.

However, in trying to track down the root of their messages I ran into things like this:

https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282

So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.

That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.

What's the right way to test / checkpoint on this on a regular basis?

-Sean

--
Sean Dague
http://dague.net

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] translations gone wild
I have some experience in i18n from another project (GNOME), so I have some answers to your questions.

On 02/26/2015 02:18 PM, Sean Dague wrote:
> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>
> However, in trying to track down the root of their messages I ran into things like this:
>
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>
> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.

No, those messages are marked as 'fuzzy'. They were filled in by gettext heuristics, but until a translator goes through them and unmarks them as 'fuzzy', the translations are not applied.

https://www.gnu.org/software/gettext/manual/html_node/Fuzzy-Entries.html

> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.

In theory, an en_US locale may be used to fix typos without touching the translatable message (to avoid leaving the existing messages in other languages untranslated). This approach is usually applied during message freeze, when developers are forbidden to touch the messages but translators still want to remove typos for en_US users.

I don't know how this applies to OpenStack though, since I'm not sure whether we enforce string freeze here, or whether translators care about occasional typos in the en_US UX.

> What's the right way to test / checkpoint on this on a regular basis?
>
> -Sean
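On the question of checking this regularly, one option is to list the entries that carry the 'fuzzy' flag, since those are the ones gettext fills in heuristically and does not apply until a translator clears the flag. The following is only a minimal sketch using the Python standard library: the function name `find_fuzzy_entries` and the `SAMPLE` catalog are made up for illustration (they are not taken from nova), and the parser handles just the `#,` flag comments and `msgid` lines of the PO format without unescaping strings.

```python
# Hypothetical helper, not part of nova or gettext: lists PO entries flagged fuzzy.

def find_fuzzy_entries(po_text):
    """Return (line_number, msgid) for every entry whose flags include 'fuzzy'."""
    entries = []
    fuzzy_pending = False   # a "#, fuzzy" flag was seen for the upcoming entry
    current = None          # [line_number, quoted_parts] of a fuzzy msgid being read
    for lineno, raw in enumerate(po_text.splitlines(), start=1):
        line = raw.strip()
        if line.startswith("#,"):
            # Flag comments look like "#, fuzzy, python-format".
            flags = {flag.strip() for flag in line[2:].split(",")}
            fuzzy_pending = fuzzy_pending or "fuzzy" in flags
            current = None
        elif line.startswith("msgid "):
            current = None
            if fuzzy_pending:
                current = [lineno, [line[len("msgid "):].strip()]]
                entries.append(current)
            fuzzy_pending = False
        elif line.startswith('"') and current is not None:
            # Continuation line of a multi-line msgid.
            current[1].append(line)
        else:
            current = None
    # Drop the surrounding quotes of each part; escapes are left untouched.
    return [(lineno, "".join(part[1:-1] for part in parts))
            for lineno, parts in entries]


# Invented sample catalog, for illustration only.
SAMPLE = '''\
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

#: nova/example.py:10
#, fuzzy, python-format
msgid "Instance %s not found"
msgstr "Image %s not found"

#: nova/example.py:20
#, python-format
msgid "Volume %s attached"
msgstr "Volume %s attached"

#, fuzzy
msgid ""
"Long "
"message"
msgstr "Something else"
'''

if __name__ == "__main__":
    for lineno, msgid in find_fuzzy_entries(SAMPLE):
        print(f"line {lineno}: fuzzy entry for {msgid!r}")
```

A check like this could run over each nova.po file in the locale tree and report fuzzy entries for review. GNU `msgfmt --statistics` reports a fuzzy count as well, for anyone who prefers the gettext tools over a script.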
Re: [openstack-dev] [nova] translations gone wild
2015-02-26 22:27 GMT+09:00 Tom Fifield t...@openstack.org:
> On 26/02/15 21:18, Sean Dague wrote:
>> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>>
>> However, in trying to track down the root of their messages I ran into things like this:
>>
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>>
>> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.
>>
>> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.
>>
>> What's the right way to test / checkpoint on this on a regular basis?
>>
>> -Sean
>
> en_US does not exist on transifex. It existed once by mistake, but was later removed. This is probably why it's in a weird state.
>
> I think that file should be deleted.

The en (or en_US) language is the source language for all translations, so it should not be imported into our git repositories. It is the source language, but if translations exist for it, gettext uses them.

What happened in nova occurred in Horizon one or two years ago. It was due to our misconfigurations in Transifex, and after that en and en_US were removed from Transifex.

If en and en_US locales exist in our git repositories, we can safely delete them and the infra translation script can ignore them.

Thanks,
Akihiro

> Regards,
>
> Tom
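The point that gettext picks up an en_US catalog whenever one exists can be illustrated with the catalog lookup in Python's standard `gettext` module. This is a sketch under stated assumptions: the "nova" domain name and the temporary directory layout are stand-ins, the helper names `find_catalog` and `demo` are hypothetical, and nova's real loading path goes through its own i18n layer, which is not exercised here.

```python
import gettext
import os
import tempfile


def find_catalog(localedir, language, domain="nova"):
    """Return the path of the .mo file gettext would select, or None."""
    return gettext.find(domain, localedir=localedir, languages=[language])


def demo():
    """Show the lookup result before and after an en_US catalog file exists."""
    with tempfile.TemporaryDirectory() as localedir:
        before = find_catalog(localedir, "en_US")

        catalog_dir = os.path.join(localedir, "en_US", "LC_MESSAGES")
        os.makedirs(catalog_dir)
        # In CPython, gettext.find only checks that the file exists and does
        # not parse it, so an empty placeholder is enough for this lookup demo.
        open(os.path.join(catalog_dir, "nova.mo"), "wb").close()

        after = find_catalog(localedir, "en_US")
        relative = os.path.relpath(after, localedir) if after else None
        return before, relative


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)
    print("after: ", after)
```

Once the file is present under en_US/LC_MESSAGES, the lookup returns it, which is why removing the en and en_US trees from the repositories avoids the problem at the source.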
Re: [openstack-dev] [nova] translations gone wild
On 02/26/2015 02:27 PM, Tom Fifield wrote:
> On 26/02/15 21:18, Sean Dague wrote:
>> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>>
>> However, in trying to track down the root of their messages I ran into things like this:
>>
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>>
>> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.
>>
>> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.
>>
>> What's the right way to test / checkpoint on this on a regular basis?
>>
>> -Sean
>
> en_US does not exist on transifex. It existed once by mistake, but was later removed. This is probably why it's in a weird state.
>
> I think that file should be deleted.

Changes sent in to delete it from Glance and Nova; I hope those were all.

Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
Re: [openstack-dev] [nova] translations gone wild
On 26/02/15 21:18, Sean Dague wrote:
> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>
> However, in trying to track down the root of their messages I ran into things like this:
>
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>
> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.
>
> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.
>
> What's the right way to test / checkpoint on this on a regular basis?
>
> -Sean

en_US does not exist on transifex. It existed once by mistake, but was later removed. This is probably why it's in a weird state.

I think that file should be deleted.

Regards,

Tom
Re: [openstack-dev] [nova] translations gone wild
On 02/26/2015 02:18 PM, Sean Dague wrote:
> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>
> However, in trying to track down the root of their messages I ran into things like this:
>
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>
> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.
>
> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.
>
> What's the right way to test / checkpoint on this on a regular basis?

Oh, en_US is not in transifex (our translation tool) at all, so it's safe to remove it.

Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
Re: [openstack-dev] [nova] translations gone wild
On 02/26/2015 08:28 AM, Ihar Hrachyshka wrote:
> I have some experience in i18n from another project (GNOME), so I have some answers to your questions.
>
> On 02/26/2015 02:18 PM, Sean Dague wrote:
>> This morning in the nova channel we were trying to get to the bottom of the unit tests failing for lxsi and gillard in en_GB on some string comparisons. Something is breaking down in our i18n null fixture for the tests.
>>
>> However, in trying to track down the root of their messages I ran into things like this:
>>
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L1410-L1411
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3481-L3485
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L5790-L5793
>> https://github.com/openstack/nova/blob/master/nova/locale/en_US/LC_MESSAGES/nova.po#L3278-L3282
>>
>> So, correct me if I'm wrong, but I think that means that when running in en_US those log messages are going to get overridden. And in many of these cases they are getting overridden with completely unrelated messages.
>
> No, those messages are marked as 'fuzzy'. They were filled in by gettext heuristics, but until a translator goes through them and unmarks them as 'fuzzy', the translations are not applied.
>
> https://www.gnu.org/software/gettext/manual/html_node/Fuzzy-Entries.html

Ok, awesome. Good to learn new things.

>> That seems quite dangerous. Is there a reason that the en_US locale tree exists at all (given that we've historically treated it as the base locale)? It seems like its existence can only cause issues.
>
> In theory, an en_US locale may be used to fix typos without touching the translatable message (to avoid leaving the existing messages in other languages untranslated). This approach is usually applied during message freeze, when developers are forbidden to touch the messages but translators still want to remove typos for en_US users.
>
> I don't know how this applies to OpenStack though, since I'm not sure whether we enforce string freeze here, or whether translators care about occasional typos in the en_US UX.

We do string freeze in theory; in practice I'm not sure how careful things end up being.

-Sean

--
Sean Dague
http://dague.net