Re: [vdsm] environment encoding, LC_ALL and vdsm tests

David Caro Estevez Tue, 25 Jun 2013 02:44:56 -0700

The changes are made in the jobs (vdsm_unit_tests, vdsm_unit_tests_gerrit, 
vdsm_unit_tests_el).


Please let me know when you solve the problem so I can delete those fixes.



----- Original Message -----
> From: "Dan Kenigsberg" <dan...@redhat.com>
> To: "David Caro Estevez" <dcaro...@redhat.com>
> Cc: "Martin Sivak" <msi...@redhat.com>, vdsm-devel@lists.fedorahosted.org
> Sent: Sunday, June 23, 2013 11:06:30 AM
> Subject: Re: environment encoding, LC_ALL and vdsm tests
> 
> On Thu, Jun 20, 2013 at 12:39:22PM -0400, David Caro Estevez wrote:
> > 
> > ----- Original Message -----
> > > From: "Dan Kenigsberg" <dan...@redhat.com>
> > > To: "Martin Sivak" <msi...@redhat.com>, dc...@redhat.com
> > > Cc: vdsm-devel@lists.fedorahosted.org
> > > Sent: Thursday, June 20, 2013 3:08:29 PM
> > > Subject: Re: environment encoding, LC_ALL and vdsm tests
> > > 
> > > On Thu, Jun 20, 2013 at 05:50:16AM -0400, Martin Sivak wrote:
> > > > Hi,
> > > > 
> > > > > recently I discovered an issue with our Jenkins test environment. It was
> > > > > failing in hooksTests.py because my Gerrit name contains diacritics and our
> > > > > code tried to decode it as ascii.
> > > > 
> > > > Traceback (most recent call last):
> > > >   File "/usr/lib64/python2.6/unittest.py", line 278, in run
> > > >     testMethod()
> > > > >   File "/ephemeral0/vdsm_unit_tests_gerrit_el/tests/hooksTests.py", line 125, in test_deviceCustomProperties
> > > > >     params={'customProperty': ' rocks!'})
> > > > >   File "/ephemeral0/vdsm_unit_tests_gerrit_el/vdsm/hooks.py", line 70, in _runHooksDir
> > > > >     scriptenv[k] = unicode(v).encode('utf-8')
> > > > > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12: ordinal not in range(128)
> > > > 
> > > > The relevant code is here:
> > > > 
> > > > hooks.py:
> > > > 
> > > > 60              scriptenv = os.environ.copy()
> > > > ...
> > > > 69              for k, v in scriptenv.iteritems():
> > > > 70                  scriptenv[k] = unicode(v).encode('utf-8')
> > > > 
> > > > My first instinct was to decode it using the proper encoding:
> > > > 
> > > > source_encoding = sys.stdin.encoding or locale.getpreferredencoding()
> > > > for k, v in scriptenv.iteritems():
> > > >     scriptenv[k] = v.decode(source_encoding).encode('utf-8')
> > > > 
> > > > But it still did not work. So I tried to print out the environment and
> > > > encodings that are used when make check is being run and got this:
> > > > 
> > > > sys.stdin.encoding == None
> > > > locale.getpreferredencoding() -> ANSI_X3.4-1968
> > > > os.environ['LC_ALL'] == 'C'
> > > > os.environ['LANG'] == 'en_US.UTF-8'
> > > > 
> > > > > Please notice the encoding part: my system and terminal are using utf-8,
> > > > > but vdsm reads the environment values using ANSI. That is obviously wrong
> > > > > and can't work.
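
The mismatch described above can be reproduced in isolation. The following is a minimal sketch in Python 3 (the thread's code is Python 2.6), assuming a hypothetical value "José Example" and a hypothetical helper decode_env_value; neither comes from vdsm.

```python
import codecs

# Hypothetical environment value; it is not taken from the thread.
raw = "José Example".encode("utf-8")  # the "é" becomes bytes 0xc3 0xa9


def decode_env_value(raw_value, encoding):
    """Decode raw bytes, returning None when the codec rejects them."""
    try:
        return raw_value.decode(encoding)
    except UnicodeDecodeError:
        return None


# ANSI_X3.4-1968 is the name the C locale reports for plain ASCII.
c_locale_codec = codecs.lookup("ANSI_X3.4-1968").name

as_ascii = decode_env_value(raw, c_locale_codec)  # rejected: 0xc3 is not ASCII
as_utf8 = decode_env_value(raw, "utf-8")          # decodes back to the name
```

The same bytes decode cleanly as utf-8 and are rejected by the codec that LC_ALL=C selects, which is consistent with the traceback quoted earlier.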
> > > > 
> > > > > So I tried to investigate it further and found out we force LC_ALL to C
> > > > > in vdsmd.init, run_tests.sh.in and run_tests_local.sh.in.
> > > > 
> > > > I also found the commit that introduced this -
> > > > 107644dbad9af250c00e7f25fc51a92c6250d442 - and finally understood where
> > > > the issue was.
> > > > 
> > > > Although I understand the reasons for the patch, I do not agree with
> > > > it. If we are executing other tools and parse their output, we should
> > > > be preparing and passing the updated locale _only_ to those tools. We
> > > > should not be setting the locale we need for parsing stuff to the
> > > > whole vdsm daemon.
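
The per-tool approach proposed above could look roughly like this. It is only a sketch in Python 3 using the standard subprocess module; run_with_c_locale is a hypothetical helper, not an existing vdsm function, and the child command is a stand-in for a tool whose output would be parsed.

```python
import os
import subprocess
import sys


def run_with_c_locale(argv):
    """Run argv with LC_ALL=C set in the child's environment only."""
    child_env = dict(os.environ)  # a copy; the parent's environment is untouched
    child_env["LC_ALL"] = "C"
    result = subprocess.run(argv, env=child_env,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


parent_lc_all = os.environ.get("LC_ALL")

# Stand-in child process that reports the LC_ALL value it was started with.
child_view = run_with_c_locale(
    [sys.executable, "-c", "import os; print(os.environ['LC_ALL'])"])
```

The trade-off raised in the reply that follows still applies: every call site that parses tool output has to remember to go through such a helper.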
> > > 
> > > > Since vdsm is not intended for direct human control, I actually like the
> > > > idea of turning off all locale noise by a global LC_ALL=C. The
> > > > alternative of setting it to C before each application whose output we
> > > > parse seems tedious and easy to forget.
> > > 
> > > >
> > > > > Our current practice of setting LC_ALL to C regardless of the terminal
> > > > > or system on which we start vdsmd is causing the above-mentioned
> > > > > issue, because the environment can (and does) contain data in the
> > > > > system encoding. This essentially prevents anybody with utf-8 chars in
> > > > > their names from submitting anything to vdsm.
> > > 
> > > No doubt that we have to fix it. The easiest hack is to ask our Jenkins
> > > job to clear the Jenkins env vars before calling `make check`. I'm sure
> > > David (CCed) can do it quite easily.
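
One way to read the mitigation suggested above is to drop every variable whose value is not plain ASCII before the tests start. The sketch below assumes exactly that; the thread does not say how the Jenkins jobs were actually changed, and ascii_only_env and EXAMPLE_OWNER_NAME are hypothetical names used only for illustration.

```python
def ascii_only_env(env):
    """Return a copy of env without entries whose value is not pure ASCII."""
    clean = {}
    for key, value in env.items():
        try:
            value.encode("ascii")
        except UnicodeEncodeError:
            continue  # skip values such as names with diacritics
        clean[key] = value
    return clean


# Hypothetical sample environment; the variable name is made up.
sample = {"LANG": "en_US.UTF-8", "EXAMPLE_OWNER_NAME": "José Example"}
cleaned = ascii_only_env(sample)
```

A job could then pass the cleaned mapping as the env argument of subprocess.call when it invokes make check. This hides the symptom only; the decoding in hooks.py stays as it is.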
> > 
> > Yes, that should be easy. If you decide to do that, it can be done in
> > 30 minutes (the smallest unit of time for a task).
> 
> Please do that, as a quick mitigation of the real problem.
> It *is* important that people can use their real name when contributing
> to vdsm code.
> 
_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
