----- Original Message -----
> From: "Dan Kenigsberg" <dan...@redhat.com>
> To: "Martin Sivak" <msi...@redhat.com>, dc...@redhat.com
> Cc: vdsm-devel@lists.fedorahosted.org
> Sent: Thursday, June 20, 2013 3:08:29 PM
> Subject: Re: environment encoding, LC_ALL and vdsm tests
> 
> On Thu, Jun 20, 2013 at 05:50:16AM -0400, Martin Sivak wrote:
> > Hi,
> > 
> > recently I discovered an issue with our Jenkins test environment. It was
> > failing in hooksTests.py because my Gerrit name contains diacritics and our
> > code tried to decode it as ascii.
> > 
> > Traceback (most recent call last):
> >   File "/usr/lib64/python2.6/unittest.py", line 278, in run
> >     testMethod()
> >   File "/ephemeral0/vdsm_unit_tests_gerrit_el/tests/hooksTests.py", line
> >   125, in test_deviceCustomProperties
> >     params={'customProperty': ' rocks!'})
> >   File "/ephemeral0/vdsm_unit_tests_gerrit_el/vdsm/hooks.py", line 70, in
> >   _runHooksDir
> >     scriptenv[k] = unicode(v).encode('utf-8')
> > UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 12:
> > ordinal not in range(128)
> > 
> > The relevant code is here:
> > 
> > hooks.py:
> > 
> > 60          scriptenv = os.environ.copy()
> > ...
> > 69          for k, v in scriptenv.iteritems():
> > 70              scriptenv[k] = unicode(v).encode('utf-8')
> > 
> > My first instinct was to decode it using the proper encoding:
> > 
> > source_encoding = sys.stdin.encoding or locale.getpreferredencoding()
> > for k, v in scriptenv.iteritems():
> >     scriptenv[k] = v.decode(source_encoding).encode('utf-8')
> > 
> > But it still did not work. So I tried to print out the environment and
> > encodings that are used when make check is being run and got this:
> > 
> > sys.stdin.encoding == None
> > locale.getpreferredencoding() -> ANSI_X3.4-1968
> > os.environ['LC_ALL'] == 'C'
> > os.environ['LANG'] == 'en_US.UTF-8'
> > 
> > Please notice the encoding part: my system and terminal are using utf-8,
> > but vdsm reads the environment values as ANSI_X3.4-1968 (i.e. ASCII). That
> > is obviously wrong and can't work.
> > 
> > So I tried to investigate it further and found out we force LC_ALL to C in
> > vdsmd.init, run_tests.sh.in and run_tests_local.sh.in.
> > 
> > I also found the commit that introduced this -
> > 107644dbad9af250c00e7f25fc51a92c6250d442 - and finally understood where
> > the issue was.
> > 
> > Although I understand the reasons for the patch, I do not agree with
> > it. If we are executing other tools and parse their output, we should
> > be preparing and passing the updated locale _only_ to those tools. We
> > should not be setting the locale we need for parsing stuff to the
> > whole vdsm daemon.
> 
> Since vdsm is not intended for direct human control, I actually like the
> idea of turning off all locale noise by a global LC_ALL=C. The
> alternative, setting it to C before each application whose output we
> parse, seems tedious and easy to forget.
> 
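For comparison, a minimal sketch of the per-command alternative discussed above: a small wrapper that forces LC_ALL=C only for the child process and leaves the daemon's own locale untouched. The helper name `run_with_c_locale` is hypothetical (it is not existing vdsm code), and the sketch assumes Python 3 rather than the Python 2.6 shown in the traceback.

```python
import os
import subprocess
import sys


def run_with_c_locale(cmd):
    """Run cmd with LC_ALL=C set only in the child's environment.

    Hypothetical helper for illustration; not part of vdsm.
    """
    env = os.environ.copy()   # a copy, so the parent's environment is not modified
    env['LC_ALL'] = 'C'       # untranslated, stable output for parsing
    return subprocess.check_output(cmd, env=env)


# Usage: the child reports the locale it was started with.
child_locale = run_with_c_locale(
    [sys.executable, '-c', 'import os; print(os.environ["LC_ALL"])'])
```

Routing every parsed command through one such wrapper would keep the "n+1 places" down to a single one, at the cost of having to remember to use it.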
> >
> > Our current practice of setting LC_ALL to C, no matter on which terminal
> > or system we start vdsmd, is causing the above-mentioned issue, because
> > the environment can (and does) contain data in the system encoding. This
> > essentially prevents anybody with utf-8 chars in their name from
> > submitting anything to vdsm.
> 
> No doubt that we have to fix it. The easiest hack is to ask our Jenkins
> job to clear the Jenkins env vars before calling `make check`. I'm sure
> David (CCed) can do it quite easily.

Yes, that should be easy. If you decide to do that, it can be done in 30 min
(the smallest fraction of time for a task).

> 
> >
> > So I would like to start a discussion about this that will lead to the
> > necessary fixes and change in our current practice :)
> 
> Unfortunately, I have no idea beyond exterminating non-7-bit chars from
> the environment and setting LC_ALL=C in n+1 places.
> 
> The first approach may not be as horrible as it seems: I'm not sure we
> should pass every vdsm env variable to the hook scripts. Passing only
> ascii ones may be good enough.
> 
> Obviously, unicode custom properties should continue to be explicitly
> added, with utf-8 encoding, to the script environment, as this is a
> documented vdsm API.
> 
> Dan.
> 
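A minimal sketch of the first approach above, assuming Python 3: inherited variables are passed to the hook script only if they are pure ASCII, while custom properties are always added explicitly, encoded as utf-8. The function name `build_script_env` and the sample variable names are hypothetical and not taken from hooks.py.

```python
def build_script_env(base_env, custom_properties):
    """Build a hook-script environment (hypothetical helper, not vdsm code).

    Inherited variables are kept only when both name and value are ASCII;
    custom properties are always included, encoded as UTF-8 bytes.
    """
    def is_ascii(text):
        try:
            text.encode('ascii')
        except UnicodeEncodeError:
            return False
        return True

    script_env = {}
    for key, value in base_env.items():
        if is_ascii(key) and is_ascii(value):
            script_env[key] = value.encode('ascii')
    # Custom properties are an explicit API, so they are never filtered out.
    for key, value in custom_properties.items():
        script_env[key] = value.encode('utf-8')
    return script_env


# Usage with made-up values: OWNER_NAME contains a non-ASCII character
# and is dropped, while the custom property is kept.
sample_env = build_script_env(
    {'PATH': '/usr/bin', 'OWNER_NAME': 'Siv\u00e1k'},
    {'customProperty': 'caf\u00e9'})
```

In practice `base_env` would be `os.environ.copy()`. Silently dropping variables is a design choice; logging the skipped names may make surprises easier to debug.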
_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/vdsm-devel
