[Bug 989496] Re: UnicodeDecodeError during backup due to non-utf8 translation

Milan Bouchet-Valat Thu, 08 Aug 2013 02:26:17 -0700

I think I have found a fix.

The bug is not limited to invalid UTF-8 filenames: all you need is
UTF-8 filenames and a UTF-8 locale.


For example, in collections.py:810, there is:
    log.Debug(_("File %s is not part of a known set; creating new set") % (filename,))

On my system, when this fails (see the traceback below), the _() string is a
str object encoded in UTF-8, while filename is a unicode object. The error
happens while Python encodes filename into an ASCII str object. If the _()
string is a unicode object too, no encoding into a str object happens at this
stage, and everything works. This can be achieved by setting gettext up
differently in __init__.py, by passing unicode=True to gettext.install(). This
is the solution recommended by the author of gettext for Python:
http://www.wefearchange.org/2012/06/the-right-way-to-internationalize-your.html
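
As a rough sketch of that setup (not the attached patch): the helper name
install_unicode_gettext and the "duplicity" domain below are assumptions made
for illustration. On Python 2 it passes unicode=True so that _() returns
unicode objects; on Python 3 that parameter no longer exists and _() already
returns text, so the call is made without it.

```python
import gettext
import sys


def install_unicode_gettext(domain, localedir=None):
    """Install _() into builtins so that it returns unicode text."""
    if sys.version_info[0] < 3:
        # Python 2: ask gettext for unicode objects instead of byte strings.
        gettext.install(domain, localedir, unicode=True)
    else:
        # Python 3: _() always returns str (text); there is no unicode flag.
        gettext.install(domain, localedir)


# "duplicity" is assumed to be the translation domain here.
install_unicode_gettext("duplicity")

# Both operands are now text, so no implicit ASCII conversion takes place.
filename = u"r\xe9sum\xe9.txt"
message = _("File %s is not part of a known set; creating new set") % (filename,)
```

If no catalog is found for the domain, gettext.install() falls back to
returning the untranslated string, so the sketch runs without translation
files present.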

This change requires a few modifications in other places so that only
unicode strings are passed to the logger. I'm attaching a diff of quick
and dirty changes I applied to demonstrate the idea.
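
To illustrate what "only unicode strings" could look like in practice, here is
a minimal sketch. It is not taken from the attached diff; to_unicode and
format_log_message are hypothetical helper names, and UTF-8 is assumed as the
encoding of any byte strings.

```python
def to_unicode(value, encoding="utf8"):
    """Decode byte strings to unicode text; leave everything else unchanged."""
    if isinstance(value, bytes):
        # "replace" keeps the logger from raising on undecodable bytes.
        return value.decode(encoding, "replace")
    return value


def format_log_message(template, *args):
    """Build a log message from text only, so no implicit coercion happens."""
    return to_unicode(template) % tuple(to_unicode(arg) for arg in args)


# Hypothetical inputs: a UTF-8 byte-string template and a unicode filename.
message = format_log_message(b"Fichier %s cr\xc3\xa9\xc3\xa9", u"r\xe9sum\xe9.txt")
```

The idea is to decode once at the boundary, so that everything reaching the
logger is already text.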

Any chance of getting some attention for this bug? It has made duplicity
completely unusable on my system for more than a year.


This is with duplicity 0.6.21 on Fedora 19.

Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1411, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1404, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1257, in main
    action = commandline.ProcessCommandLine(sys.argv[1:])
  File "/usr/lib64/python2.7/site-packages/duplicity/commandline.py", line 981, in ProcessCommandLine
    args = parse_cmdline_options(cmdline_list)
  File "/usr/lib64/python2.7/site-packages/duplicity/commandline.py", line 644, in parse_cmdline_options
    log.Info(_("Using archive dir: %s") % (globals.archive_dir.name,))
  File "/usr/lib64/python2.7/site-packages/duplicity/log.py", line 106, in Info
    Log(s, INFO, code, extra)
  File "/usr/lib64/python2.7/site-packages/duplicity/log.py", line 74, in Log
    _logger.log(DupToLoggerLevel(verb_level), s.decode("utf8", "ignore"))
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 16: ordinal not in range(128)
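
The last frames can be reproduced in isolation with a small sketch. On
Python 2, calling .decode() on an object that is already unicode first encodes
it with the default codec (ASCII), which is where the UnicodeEncodeError comes
from. The explicit .encode("ascii") below spells out that hidden step so the
snippet behaves the same on Python 2 and 3. The function name decode_like_log
and the message are made up for illustration.

```python
def decode_like_log(message):
    """Mimic s.decode("utf8", "ignore") applied to an already-unicode string."""
    # Python 2 performs this ASCII encode implicitly before decoding.
    return message.encode("ascii").decode("utf8", "ignore")


# Hypothetical message containing a non-ASCII character (u'\xe9').
message = u"Using archive dir: /home/r\xe9mi/.cache/duplicity"

try:
    decode_like_log(message)
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("Reproduced: %s" % exc)
```

A pure-ASCII message passes through unchanged, which is why the problem only
shows up with non-ASCII filenames or translations.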

** Patch added: "duplicity.patch"
   https://bugs.launchpad.net/ubuntu/+source/duplicity/+bug/989496/+attachment/3764630/+files/duplicity.patch
