[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-11-29 Thread STINNER Victor

STINNER Victor added the comment:

FYI the first release including the fix 78ede2baa146 is Python 3.5.2.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Berker Peksag

Changes by Berker Peksag :


--
resolution:  -> fixed




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Lars Gustäbel

Lars Gustäbel added the comment:

Sorry for the glitch, I suppose everything works fine now.

--
status: open -> closed




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 78ede2baa146 by Lars Gustäbel in branch '3.5':
Issue #24838: Fix test_tarfile.py for non-utf8 filesystem encodings.
https://hg.python.org/cpython/rev/78ede2baa146

New changeset 08835d1e7a50 by Lars Gustäbel in branch 'default':
Issue #24838: Merge test_tarfile.py fix from 3.5.
https://hg.python.org/cpython/rev/08835d1e7a50

--




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Serhiy Storchaka

Changes by Serhiy Storchaka :


--
nosy: +serhiy.storchaka




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread STINNER Victor

STINNER Victor added the comment:

Tests fail on FreeBSD:

http://buildbot.python.org/all/builders/AMD64%20FreeBSD%209.x%203.5/builds/713/steps/test/logs/stdio

Example:



======================================================================
FAIL: test_unicode_link1 (test.test_tarfile.UstarUnicodeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_tarfile.py", line 1807, in test_unicode_link1
    self._test_ustar_link("0123456789" * 9 + "01234567\xff")
  File "/usr/home/buildbot/python/3.5.koobs-freebsd9/build/Lib/test/test_tarfile.py", line 1826, in _test_ustar_link
    self.assertEqual(name, t.linkname)
AssertionError: '0123[44 chars]89012345678901234567890123456789012345678901234567\xff' != '0123[44 chars]89012345678901234567890123456789012345678901234567\udcc3\udcbf'
- 01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567\xff
?                                                                                                   ^
+ 01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567\udcc3\udcbf
?                                                                                                   ^^
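
One possible reading of the mismatch above, offered as a sketch rather than a confirmed diagnosis: if the link name was written to the archive as UTF-8 and then decoded with an ASCII filesystem encoding and the surrogateescape error handler, the single character '\xff' comes back as two lone surrogates. The choice of "ascii" is an assumption about the buildbot's locale; the traceback itself does not state it.

```python
# Assumption: the archive stores the name as UTF-8, and the reader decodes it
# with an ASCII codec plus the "surrogateescape" error handler.
name = "\xff"

raw = name.encode("utf-8")                          # two bytes: 0xC3 0xBF
decoded = raw.decode("ascii", "surrogateescape")    # each byte -> one surrogate

# Decoding with the encoding that was used for writing restores the name.
restored = raw.decode("utf-8", "surrogateescape")
```

Under that assumption, `decoded` is '\udcc3\udcbf', which matches the right-hand side of the failed assertion, while `restored` equals the original name.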

--
nosy: +haypo
resolution: fixed -> 
status: closed -> open




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Lars Gustäbel

Changes by Lars Gustäbel :


--
resolution:  -> fixed
stage: test needed -> resolved
status: open -> closed
versions:  -Python 3.2, Python 3.3, Python 3.4




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2016-04-19 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d08d6b776694 by Lars Gustäbel in branch '3.5':
Issue #24838: tarfile's ustar and gnu formats now correctly calculate name and
https://hg.python.org/cpython/rev/d08d6b776694

New changeset e281a57d5b29 by Lars Gustäbel in branch 'default':
Issue #24838: Merge tarfile fix from 3.5.
https://hg.python.org/cpython/rev/e281a57d5b29

--
nosy: +python-dev




[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2015-08-14 Thread Lars Gustäbel

Lars Gustäbel added the comment:

Thanks for the detailed report and the patch. I haven't checked yet, but I 
suppose that the entire 3.x branch is affected. The first thing I have to do 
now is to come up with a comprehensive testcase.
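
A regression check for this report could look roughly like the sketch below. It is not the test that was eventually committed; the helper name `roundtrip_name` is hypothetical, and the path is the one from the original report with its leading slash dropped. It writes a single empty member to an in-memory archive and reads the name back, for both the ustar and GNU formats.

```python
import io
import tarfile


def roundtrip_name(name, fmt, encoding="utf-8"):
    """Write one empty member called `name` and return the name read back."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=fmt,
                      encoding=encoding) as tar:
        tar.addfile(tarfile.TarInfo(name))  # size 0, so no fileobj is needed
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r", encoding=encoding) as tar:
        return tar.getnames()[0]


# 100 characters, 101 bytes in UTF-8 ("\u00f3" is the two-byte character).
name = ("gt-education/Colecci\u00f3n Educativa Guatemala/thumbs/"
        "Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg")

results = {fmt: roundtrip_name(name, fmt)
           for fmt in (tarfile.USTAR_FORMAT, tarfile.GNU_FORMAT)}
```

On an interpreter that includes the fix, both entries in `results` should equal `name`; on an affected version the stored name would be expected to lose its final character.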

--
assignee:  -> lars.gustaebel
components: +Library (Lib)
nosy: +lars.gustaebel
stage:  -> test needed
versions: +Python 3.2, Python 3.3, Python 3.4, Python 3.6

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24838
___



[issue24838] tarfile.py: fix GNU and USTAR formats to properly handle paths with special characters that are encoded with more than one byte each

2015-08-10 Thread Roddy Shuler

New submission from Roddy Shuler:

GNU and USTAR formats use a special case if the file path is longer than 100 
bytes. The detection for this, though, incorrectly checks for 100 characters 
rather than 100 bytes. So, if the path is close to but does not exceed 100 
characters and includes special characters such that the encoded length is 
greater than 100 bytes, the encoded string is truncated to 100 bytes, and the 
file name stored in the tar file is cut short.

For example...

/gt-education/Colección Educativa Guatemala/thumbs/Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg

is truncated as:

/gt-education/Colección Educativa Guatemala/thumbs/Libro de Texto Comunicacion y Lenguaje 1 Grado.jp

The attached patch fixes this.  Initially found on Python 3.3.  Patch is tested 
on Linux with version 3.4.3-6 from Debian.  Looking at the source code, I am 
pretty confident that the problem still exists upstream in Python 3.5.
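
The difference between the two length checks can be illustrated with the example path. This is a minimal sketch and not the code from the attached patch: the helper `fits_name_field` and the constant `LENGTH_NAME` are introduced here for illustration, and the leading slash is left off on the assumption that it is stripped from the member name before the header is built.

```python
LENGTH_NAME = 100  # size in bytes of the name field in a ustar/GNU header


def fits_name_field(path, encoding="utf-8", errors="surrogateescape"):
    """Return True if the *encoded* path fits in the 100-byte name field."""
    return len(path.encode(encoding, errors)) <= LENGTH_NAME


# Example path from this report; "\u00f3" encodes to two bytes in UTF-8.
path = ("gt-education/Colecci\u00f3n Educativa Guatemala/thumbs/"
        "Libro de Texto Comunicacion y Lenguaje 1 Grado.jpg")

char_len = len(path)                    # what the faulty check compared
byte_len = len(path.encode("utf-8"))    # what the header field actually holds

# Cutting the encoded name at 100 bytes drops the last byte, the final "g".
truncated = path.encode("utf-8")[:LENGTH_NAME].decode("utf-8")
```

Here `char_len` is within the limit while `byte_len` is one over it, so a character-based check lets the path through and the last byte is lost, which gives the "Grado.jp" ending shown above.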

--
files: fix-tarfile-path-truncation.patch
keywords: patch
messages: 248363
nosy: Roddy Shuler
priority: normal
severity: normal
status: open
title: tarfile.py: fix GNU and USTAR formats to properly handle paths with 
special characters that are encoded with more than one byte each
type: behavior
versions: Python 3.5
Added file: http://bugs.python.org/file40157/fix-tarfile-path-truncation.patch
