[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2022-04-07 Thread Semyon


Change by Semyon :


--
nosy: +MarSoft

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2021-12-12 Thread Irit Katriel


Change by Irit Katriel :


--
versions: +Python 3.11 -Python 3.3, Python 3.4




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2021-12-03 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

I forgot to add this:

 - we may not want to follow the behavior of command-line unzip: it is 
interactive, so the considerations are somewhat different. For example, it 
warns if a file is about to be overwritten and prompts whether to skip, 
overwrite, abort, etc.

--




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2021-12-03 Thread Andrei Kulakov


Andrei Kulakov  added the comment:

I think it may be good enough to add a warning on skipped files in 
_unpack_zipfile().

 - this way we keep backwards compatibility (especially since behavior in both 
modules differed for such a long time.)

 - it's not clear that ZipFile behavior is superior -- for example, what if a 
file with stripped path components overwrites existing files?

 - if requested in the future, a parameter can be added to enable ZipFile-like 
behavior

 - it can be very confusing if files are silently skipped, especially if an 
archive has thousands of files. 

I've added a PR. Note that the test in the PR also checks that files with '..' 
are indeed skipped; we don't have a test for that now, so that's an added 
benefit.
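The warning idea can be tried out without touching shutil. The following is a minimal sketch, not the code from the PR: the helper names `members_matching_skip_rule` and `warn_on_skipped` are hypothetical, and the skip rule (a leading '/' or any '..' in the member name) is assumed from the description in this thread, so check it against the shutil source of your Python version.

```python
import os
import tempfile
import warnings
import zipfile


def members_matching_skip_rule(archive_path):
    """List member names that match the skip rule described in this thread.

    Assumption: a member is skipped when its name starts with '/' or
    contains '..' anywhere. This mirrors the thread's description only.
    """
    with zipfile.ZipFile(archive_path) as zf:
        return [name for name in zf.namelist()
                if name.startswith("/") or ".." in name]


def warn_on_skipped(archive_path):
    """Emit one RuntimeWarning per member that would likely be skipped."""
    skipped = members_matching_skip_rule(archive_path)
    for name in skipped:
        warnings.warn(f"{archive_path}: member {name!r} would be skipped",
                      RuntimeWarning, stacklevel=2)
    return skipped


if __name__ == "__main__":
    # Build a throwaway archive with one ordinary and one '..' member.
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "demo.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("ok.txt", "kept")
            zf.writestr("../up.txt", "flagged")
        print(warn_on_skipped(archive))
```

Calling such a check before shutil.unpack_archive would at least surface the members that are dropped, which is the main complaint in this issue.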

--




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2021-12-03 Thread Andrei Kulakov


Change by Andrei Kulakov :


--
nosy: +andrei.avk
nosy_count: 6.0 -> 7.0
pull_requests: +28135
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/29910




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2021-07-06 Thread huantian


huantian  added the comment:

This issue is somewhat old, but I think either applying the patch attached 
earlier or noting this difference in behavior in the documentation would be 
ideal, as it can be quite difficult to debug shutil.unpack_archive silently 
skipping files. If the files you expect to unpack contain '..' for 
non-malicious purposes, this makes shutil.unpack_archive unsuitable for the 
situation.

--
nosy: +huantian




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-20 Thread Peter Santoro

Peter Santoro added the comment:

It seems clear to me that the logic in shutil._unpack_zipfile that silently 
skips paths that start with '/' (indicates absolute path) or that contain 
references to the parent directory ('..') was added to prevent malicious zip 
files from making potentially malicious/unwanted modifications to the filesystem 
(perhaps at a time when zipfile did not itself contain such logic).  This 
conservative approach works, but it can have unexpected results.  For example, 
if all entries in a zip file contain these invalid characters, then 
shutil._unpack_zipfile appears to do nothing (i.e. the zip file is not 
unpacked).  This is good (except for the silent part), if the zip file is truly 
malicious.  However, I recently had to deal with thousands of zip files created 
by well known software vendors where hundreds of the zip files were created 
incorrectly and contained these invalid characters.  These files were not 
malicious, but they were created improperly. Note that shutil._unpack_zipfile 
silently failed to unzip these files, but by using ZipFile.extractall I could 
unzip them.

It appears that most unzipping software today either ignores (sometimes 
silently) potentially malicious zip entries (e.g. Windows 7 Explorer displays 
an invalid zip file error) or it attempts to filter out/replace known bad 
characters so that the zip entries can be extracted (e.g. WinZip, gnu unzip).  
I created this issue because the Python library uses both approaches, which may 
need rethinking.

The newer logic in ZipFile._extract_member, which is used by 
ZipFile.extractall, takes a different approach.  Instead of silently ignoring 
potentially malicious zip entries, it attempts to filter out or replace known 
invalid characters before extracting the zip entries.

From the Python zipfile docs:
---
If a member filename is an absolute path, a drive/UNC sharepoint and leading 
(back)slashes will be stripped, e.g.: ///foo/bar becomes foo/bar on Unix, and 
C:\foo\bar becomes foo\bar on Windows. And all .. components in a member 
filename will be removed, e.g.: ../../foo../../ba..r becomes foo../ba..r. On 
Windows illegal characters (:, <, >, |, ", ?, and *) replaced by underscore (_).
---

As ZipFile._extract_member filters out/replaces more invalid characters than 
shutil._unpack_zipfile handles, one could argue that the (apparently older) 
approach used by shutil._unpack_zipfile is less safe.
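The contrast between the two rules can be sketched in a few lines. The helper names below (`sanitized_name`, `skipped_by_shutil_rule`) are hypothetical and only approximate the behavior described in this message and in the quoted zipfile docs for Unix-style names; they are not the library's own code, and they ignore the Windows drive and illegal-character handling.

```python
def sanitized_name(member_name):
    """Approximate the documented zipfile rule for Unix-style names:
    drop empty, '.' and '..' path components and keep everything else."""
    parts = member_name.split("/")
    return "/".join(p for p in parts if p not in ("", ".", ".."))


def skipped_by_shutil_rule(member_name):
    """Approximate the skip rule described for shutil._unpack_zipfile:
    a leading '/' or any occurrence of '..' means the member is skipped."""
    return member_name.startswith("/") or ".." in member_name


if __name__ == "__main__":
    for name in ("///foo/bar", "../../foo../../ba..r", "docs/readme.txt"):
        print(f"{name!r}: extract as {sanitized_name(name)!r}, "
              f"skipped by shutil rule: {skipped_by_shutil_rule(name)}")
```

Under these assumptions a name such as `ba..r` is skipped by the substring test even though it contains no parent-directory component, which is one way harmless archives end up silently unextracted.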

The approaches used by shutil._unpack_zipfile and ZipFile.extractall to deal 
with potentially malicious zip file entries are different.  This issue could be 
closed if not deemed important by the Python core developers or it could be 
handled by documentation and/or coding changes.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue20907
___



[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-20 Thread R. David Murray

R. David Murray added the comment:

Note that unix unzip does exactly the same thing as zipfile extractall (except 
that it does issue warnings), and I believe this is considered best practice 
these days for extraction tools: strip out absolute/relative path components 
and extract to the destination directory.

--




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-19 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Well, you have to assume the code you're removing is there for a reason (e.g. 
perhaps this is meant to protect from attacks when opening a zip file uploaded 
by a user). 

I'd like to hear from Éric on this.

--
nosy: +eric.araujo, pitrou




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-19 Thread R. David Murray

R. David Murray added the comment:

First step would be to get rid of the warning in the zipfile docs and replace 
it with a note that a leading '/' and any '..' path components are silently 
stripped before the file is extracted.

It would also be worth adding an enhancement to zipfile to optionally not do it 
silently.

I hope the same considerations apply to tarfile, but I haven't checked.

In other words, I do think that code is a holdover from when zipfile *wasn't* 
safe, but since I didn't write it I don't know for sure.

--
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python, r.david.murray
stage: test needed -> needs patch




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-19 Thread Éric Araujo

Éric Araujo added the comment:

shutil.unpack_archive was extracted from distutils by Tarek.  I can do some 
Mercurial archaeology to find out more about the behaviour.

--




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-13 Thread Peter Santoro

New submission from Peter Santoro:

In Python 3.3.1, ZipFile.extractall was enhanced to better handle absolute 
paths and illegal characters.  The corresponding logic in 
shutil._unpack_zipfile essentially skips zip members with these issues.

If a zip file contains all absolute paths, ZipFile.extractall works as expected 
(i.e. the zip file is unpacked), but shutil._unpack_zipfile (normally called 
indirectly via shutil.unpack_archive) appears to do nothing (i.e. it silently 
fails to unpack the zip file).
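The difference can be reproduced with a throwaway archive. This is a minimal sketch, not the test script attached to this issue: the member names and the helpers `build_archive`, `list_files` and `compare` are made up for illustration. No expected output is shown for shutil.unpack_archive because it may differ between Python versions.

```python
import os
import shutil
import tempfile
import zipfile


def build_archive(path):
    """Write a small zip whose member names mimic the problem cases."""
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("ok.txt", "plain member")
        zf.writestr("/abs/file.txt", "absolute path member")
        zf.writestr("../up.txt", "parent-directory member")


def list_files(root):
    """Return sorted file paths under root, relative to root, using '/'."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def compare(workdir):
    """Extract the same archive both ways and return the two file lists."""
    archive = os.path.join(workdir, "demo.zip")
    build_archive(archive)

    via_zipfile = os.path.join(workdir, "via_zipfile")
    via_shutil = os.path.join(workdir, "via_shutil")
    os.mkdir(via_zipfile)
    os.mkdir(via_shutil)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(via_zipfile)
    shutil.unpack_archive(archive, via_shutil, format="zip")

    return list_files(via_zipfile), list_files(via_shutil)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        from_zipfile, from_shutil = compare(tmp)
        print("ZipFile.extractall   :", from_zipfile)
        print("shutil.unpack_archive:", from_shutil)
```

With the behavior described in this report, the first list should contain all three members under sanitized names, while the second should lack the absolute and '..' members.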

The attached patch attempts to unify the behavior of extracting zip files 
between shutil.unpack_archive and ZipFile.extractall.

--
components: Library (Lib)
files: shutil.diff
keywords: patch
messages: 213374
nosy: pe...@psantoro.net
priority: normal
severity: normal
status: open
title: behavioral differences between shutil.unpack_archive and 
ZipFile.extractall
type: behavior
versions: Python 3.3, Python 3.4
Added file: http://bugs.python.org/file34393/shutil.diff




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-13 Thread Éric Araujo

Changes by Éric Araujo mer...@netwok.org:


--
stage:  -> test needed




[issue20907] behavioral differences between shutil.unpack_archive and ZipFile.extractall

2014-03-13 Thread Peter Santoro

Peter Santoro added the comment:

I've attached a zip file which contains a test script and test zip files for 
the previously submitted Python 3.3.5 patch.  See the included README.txt for 
more information.  To view the contents of the included bad.zip file, use the 
following command:

 unzip -l bad.zip

--
Added file: http://bugs.python.org/file34408/test_unpack.zip
