[issue6196] tarfile.extractall(readaccess=True)

2009-06-06 Thread Lars Gustäbel

Lars Gustäbel l...@gustaebel.de added the comment:

I close this issue then.

--
resolution:  -> rejected
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue6196
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue6196] tarfile.extractall(readaccess=True)

2009-06-05 Thread Lars Gustäbel

Lars Gustäbel l...@gustaebel.de added the comment:

I am still not convinced why tarfile needs this kind of a work-around
built in. We talk about a very small number of cases here and the
generator_tools-0.3.5.tar.gz is really broken beyond repair. It is the
only thing that should be fixed here IMO ;-)
I agree with David here. It is easy to manipulate the tarfile in
advance, as you have shown yourself. The performance argument does not
convince me either.

--
assignee:  -> lars.gustaebel
nosy: +lars.gustaebel




[issue6196] tarfile.extractall(readaccess=True)

2009-06-05 Thread Sridhar Ratnakumar

Sridhar Ratnakumar sridh...@activestate.com added the comment:

[Lars] (...) We talk about a very small number of cases here and the
generator_tools-0.3.5.tar.gz is really broken beyond repair. It is the
only thing that should be fixed here IMO ;-)

Sure, that is what the pyopenssl folks did: they fixed their tarball.
However, it is reasonable to expect certain tarballs to be 'broken beyond
repair' when you are running tarfile.extractall over a huge number of
tarballs such as the ones in PyPI.

Indeed, the tarfile module already has several fixes for such 'broken'
cases, viz.:

[quote]'If ignore_zeros is False, treat an empty block as the end of the
archive. If it is True, skip empty (and invalid) blocks and try to get
as many members as possible. This is only useful for reading
concatenated or **damaged** archives.'[endquote] [emphasis added]
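
The ignore_zeros behaviour quoted above can be tried on two archives
glued together in memory. This is only a minimal sketch: the helper names
make_archive and member_names are made up for the illustration, and only
tarfile.open() and its ignore_zeros argument come from the library.

```python
import io
import tarfile


def make_archive(name, payload):
    """Build a one-member tar archive in memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_names(data, ignore_zeros):
    """List member names, stopping at or skipping empty blocks."""
    source = io.BytesIO(data)
    with tarfile.open(fileobj=source, ignore_zeros=ignore_zeros) as tar:
        return tar.getnames()


# Two complete archives back to back: the first one ends with empty
# blocks, which normally mark the end of the archive.
joined = make_archive("a.txt", b"first") + make_archive("b.txt", b"second")
print(member_names(joined, ignore_zeros=False))
print(member_names(joined, ignore_zeros=True))
```

With the default setting, reading stops at the first empty block; with
ignore_zeros=True the members of the second archive are picked up as well.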

[quote]'(...)Directory information like owner, modification time and
permissions are set after all members have been extracted. This is done
to work around two problems: A directory’s modification time is reset
each time a file is created in it. And, if a directory’s **permissions
do not allow writing**, extracting files to it will fail'[endquote]
[emphasis added]

[Lars] I agree with David here. It is easy to manipulate the tarfile in
advance, as you have shown yourself. The performance argument does not
convince me either.

Ok. Can you comment on this argument?

[quote]'(...)the very reason to write a program to extract tarball
(instead of doing it manually) is to automate it .. which automation is
*more effective and simple* if ``extractall`` had a flag such as
readaccess=True'[endquote] (emphasis added)

[quote]'I just think it is not simple (as in, keeping the code off from
such hacks that are tangential to the problem being solved) and
effective (as in, not having to deal with potential unintended side
effects like bugs in the post-fix chmoding or in the pre-fix tarinfo
mode modifications).'[endquote]

--




[issue6196] tarfile.extractall(readaccess=True)

2009-06-05 Thread Lars Gustäbel

Lars Gustäbel l...@gustaebel.de added the comment:

Sure, tarfile contains numerous work-arounds for quirky and buggy
archives. Otherwise, it would not be usable in real-life.

But we should not mix up different issues here. tarfile reads and
extracts your generator_tools.tar just fine. Formally, the data is okay.
It's the stored information that is useless and that you are not happy
with. But as we both agree it is rather simple to fix this information
in advance:

import tarfile
tar = tarfile.open("generator_tools-0.3.5.tar.gz")
for t in tar:
    # Give every member sane permissions before extraction.
    if t.isdir():
        t.mode = 0o755
    else:
        t.mode = 0o644
tar.extractall()
tar.close()

Sure, there is some functionality in extractall() that addresses issues
with inappropriate permissions, but without this functionality the
archive would not even *extract* cleanly. That is very different from
your problem.

In my opinion, the code above illustrates quite well, that tarfile was
designed to be high-level and flexible at the same time. Make use of
that. I honestly think that extractall() can do well without a
readaccess argument.

--




[issue6196] tarfile.extractall(readaccess=True)

2009-06-05 Thread Sridhar Ratnakumar

Sridhar Ratnakumar sridh...@activestate.com added the comment:

[Lars] Sure, there is some functionality in extractall() that addresses
issues with inappropriate permissions, but without this functionality the
archive would not even *extract* cleanly. That is very different from
your problem.

Fair enough.

'Tis time to create a PyPI package out of my high-level wrapper:

  http://gist.github.com/124597

--




[issue6196] tarfile.extractall(readaccess=True)

2009-06-04 Thread Sridhar Ratnakumar

Sridhar Ratnakumar sridh...@activestate.com added the comment:

Here's some test data from PyPI:
http://pypi.python.org/packages/source/g/generator_tools/generator_tools-0.3.5.tar.gz

--




[issue6196] tarfile.extractall(readaccess=True)

2009-06-04 Thread Sridhar Ratnakumar

Sridhar Ratnakumar sridh...@activestate.com added the comment:

Considering this bug where tarfile fails to set g+s,

  https://bugs.launchpad.net/pyopenssl/+bug/236190

a more general approach could be:

  tarfile.extractall(safe_perms=True)

where if safe_perms is set, tarfile can 1) ignore +/-s on all files, 2)
ignore u-x on directories and 3) ignore u-r on files.
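
The three rules can be sketched as a small mode filter applied to the
members before extraction. This is an illustration only: safe_perms is
the flag proposed here, not an existing tarfile argument, and the names
safe_mode and apply_safe_perms are hypothetical helpers.

```python
import stat


def safe_mode(mode, is_dir):
    """Return ``mode`` adjusted by the three proposed safe_perms rules."""
    # 1) Drop setuid/setgid so extraction never has to apply +s.
    mode &= ~(stat.S_ISUID | stat.S_ISGID)
    if is_dir:
        # 2) Ignore u-x on directories: the owner can always enter them.
        mode |= stat.S_IXUSR
    else:
        # 3) Ignore u-r on files: the owner can always read them.
        mode |= stat.S_IRUSR
    return mode


def apply_safe_perms(tar):
    """Rewrite member modes in place; call before tar.extractall()."""
    for member in tar.getmembers():
        member.mode = safe_mode(member.mode, member.isdir())
```

Keeping the bit arithmetic in its own function makes the rules easy to
check without touching the file system.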

--




[issue6196] tarfile.extractall(readaccess=True)

2009-06-04 Thread Sridhar Ratnakumar

New submission from Sridhar Ratnakumar sridh...@activestate.com:

If a tarball has a-x perms set on its root directory, one cannot access
its contents. 

$ tar zxf generator_tools-0.3.5.tar.gz
$ ls generator_tools-0.3.5/
ls: cannot access generator_tools-0.3.5/README.txt: Permission denied
...
sridh...@double:/tmp/i$ 

This is fine for GNU tar (the user can always do a chmod +x later). But
for the tarfile library, it would be better to have a flag such as
readaccess=True that will force ``extractall`` to enforce *minimum*
permissions required for the basic read access. This means, tarfile
would ignore u-x on directories and u-r on files.

The reason I make this feature request (instead of working around the
issue myself in a verbose way) is that the very reason to write a
program to extract tarball (instead of doing it manually) is to automate
it .. which automation is more effective and simple if ``extractall``
had a flag such as readaccess=True.

--
components: Library (Lib)
messages: 88908
nosy: srid
severity: normal
status: open
title: tarfile.extractall(readaccess=True)
type: feature request
versions: Python 2.6, Python 2.7, Python 3.0, Python 3.1




[issue6196] tarfile.extractall(readaccess=True)

2009-06-04 Thread R. David Murray

R. David Murray rdmur...@bitdance.com added the comment:

I don't see why the tarfile case should be different from the tar case.
You can always chmod it later in Python, too (with os.walk and
os.chmod).  Perhaps the real need is for a recursive chmod in shutil?
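
A recursive chmod along those lines might look like the sketch below. The
name add_read_access is hypothetical (shutil has no such function), and
the choice of owner bits (r+x on directories, r on files) is an
assumption about what "read access" should mean.

```python
import os
import stat


def add_read_access(top):
    """Recursively grant the owner read access to everything below top."""
    # Directories need r+x to be listed and entered; files only need r.
    dir_bits = stat.S_IRUSR | stat.S_IXUSR
    file_bits = stat.S_IRUSR

    def grant(path, bits):
        # os.chmod follows symlinks, so leave links alone.
        if os.path.islink(path):
            return
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode | bits)

    # Fix the root before walking, otherwise it cannot be listed at all.
    grant(top, dir_bits)
    for root, dirs, files in os.walk(top):
        # Top-down walk: each subdirectory is fixed before os.walk
        # descends into it.
        for name in dirs:
            grant(os.path.join(root, name), dir_bits)
        for name in files:
            grant(os.path.join(root, name), file_bits)
```

Existing permission bits are kept; the function only adds the owner bits
that are missing.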

--
nosy: +r.david.murray
priority:  -> low




[issue6196] tarfile.extractall(readaccess=True)

2009-06-04 Thread Sridhar Ratnakumar

Sridhar Ratnakumar sridh...@activestate.com added the comment:

[David] I don't see why the tarfile case should be different from the
tar case. (...)

As I explained, viz.:

[quote]'(...)the very reason to write a program to extract tarball
(instead of doing it manually) is to automate it .. which automation is
*more effective and simple* if ``extractall`` had a flag such as
readaccess=True'[endquote] (emphasis added)

[David] You can always chmod it later in python, too (with os.walk and
os.chmod). (...)

Of course, I can. Or:

# f is an already opened tarfile.TarFile
EXECUTE = 0o100  # u+x
READ = 0o400     # u+r
dir_perm = EXECUTE
file_perm = EXECUTE | READ
for tarinfo in f.getmembers():
    tarinfo.mode |= (dir_perm if tarinfo.isdir() else file_perm)

As you can see, for a tarfile with a huge list of files, this can be a
performance issue.

[David] (...) Perhaps the real need is for a recursive chmod in shutil?

The real need is to fix the weird permissions on some tarballs (such as
generator_tools-0.3.5.tar.gz in PyPI and the above mentioned pyopenssl
tarball).

This need usually leads to designing workarounds. 

I just think it is not simple (as in, keeping the code off from such
hacks that are tangential to the problem being solved) and effective (as
in, not having to deal with potential unintended side effects like bugs
in the post-fix chmoding or in the pre-fix tarinfo mode modifications).

Hence the feature request.

--
