-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,
once upon a time there was a known vulnerability in tar (CVE-2001-1267,
[1]), and while tar itself was fixed long ago, Python's tarfile module is
affected too.

The vulnerability goes basically like this: if you create an archive
containing a file named "../../../../../etc/passwd" and then get the admin
to untar it, /etc/passwd gets overwritten.
Another variety of this bug involves symlinks: if the archive contains
entries like:
./aaaa-directory -> /etc
./aaaa-directory/passwd
then the "aaaa-directory" symlink is created first, and /etc/passwd
is overwritten once again.
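
To make the two layouts concrete, here is a small sketch that builds such an archive in memory and lists its members without extracting anything. The helper names build_demo_archive and list_members are made up for illustration; only standard tarfile calls are used, and the code is written for a current Python 3 rather than the 2.x tree the patch targets.

```python
import io
import tarfile


def build_demo_archive():
    """Return the bytes of a tar archive containing both hostile layouts."""
    buf = io.BytesIO()
    data = b"evil\n"
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Case 1: the member name climbs out of the extraction directory.
        info = tarfile.TarInfo("../../../../../etc/passwd")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

        # Case 2: a symlink pointing at /etc ...
        link = tarfile.TarInfo("./aaaa-directory")
        link.type = tarfile.SYMTYPE
        link.linkname = "/etc"
        tar.addfile(link)

        # ... followed by a regular file that appears to live inside it.
        info = tarfile.TarInfo("./aaaa-directory/passwd")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_members(archive_bytes):
    """Return (name, is_symlink, link_target) for every member; nothing is extracted."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return [(m.name, m.issym(), m.linkname) for m in tar.getmembers()]


if __name__ == "__main__":
    for entry in list_members(build_demo_archive()):
        print(entry)
```

Listing the members this way is harmless; the problem only appears once such an archive is actually extracted.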

I was wondering how to fix it.
The symlink problem obviously applies only to the extractall() method and is
easily fixed by delaying the creation of external (or possibly all) symlinks,
similar to how directory attributes are delayed now.
I've attached a draft of the patch; if you like it, I'll polish it.
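
Separately from the attached patch, the delaying idea can be sketched as a standalone ordering function. The names is_external_symlink and extraction_order are hypothetical and not part of tarfile; the sketch is written for Python 3 and uses the linkname attribute. It only decides the order in which members would be extracted and touches no files.

```python
import tarfile


def is_external_symlink(member):
    """True for a symlink whose target is absolute or starts with '../'."""
    return member.issym() and (
        member.linkname.startswith("../") or member.linkname.startswith("/")
    )


def extraction_order(members):
    """Return the members with external symlinks moved to the end.

    The delayed symlinks are sorted by name in reverse order, mirroring
    the sort-then-reverse step in the draft patch.
    """
    members = list(members)
    normal = [m for m in members if not is_external_symlink(m)]
    delayed = sorted(
        (m for m in members if is_external_symlink(m)),
        key=lambda m: m.name,
        reverse=True,
    )
    return normal + delayed


if __name__ == "__main__":
    link = tarfile.TarInfo("./aaaa-directory")
    link.type = tarfile.SYMTYPE
    link.linkname = "/etc"
    victim = tarfile.TarInfo("./aaaa-directory/passwd")
    print([m.name for m in extraction_order([link, victim])])
```

With this ordering the regular file is handled before the symlink exists, so it can no longer be written through the link.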

The traversal problem is harder, and it applies to the extract() method as well.
For extractall() alone, I would use something like:

# extract only members whose names do not start by leaving the target directory
if not tarinfo.name.startswith('../'):
    self.extract(tarinfo, path)
else:
    warnings.warn("non-local file skipped: %s" % tarinfo.name,
                  RuntimeWarning, stacklevel=1)

For extract(), I am not sure. Maybe it should raise an exception when it
encounters such a file, and have a special option to extract such files
anyway. Or maybe it should be left alone altogether.
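
One possible shape for the "raise unless explicitly allowed" option is sketched below. Everything here is an assumption for illustration: NonLocalMemberError, check_member_name and the allow_nonlocal flag do not exist in tarfile. The check is purely name-based, is written for Python 3.5 or later, and does nothing about the symlink variant.

```python
import os


class NonLocalMemberError(Exception):
    """Hypothetical exception for members that would land outside the target."""


def check_member_name(name, path=".", allow_nonlocal=False):
    """Return the destination path for `name`, or raise if it escapes `path`.

    Absolute names and names containing enough '..' components to leave
    `path` count as non-local. Passing allow_nonlocal=True is the opt-in.
    """
    base = os.path.abspath(path)
    target = os.path.abspath(os.path.join(base, name))
    # commonpath() compares whole path components, so "/tmp/x-evil" is not
    # mistaken for something inside "/tmp/x".
    inside = os.path.commonpath([base, target]) == base
    if not inside and not allow_nonlocal:
        raise NonLocalMemberError("non-local file: %s" % name)
    return target


if __name__ == "__main__":
    print(check_member_name("docs/readme.txt", "unpacked"))
    try:
        check_member_name("../../../../../etc/passwd", "unpacked")
    except NonLocalMemberError as exc:
        print("refused:", exc)
```

Resolving the joined path, rather than only testing for a leading '../', also catches names such as "a/../../b" and absolute names.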

Any suggestions are welcome.

regards
jan matejek

[1] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2001-1267
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.4-svn0 (GNU/Linux)
Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org

iD8DBQFGzxcpjBrWA+AvBr8RAlduAKCk0iiSoBF+wA9xgXmDlpWsECZ7KgCfQORg
lZ85inT1FGwhGqBfxJvCGGU=
=TiWx
-----END PGP SIGNATURE-----
--- Lib/tarfile.py
+++ Lib/tarfile.py
@@ -1503,6 +1503,7 @@
            list returned by getmembers().
         """
         directories = []
+        symlinks = []
 
         if members is None:
             members = self
@@ -1516,6 +1517,9 @@
                 except EnvironmentError:
                     pass
                 directories.append(tarinfo)
+            elif tarinfo.issym() and (tarinfo.linkpath.startswith('../') or tarinfo.linkpath.startswith('/')):
+                # external symlink is delayed
+                symlinks.append(tarinfo)
             else:
                 self.extract(tarinfo, path)
 
@@ -1536,6 +1540,12 @@
                 else:
                     self._dbg(1, "tarfile: %s" % e)
 
+        # Handle external symlinks
+        symlinks.sort(lambda a, b: cmp(a.name, b.name))
+        symlinks.reverse()
+        for tarinfo in symlinks:
+            self.extract(tarinfo, path)
+
     def extract(self, member, path=""):
         """Extract a member from the archive to the current working directory,
            using its full name. Its file information is extracted as accurately
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev