[Python-Dev] Non-stable pyc results on python 3.6

2017-07-27 Thread jan matejek
hello,
we're seeing strange problems when trying to do reproducible builds of some
python 3.6 modules.

Namely, from one build to another, there will be something like the following
difference in the compiled object:

 4e40  da 07 5f 5f 61 6c 6c 5f  5f da 0a 5f 5f 61 75 74  |..__all__..__aut|
-4e50  68 6f 72 5f 5f da 07 64  65 63 69 6d 61 6c 72 0c  |hor__..decimalr.|
+4e50  68 6f 72 5f 5f 5a 07 64  65 63 69 6d 61 6c 72 0c  |hor__Z.decimalr.|
 4e60  00 00 00 72 43 00 00 00  72 08 00 00 00 72 41 00  |...rC...r....rA.|

This specific one is in the top-level co_names segment. Both bytes encode
TYPE_SHORT_ASCII_INTERNED: 0x5a is the plain type code and 0xda is the same
code with FLAG_REF set.
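
For reference, the relationship between the two bytes can be checked with a
few lines of Python. This is only a sketch: the constants are copied from
CPython's Python/marshal.c, and split_type_byte is a hypothetical helper name,
not part of the marshal module.

```python
# Constants as defined in CPython's Python/marshal.c.
FLAG_REF = 0x80
TYPE_SHORT_ASCII_INTERNED = ord("Z")  # 0x5a


def split_type_byte(byte):
    """Split a marshal type byte into (type code, FLAG_REF set?).

    Hypothetical helper for illustration; not part of the marshal module.
    """
    return byte & ~FLAG_REF & 0xFF, bool(byte & FLAG_REF)


for raw in (0x5A, 0xDA):
    code, has_ref = split_type_byte(raw)
    print(hex(raw), chr(code), "FLAG_REF" if has_ref else "no FLAG_REF")
```

Both bytes from the hexdump come out as the same type code and differ only in
the reference flag.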

I'm also seeing off-by-one differences in reference ids, i.e., the number
appearing after TYPE_REF. Not in all cases, but it seems that when a "part" is
affected, all references in that "part" are changed (for some value of "part";
all the knowledge of pycs I have was gained from about an hour of reading
marshal.c). So that seems to imply that there's a reference that is sometimes
included and sometimes not?

This is most often found in __init__.py. Often this affects optimized pycs, but
we can see it in un-optimized ones as well.
The issue is rare -- 99% of all pycs are stable -- but when it occurs, it's
easy to replicate it in the same place. This also happens on different
machines, so that seems to rule out hardware memory errors :)
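
A minimal sketch of how two builds of the same pyc could be compared byte by
byte, in case someone wants to reproduce the comparison. The function names
are made up for this example and the file paths are whatever your two build
trees produce.

```python
def differing_offsets(a, b):
    """Return the offsets at which two byte strings differ.

    If the lengths differ, every offset past the shorter one is reported too.
    """
    shorter = min(len(a), len(b))
    offsets = [i for i in range(shorter) if a[i] != b[i]]
    offsets.extend(range(shorter, max(len(a), len(b))))
    return offsets


def compare_pycs(path_a, path_b):
    """Read two .pyc files and return the offsets at which they differ."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return differing_offsets(fa.read(), fb.read())
```

A single flipped FLAG_REF shows up as one offset, while shifted reference ids
show up as a scattering of offsets further into the file.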

The pycs in question are generated by normal "setup.py build" -> "setup.py
install". It happens on Python 3.6 but not on Python 2.7. I'm not sure about
Python 3.5 because we don't currently use it.

It doesn't seem to depend on hash seed - the instability is observed even with
PYTHONHASHSEED set to zero. What seems to fix it, however, is running the build
on disorderfs, which ensures that the filesystem entries are in the same order.

Any ideas why something like this would happen, and why it would be correlated
with filesystem ordering?

thanks
m.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] please consider changing --enable-unicode default to ucs4

2009-10-05 Thread Jan Matejek


On 20.9.2009 18:42, Antoine Pitrou wrote:
> On Sun, 20 Sep 2009 10:33:23 -0600, Zooko O'Whielacronx wrote:
>
>> By the way, I was investigating this, and discovered an issue on the
>> Mandriva tracker which suggests that they intend to switch to UCS4 in
>> the next release in order to avoid compatibility problems like these.
>
> Trying to use a Fedora or Suse RPM under Mandriva (or the other way
> round) isn't reasonable and is certainly not supported.
> I don't understand why this so-called compatibility problem should be
> taken seriously by anyone.

You're not making sense. No distro is an island - plus, upstream
distributors have a nasty habit of providing RPMs only for Fedora. I
don't see what is bad about improving compatibility in a place where the
setting doesn't hurt one way or the other.
Besides, the more compatibility we achieve now, the easier time we'll
have once python makes it into LSB.

regards
m.

 
> Regards
>
> Antoine.
 
 


Re: [Python-Dev] request for comments - standardization of python's purelib and platlib

2009-08-14 Thread Jan Matejek
On 13.8.2009 21:22, Brett Cannon wrote:
> On Thu, Aug 13, 2009 at 11:23, Jan Matejek jan.mate...@novell.com wrote:
>> 1 - the traditional way
>> purelib = /usr/lib/pythonX.Y/site-packages
>> platlib = /usr/lib(64)/pythonX.Y/site-packages
>
> Why can't pure libraries go into lib64 as well? There is nothing saying that
> a pure Python package won't have a setup.py that installs different files
> based on whether it is for a 32-bit or 64-bit CPython install.

What I'd like to accomplish is to have a pure noarch package that can be
installed unchanged into a 32bit or 64bit (or 256bit) system, and the
respective python would still find the files.
Or, to put it another way, a package that can be installed into a
multiarch system and be recognized by pythons of all architectures
(assuming they are the same version, of course).

If the distutils package installs different pure files for 32bit and
64bit python, then it can't be noarch, so it doesn't matter if it goes
into lib64.

Also, such a package would break this particular scheme - in the situation
where the user installs only the 32bit version of such a package and tries to
run it with 64bit python, it will probably break in some weird way.

Last but not least, I'd argue that if a python-only package installs
different files for different platforms, it is platform-dependent and
therefore not pure ;)


>> 2 - the sharedir way
>> purelib = /usr/share/python/X.Y
>> platlib = /usr/lib(64)/pythonX.Y/site-packages
>
> Now are you proposing that packages that have both Python source and
> extensions be split based on the type of files, or that only pure Python
> packages go to /usr/share/python and any packages that are mixed go into
> lib(64)? If you are proposing the latter this is more reasonable as the
> former will require using .pth files to get import to search both locations
> for files in the same package and that just feels icky to me.

The latter. Assume no change to the normal distutils mechanism, only
setting the default paths. (for now anyway)

regards
m.



[Python-Dev] request for comments - standardization of python's purelib and platlib

2009-08-13 Thread Jan Matejek
Hello,

I'm cross-posting this to distributi...@freedesktop and python-dev,
because the topic is relevant to both groups and should be solved in
cooperation.

The issue:

In Python's default configuration (on linux), both purelib (location for
pure python modules) and platlib (location for platform-dependent binary
extensions) point to $prefix/lib/pythonX.Y/site-packages.
That is no good for two main reasons.

One, python depends on the lib directory. (From a distro's point of
view, prefix is /usr, so let's talk /usr/lib.) Due to this, it's
impossible to install python under /usr/lib64 without heavy patching.
Repeated attempts to get python developers to acknowledge and rectify
the situation have all failed (the common argument is that it would mean
a redesign of distutils and huge parts of whatnot).

Conversely, that also means that multiarch setup (/usr/lib or lib32 with
32bit python and /usr/lib64 with 64bit python) is not possible with
stock python.

Two, the default configuration makes purelib and platlib identical,
which somewhat defeats the purpose of having the distinction in the first place.
You either need to patch the default, or supply some alternate
configuration to take advantage of this feature.
And that's not the end of it - the next step is to make python aware of
two different locations on sys.path, one for purelib and one for
platlib, which is a different story altogether.

As distributors, we like to take advantage of purelib/platlib separation
to package pure python modules as platform-independent (noarch for
rpm-speakers). And that's not easy to do properly.
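
To see where a given interpreter puts the two locations, something like the
following can be used. Note the assumption: it relies on the sysconfig module,
which is in the standard library of later Python versions and did not exist
yet when this message was written. The printed paths depend entirely on the
installation and on any distro patches.

```python
import sysconfig

# sysconfig is in the standard library of later Python versions
# (it did not exist yet when this message was written).
purelib = sysconfig.get_path("purelib")
platlib = sysconfig.get_path("platlib")

print("purelib:", purelib)
print("platlib:", platlib)
print("identical:", purelib == platlib)
```

On an unpatched build the last line will typically report that the two are
identical, which is the situation described above.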

The proposal:

Let's put our heads together and choose good default locations for
purelib and platlib. Then add support to python for recognizing the
locations by default, and possibly leave a note in the FHS that this is the
place.

This is IMO a good first step to making python multiarch-aware, and it
would also help a bit with LSB integration [1].

I've come up with three basic options for the configuration (substitute
/usr with $prefix if you're not a distributor). This list is by no
means comprehensive, it's just what looked reasonable at the time of
writing.

1 - the traditional way
purelib = /usr/lib/pythonX.Y/site-packages
platlib = /usr/lib(64)/pythonX.Y/site-packages

pros:
+ this is already the default for 32bit systems
+ major distributions (including Fedora, Mandriva and now finally
openSUSE too) do this
cons:
- 32bit systems have no separation, poor them!
- with a multiarch setup, /usr/lib is cluttered by both
platform-dependent files for 32bit and platform-independent files shared
by the platforms. Also, 64bit python can pick up 32bit modules. That
doesn't cause problems in practice, but doesn't feel like a clean design.

2 - the sharedir way
purelib = /usr/share/python/X.Y
platlib = /usr/lib(64)/pythonX.Y/site-packages

pros:
+ clean separation of purelib - nice!
+ unheard of - a good place to start anew
cons:
- FHS states that /usr/share is for data. But OTOH, they don't say much
about platform-independent bytecode. We could probably get an exception
for this.
- unheard of - everyone will be surprised

3 - the perl way
purelib = /usr/lib/pythonX.Y
platlib = /usr/lib/pythonX.Y/lib-dynload-(platform-identifier)/site-packages

pros:
+ possibility of multiarch packages that would install pure python parts
into purelib and extensions or accelerators for more platforms at once -
and therefore, possibility to split large modules into
platform-dependent and platform-independent parts and save space on
installation media
+ idea compatibility with perl and ruby, one less install layout to learn
cons:
- completely different from what we have now - would require the most
work from both python developers and distributions

comments?

regards
jan matejek
python packager for SUSE Linux

[1] http://www.linuxfoundation.org/en/LsbPython



Re: [Python-Dev] Python security team

2008-09-29 Thread Jan Matejek
Brett Cannon wrote:
> On Sat, Sep 27, 2008 at 8:54 AM, Victor Stinner
> [EMAIL PROTECTED] wrote:
>> First, I would like to access to these informations. Not only this issue, but
>> all security related issues. I have some knowledges about security and I can
>> help to resolve issues and/or estimate the criticity of an issue.
>
> That would require commit privileges first. Don't know if the group
> requires that a person have a decent amount of time committing to the
> core first (I just joined the list in late July).

commit privileges?
I would be interested in joining the PSRT list too - as a python
maintainer for openSUSE, I think that it would be beneficial for both my
work and yours. And I can imagine that maintainers from other
distributions have a similar opinion on this ;)
And that does not necessarily mean commit privileges, right?

Or is this an issue of trust, where "we trust you enough to make changes
to the core" equals "we also trust you enough to see the security issues"?

regards
jan matejek


Re: [Python-Dev] tarfile and directory traversal vulnerability

2007-08-27 Thread Jan Matejek
Martin v. Löwis wrote:
> I must admit I fail to see the bug. If root untars a file, and that tar
> file contains an instruction to overwrite /etc/passwd, why is an error
> to execute that instruction? Shouldn't root just be more careful when
> untaring files?

GNU tar is not supposed to place files outside its working directory,
unless explicitly told to do so. So this is considered a security
vulnerability.

AFAIK there is no specified behavior and other tars might act
differently. But I think GNU tar behaves correctly in this regard.

Furthermore, the extract() and extractall() documentation says "Extract
(...) from the archive to the *current working directory* or directory
[path]".
So the current behavior is actually inconsistent with the documentation.

>> if not tarinfo.name.startswith('../'):
>>     self.extract(tarinfo, path)
>> else:
>>     warnings.warn("non-local file skipped: %s" % tarinfo.name,
>>                   RuntimeWarning, stacklevel=1)
>
> Ok. You seem to be claiming that the tarfile is incorrect in some
> sense. Can you please point to some spec that says this is an incorrect
> tarfile?

No, the tar file itself is correct, according to POSIX. You can put
anything into a tar. Point is, you should be able to untar any file
'safely'.

> In any case, if you fix what you consider broken, you should do
> it exactly the same way as GNU tar does it (assuming you consider
> GNU tar fixed).

I can do that.
I would propose an optional parameter for extract() and extractall(),
absolutePaths, defaulting to False. When encountering a non-local file,
it would strip the leading slash or the path up to the last '../'
sequence (that is what GNU tar does) and extract the file locally.
Setting absolutePaths to True would restore current behavior (no checks).
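
A rough sketch of the stripping step, to make the proposal concrete. It only
approximates the behavior described above and is not taken from GNU tar or
from tarfile; strip_nonlocal_prefix is a hypothetical name, and the
absolutePaths switch itself is not shown.

```python
def strip_nonlocal_prefix(name):
    """Make a member name local: drop any leading slash and everything up to
    and including the last '..' path component.

    Hypothetical helper; an approximation of the behavior described above.
    """
    parts = name.split("/")
    if ".." in parts:
        # Index of the last ".." component, found by searching from the end.
        last = len(parts) - 1 - parts[::-1].index("..")
        parts = parts[last + 1:]
    # Dropping empty and "." components also removes leading slashes.
    parts = [p for p in parts if p not in ("", ".")]
    return "/".join(parts)


print(strip_nonlocal_prefix("../../../../../etc/passwd"))
print(strip_nonlocal_prefix("/etc/passwd"))
```

Working on whole path components rather than on the raw '../' substring means
a name such as foo../bar is left alone. A trailing slash on a directory name
is dropped as a side effect.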

regards,
jan matejek


Re: [Python-Dev] tarfile and directory traversal vulnerability

2007-08-27 Thread Jan Matejek

Lars Gustäbel wrote:
> Suppose we have:
> foo -> /etc
> foo/passwd
>
> If creation of the foo symlink is delayed, foo/passwd will be
> extracted in a directory foo which will be created implicitly.
> If we create the foo symlink afterwards it will fail because foo
> already exists. The best way would be to completely ignore
> members and link targets that are absolute or outside the
> archive's scope.

GNU tar doesn't descend into symlinked directories when extracting, so
extracting such an archive fails anyway:

# tar xvf foo.tar
foo
foo/passwd
tar: foo/passwd: Cannot open: Not a directory
tar: Error exit delayed from previous errors

I think that is the simplest solution, but I'm not sure how best to
implement it in extractall().
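
One possible shape for such a check, as a sketch only: before extracting a
member, walk its parent directories below the destination and refuse if any
of them is a symlink. has_symlinked_parent is a hypothetical helper, not
something tarfile provides, and it only looks at what is already on disk.

```python
import os


def has_symlinked_parent(dest, member_name):
    """Return True if any parent directory of the member, below dest,
    already exists on disk as a symlink.

    Hypothetical helper for illustration; not part of tarfile.
    """
    current = dest
    parts = [p for p in member_name.split("/") if p not in ("", ".")]
    # Only the parents are checked; the final component may itself be a link.
    for part in parts[:-1]:
        current = os.path.join(current, part)
        if os.path.islink(current):
            return True
    return False
```

With the foo -> /etc example, foo/passwd would be rejected once the foo
symlink has been created, while the foo member itself is still allowed.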


[Python-Dev] tarfile and directory traversal vulnerability

2007-08-24 Thread Jan Matejek
hi,
once upon a time there was a known vulnerability in tar (CVE-2001-1267,
[1]), and while tar has long since been fixed, python's tarfile module is
affected too.

The vulnerability goes basically like this: if you tar a file named
../../../../../etc/passwd and then make the admin untar it,
/etc/passwd gets overwritten.
Another variety of this bug is a symlink one: if the tar contains files like:
./-directory -> /etc
./-directory/passwd
then the -directory symlink would be created first and /etc/passwd
will be overwritten once again.
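
To illustrate the traversal case, here is a small sketch of a purely lexical
check on a member name. escapes_destination is a hypothetical helper, not part
of tarfile, and it assumes a Python version that has os.path.commonpath.
Because it never touches the filesystem, it does not catch the symlink variety
at all.

```python
import os


def escapes_destination(dest, member_name):
    """Return True if joining member_name onto dest lands outside dest.

    Hypothetical helper; a lexical check only, symlinks are not resolved.
    """
    dest_abs = os.path.abspath(dest)
    # os.path.join discards dest_abs when member_name is absolute.
    target = os.path.abspath(os.path.join(dest_abs, member_name))
    return os.path.commonpath([dest_abs, target]) != dest_abs


print(escapes_destination("/tmp/x", "../../../../../etc/passwd"))
print(escapes_destination("/tmp/x", "docs/readme.txt"))
```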

I was wondering how to fix it.
The symlink problem obviously applies only to the extractall() method and is
easily fixed by delaying external (or possibly all) symlink creation,
similar to how directory attributes are delayed now.
I've attached a draft of the patch; if you like it, I'll polish it.

The traversal problem is harder, and it applies to the extract() method as well.
For extractall() alone, I would use something like:

    if not tarinfo.name.startswith('../'):
        self.extract(tarinfo, path)
    else:
        warnings.warn("non-local file skipped: %s" % tarinfo.name,
                      RuntimeWarning, stacklevel=1)

For extract(), I am not sure. Maybe it should raise an exception when it
encounters such a file, and have a special option to extract such files
anyway. Or maybe it should be left alone altogether.

Any suggestions are welcome.

regards
jan matejek

[1] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2001-1267
--- Lib/tarfile.py
+++ Lib/tarfile.py
@@ -1503,6 +1503,7 @@
            list returned by getmembers().
         """
         directories = []
+        symlinks = []
 
         if members is None:
             members = self
@@ -1516,6 +1517,9 @@
                 except EnvironmentError:
                     pass
                 directories.append(tarinfo)
+            elif tarinfo.issym() and (tarinfo.linkpath.startswith('../') or tarinfo.linkpath.startswith('/')):
+                # external symlink is delayed
+                symlinks.append(tarinfo)
             else:
                 self.extract(tarinfo, path)
 
@@ -1536,6 +1540,12 @@
                 else:
                     self._dbg(1, "tarfile: %s" % e)
 
+        # Handle external symlinks
+        symlinks.sort(lambda a, b: cmp(a.name, b.name))
+        symlinks.reverse()
+        for tarinfo in symlinks:
+            self.extract(tarinfo, path)
+
     def extract(self, member, path=""):
         """Extract a member from the archive to the current working directory,
            using its full name. Its file information is extracted as accurately


Re: [Python-Dev] Python and the Linux Standard Base (LSB)

2006-11-27 Thread Jan Matejek
Phillip J. Eby wrote:
> Just a suggestion, but one issue that I think needs addressing is the FHS
> language that leads some Linux distros to believe that they should change
> Python's normal installation layout (sometimes in bizarre ways) (...)
> Other vendors apparently also patch Python in various
> ways to support their FHS-based theories of how Python should install
> files.

+1 on that. There should be a clear (and clearly presented) idea of how
Python is supposed to be laid out in the distribution-provided /usr
hierarchy. And it would be nice if this idea complied with the FHS.

It would also be nice if somebody finally admitted the existence of
/usr/lib64 and made Python aware of it ;e)
