Re: [Python-Dev] Python 2.7, long double vs allocator alignment, GCC 8 on x86-64

2018-01-30 Thread Florian Weimer
* Gregory P. Smith:

> The proper fix for this in the code would likely break ABI compatibility
> (ie: not possible in python 2.7 or any other stable release).
>
> Clang's UBSAN (undefined behavior sanitizer) has been flagging this one for
> a long time.
>
> In Python 3 a double is used instead of long double since 2012 as I did
> some digging at the time:
> https://github.com/python/cpython/commit/e348c8d154cf6342c79d627ebfe89dfe9de23817

A slightly more ABI-safe version of that change looks like this:

diff --git a/Include/objimpl.h b/Include/objimpl.h
index 55e83eced6..aa906144dc 100644
--- a/Include/objimpl.h
+++ b/Include/objimpl.h
@@ -248,6 +248,18 @@ PyAPI_FUNC(PyVarObject *) _PyObject_GC_Resize(PyVarObject *, Py_ssize_t);
 /* for source compatibility with 2.2 */
 #define _PyObject_GC_Del PyObject_GC_Del
 
+/* Former over-aligned definition of PyGC_Head, used to compute the
+   size of the padding for the new version below. */
+union _gc_head;
+union _gc_head_old {
+    struct {
+        union _gc_head *gc_next;
+        union _gc_head *gc_prev;
+        Py_ssize_t gc_refs;
+    } gc;
+    long double dummy;
+};
+
 /* GC information is stored BEFORE the object structure. */
 typedef union _gc_head {
     struct {
@@ -255,7 +267,8 @@ typedef union _gc_head {
         union _gc_head *gc_prev;
         Py_ssize_t gc_refs;
     } gc;
-    long double dummy;  /* force worst-case alignment */
+    double dummy;  /* force worst-case alignment */
+    char dummy_padding[sizeof(union _gc_head_old)];
 } PyGC_Head;
 
 extern PyGC_Head *_PyGC_generation0;

This preserves the offset used by _Py_AS_GC in case it has been built
into existing binaries.  It may be more appropriate to do it this way
for Python 2.7.  I think it's also more conservative than the
allocator changes.
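
The effect of the padding member can be checked without a C compiler. The following is a minimal sketch using ctypes to model both unions. The class names GCHeadOld, GCHeadNew and _GCFields are made up for illustration, and c_void_p stands in for the self-referential pointers, since only size and alignment matter here. It is a model of the layout, not the CPython definition itself.

```python
import ctypes


class _GCFields(ctypes.Structure):
    # Stand-in for the anonymous struct inside PyGC_Head.
    _fields_ = [
        ("gc_next", ctypes.c_void_p),
        ("gc_prev", ctypes.c_void_p),
        ("gc_refs", ctypes.c_ssize_t),
    ]


class GCHeadOld(ctypes.Union):
    # Former definition: long double forces the larger alignment.
    _fields_ = [("gc", _GCFields), ("dummy", ctypes.c_longdouble)]


class GCHeadNew(ctypes.Union):
    # Patched definition: double for alignment, char array to keep the size.
    _fields_ = [
        ("gc", _GCFields),
        ("dummy", ctypes.c_double),
        ("dummy_padding", ctypes.c_char * ctypes.sizeof(GCHeadOld)),
    ]


def layout(union_type):
    """Return (size, alignment) of a ctypes type in bytes."""
    return ctypes.sizeof(union_type), ctypes.alignment(union_type)


if __name__ == "__main__":
    print("old (size, alignment):", layout(GCHeadOld))
    print("new (size, alignment):", layout(GCHeadNew))
```

On x86-64 Linux this should report (32, 16) for the old layout and (32, 8) for the new one: the size, and with it the offset that _Py_AS_GC subtracts, stays the same, while the alignment requirement drops to what the allocator actually provides.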
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Python 2.7, long double vs allocator alignment, GCC 8 on x86-64

2018-01-30 Thread Florian Weimer
I hope this is the right list for this kind of question.  We recently
tried to build Python 2.6 with GCC 8, and ran into this issue:

  

Also quoting for context:

| PyInstance_NewRaw contains this code:
| 
|     inst = PyObject_GC_New(PyInstanceObject, &PyInstance_Type);
|     if (inst == NULL) {
|         Py_DECREF(dict);
|         return NULL;
|     }
|     inst->in_weakreflist = NULL;
|     Py_INCREF(klass);
|     inst->in_class = (PyClassObject *)klass;
|     inst->in_dict = dict;
|     _PyObject_GC_TRACK(inst);
| 
| _PyObject_GC_TRACK expands to:
| 
| #define _PyObject_GC_TRACK(o) do { \
|     PyGC_Head *g = _Py_AS_GC(o); \
|     if (g->gc.gc_refs != _PyGC_REFS_UNTRACKED) \
|         Py_FatalError("GC object already tracked"); \
| …
| 
| Via:
| 
| #define _Py_AS_GC(o) ((PyGC_Head *)(o)-1)
| 
| We get to this:
| 
| /* GC information is stored BEFORE the object structure. */
| typedef union _gc_head {
|     struct {
|         union _gc_head *gc_next;
|         union _gc_head *gc_prev;
|         Py_ssize_t gc_refs;
|     } gc;
|     long double dummy;  /* force worst-case alignment */
| } PyGC_Head;
| 
| PyGC_Head has 16-byte alignment.  The net result is that
| 
| _PyObject_GC_TRACK(inst);
| 
| promises to the compiler that inst is properly aligned for the
| PyGC_Head type, but it is not: PyObject_GC_New returns a pointer which
| is only 8-byte-aligned.
| 
| Objects/obmalloc.c contains this:
| 
| /*
|  * Alignment of addresses returned to the user. 8-bytes alignment works
|  * on most current architectures (with 32-bit or 64-bit address busses).
|  * The alignment value is also used for grouping small requests in size
|  * classes spaced ALIGNMENT bytes apart.
|  *
|  * You shouldn't change this unless you know what you are doing.
|  */
| #define ALIGNMENT   8   /* must be 2^N */
| #define ALIGNMENT_SHIFT 3
| #define ALIGNMENT_MASK  (ALIGNMENT - 1)
| 
| So either the allocator alignment needs to be increased, or the
| PyGC_Head alignment needs to be decreased.

Is this a known issue?  As far as I can see, it has not been fixed on
the 2.7 branch.

(Store merging is a relatively new GCC feature.  Among other things,
this means that on x86-64, for sufficiently aligned pointers, vector
instructions are used to update multiple struct fields at once.  These
vector instructions can trigger alignment traps, similar to what
happens on some other architectures for scalars.)
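
To make the mismatch concrete, here is a small sketch of the address arithmetic. The ALIGNMENT value is the one quoted from obmalloc.c. The 16-byte figure is the PyGC_Head alignment stated in the quoted report. The pool start address 0x1000 and the helper names are hypothetical.

```python
ALIGNMENT = 8            # allocator guarantee quoted from obmalloc.c
GC_HEAD_ALIGNMENT = 16   # PyGC_Head alignment with long double on x86-64


def is_aligned(address, alignment):
    """Return True if address is a multiple of alignment (a power of two)."""
    return address & (alignment - 1) == 0


def misaligned_for_gc_head(base, count):
    """List addresses base, base+8, ... that are not 16-byte aligned."""
    addresses = [base + i * ALIGNMENT for i in range(count)]
    return [a for a in addresses if not is_aligned(a, GC_HEAD_ALIGNMENT)]


if __name__ == "__main__":
    # 0x1000 is an arbitrary, hypothetical pool start address.
    for address in misaligned_for_gc_head(0x1000, 4):
        print(hex(address))
```

Every second address handed out in 8-byte steps fails the 16-byte check, which is exactly the case in which an aligned vector store would trap.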


Re: [Python-Dev] Drop support for old unsupported FreeBSD and Linux kernels?

2018-01-20 Thread Florian Weimer
* Victor Stinner:

> CPython still has compatibility code for Linux 2.6, whereas the
> support of Linux 2.6.x ended in August 2011, longer than 6 years ago.

There are still reasonably widely used 2.6 kernels under support, but
they have lots of (feature) backports, so maybe they do not need the
2.6.32 workarounds you plan to remove.

(glibc upstream nowadays requires Linux 3.2 (stable branch) as the
minimum, but then people are less likely to update glibc on really old
systems.)

> Should we also drop support for old Linux kernels? If yes, which ones?
> The Linux kernel has LTS version, the oldest is Linux 3.2 (support
> will end in May, 2018).

What exactly do you plan to change?  Is it about unconditionally
assuming accept4 support?  accept4 support was added to Linux 3.3 on
ia64.  This is not uncommon: The first version in which a particular
(generic) system call is available varies a lot between architectures.
You'll have to investigate each case separately.
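
For illustration, the kind of run-time fallback that such compatibility code provides might look like the sketch below. It assumes the missing system call surfaces as an OSError with errno ENOSYS. The functions fake_accept4 and fake_accept are hypothetical stand-ins, not real socket calls, and this is not the code CPython uses.

```python
import errno


def call_with_fallback(preferred, fallback, *args):
    """Try preferred(*args); if the kernel lacks the call, use fallback."""
    try:
        return preferred(*args)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # a real error, not a missing system call
        return fallback(*args)


# Hypothetical stand-ins for illustration only.
def fake_accept4(fd):
    raise OSError(errno.ENOSYS, "Function not implemented")


def fake_accept(fd):
    return ("accepted", fd)


if __name__ == "__main__":
    print(call_with_fallback(fake_accept4, fake_accept, 3))
```

Removing the fallback is only safe if the preferred call exists on every supported kernel and architecture combination, which is why each case needs checking separately.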


Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-28 Thread Florian Weimer
* Stefan Behnel:

> Martin v. Löwis, 24.01.2011 21:17:
>> The Py_UNICODE type is still supported but deprecated. It is always
>> defined as a typedef for wchar_t, so the wstr representation can double
>> as Py_UNICODE representation.
>
> It's too bad this isn't initialised by default, though. Py_UNICODE is
> the only representation that can be used efficiently from C code

Is this really true?  I don't think I've seen any C API which actually
uses wchar_t, beyond what is provided by libc.  UTF-8 and even
UTF-16 are much, much more common.

-- 
Florian Weimer                fwei...@bfk.de
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99


Re: [Python-Dev] PEP 393: Flexible String Representation

2011-01-28 Thread Florian Weimer
* Stefan Behnel:

> The nice thing about Py_UNICODE is that it basically gives you native
> Unicode code points directly, without needing to decode UTF-8 byte
> runs and the like. In Cython, it allows you to do things like this:
>
>     def test_for_those_characters(unicode s):
>         for c in s:
>             # warning: randomly chosen Unicode escapes ahead
>             if c in u"\u0356\u1012\u3359\u4567":
>                 return True
>         else:
>             return False
>
> The loop runs in plain C, using the somewhat obvious implementation
> with a loop over Py_UNICODE characters and a switch statement for the
> comparison. This would look a *lot* more ugly with UTF-8 encoded byte
> strings.

Not really, because UTF-8 is quite search-friendly.  (The if would
have to invoke a memmem()-like primitive.)  Random subscripts are
problematic.

However, why would one want to write loops like the above?  Don't you
have to take combining characters (comprising multiple codepoints)
into account most of the time when you look at individual characters?
Then UTF-32 does not offer much of a simplification.
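
Both points can be sketched in a few lines of Python. The byte-level search relies on the fact that an encoded character can only match at a character boundary in valid UTF-8, so a substring search over bytes plays the role of the memmem()-like primitive. The helper names contains_any and has_combining_marks are made up for this example.

```python
import unicodedata


def contains_any(haystack_utf8, characters):
    """Return True if any of the characters occurs in the UTF-8 bytes."""
    return any(ch.encode("utf-8") in haystack_utf8 for ch in characters)


def has_combining_marks(text):
    """Return True if text contains a combining code point."""
    return any(unicodedata.combining(ch) for ch in text)


if __name__ == "__main__":
    data = "abc\u3359def".encode("utf-8")
    print(contains_any(data, "\u0356\u1012\u3359\u4567"))
    # "e" followed by U+0301 is one visible character but two code points.
    print(len("e\u0301"), has_combining_marks("e\u0301"))
```

The second call shows why a per-code-point loop is often not enough: the user-visible character spans two code points regardless of whether the storage is UTF-8 or UTF-32.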



Re: [Python-Dev] [RELEASED] Python 2.7 beta 2

2010-05-09 Thread Florian Weimer
* Benjamin Peterson:

> http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python
> distribution.

Something is missing here: * Multiple context managers in


Re: [Python-Dev] Retrieve an arbitrary element from a set without removing it

2009-10-25 Thread Florian Weimer
* James Y. Knight:

> On Oct 25, 2009, at 2:50 AM, Terry Reedy wrote:
>
>> Alex Martelli wrote:
>>> Next(s) would seem good...
>>
>> That does not work. It has to be next(iter(s)), and that has been
>> tried and eliminated because it is significantly slower.
>
> But who cares about the speed of getting an arbitrary element from a
> set? How can it *possibly* be a problem in a real program?

Hmm, perhaps when using sets as work queues?
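
A minimal sketch of what such a work queue might look like, assuming the order of processing does not matter. The function names reachable and peek and the graph representation are chosen for illustration only.

```python
def reachable(graph, start):
    """Return the nodes reachable from start; graph maps node -> neighbours."""
    seen = {start}
    pending = {start}          # the work queue: order does not matter
    while pending:
        node = pending.pop()   # removes and returns an arbitrary element
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                pending.add(neighbour)
    return seen


def peek(items):
    """Return an arbitrary element without removing it."""
    return next(iter(items))


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["a"], "e": ["a"]}
    print(sorted(reachable(graph, "a")))
```

Here pop() is enough because each item is consumed. The next(iter(s)) form only matters when an item has to be inspected while it stays in the set.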


Re: [Python-Dev] GIL, Python 3, and MP vs. UP

2005-09-19 Thread Florian Weimer
* Martin Blais:

> http://www.gotw.ca/publications/concurrency-ddj.htm
> The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software
> Herb Sutter
> March 2005

This piece is fundamentally wrong.  We all have been writing
concurrent server-side software for eons.  I don't know what Herb was
thinking when he wrote that piece.


Re: [Python-Dev] GIL, Python 3, and MP vs. UP

2005-09-19 Thread Florian Weimer
* Guido van Rossum:

> That assumes a very specific model for how all that MP power is going
> to be used.

Indeed.

> I personally don't think the threaded programming model as found in
> Java works all that well; without locks you end up with concurrent
> modification errors, with locks you get deadlocks and livelocks.

Java is basically forced into that model because VM startup costs are
so high.  To some extent, Python has similar problems, but you don't
have to care about preserving class loader semantics, so you should be
in a better position to cut down process creation time.

> Be my guest. Prove me wrong. Talk is cheap; instead of arguing my
> points (all of which can be argued ad infinitum), come back when
> you've got a working GIL-free Python. Doesn't have to be CPython-based
> -- C# would be fine too.

By the way, has anybody ever tried to create a CPython variant which
uses a (mostly) copying garbage collector (or anything other than
reference counting or Boehm GC)?

Copying GC might help to get rid of the GIL *and* improve performance
in the accept+fork model (because read-only object access does not
trigger copy-on-write anymore).
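
The copy-on-write remark follows from how reference counting works: taking a new reference to an object writes to that object's header even if the object itself is never modified. A small sketch that makes the write visible, assuming CPython (sys.getrefcount is implementation-specific) and using a made-up function name:

```python
import sys


def refcount_delta_on_new_reference():
    """Return how much the refcount changes when one more reference is taken."""
    payload = ["never modified"]
    holder = []
    before = sys.getrefcount(payload)
    holder.append(payload)   # only refers to payload, yet updates its header
    after = sys.getrefcount(payload)
    return after - before


if __name__ == "__main__":
    print(refcount_delta_on_new_reference())
```

In a forked child, that counter update dirties the memory page holding the object, so the kernel has to copy the page even though the program only read the data.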


Re: [Python-Dev] GIL, Python 3, and MP vs. UP

2005-09-19 Thread Florian Weimer
* Michael Hudson:

> Not to my knowledge.  I've always thought that it would be pretty
> hard.  I'd be interested in being proved wrong.

The real problem is that you can ditch most extension modules. 8-(

It sounds more like a fun project for the Python core, though.

>> Copying GC might help to get rid of the GIL *and* improve performance
>> in the accept+fork model (because read-only object access does not
>> trigger copy-on-write anymore).
>
> How does a copying gc differ much from a non-copying non-refcounted gc
> here?

You could copy immutable objects to a separate set of pages and never
collect them (especially if they recursively refer to immutable objects
only).


Re: [Python-Dev] removing nested tuple function parameters

2005-09-18 Thread Florian Weimer
* Brett Cannon:

> Is anyone truly attached to nested tuple function parameters; ``def
> fxn((a,b)): print a,b``?  At one of the PyCon sprints Guido seemed
> okay with just having them removed when Jeremy asked about ditching
> them thanks to the pain they caused in the AST branch.

Will

  def fxn((a,b,),): print a,b

continue to work?  (I'm not sure if the problems are syntax or
representation of the parse tree; I don't know what's going on on that
AST branch.)
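
If nested tuple parameters were removed, the same effect could still be had by unpacking in the function body. A minimal sketch, written in syntax that does not depend on the feature. The parameter name pair is made up, and the function returns the values instead of printing them so the result can be checked.

```python
def fxn(pair):
    # Explicit unpacking replaces the (a, b) parameter; the trailing
    # comma in the target list is accepted just as in the parameter form.
    (a, b,) = pair
    return a, b


if __name__ == "__main__":
    print(fxn((1, 2)))
```

Unpacking in the body raises the same kind of error as the parameter form when the argument has the wrong number of elements.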