Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Christian Heimes
On 14.03.2013 03:05, Trent Nelson wrote:
> Just posted the slides for those that didn't have the benefit of
> attending the language summit today:
> 
> 
> https://speakerdeck.com/trent/parallelizing-the-python-interpreter-an-alternate-approach-to-async

Wow, neat! Your idea with Py_PXCTX is ingenious.

As far as I remember, most modern operating systems on x86 and x86_64
platforms use the FS and GS segment registers to distinguish threads;
TLS is implemented on top of them. I guess the __read[gf]sdword()
intrinsics do exactly the same. Reading registers is super fast and
should have a negligible effect on code.

ARM CPUs don't have segment registers because they have a simpler
addressing model. The register CP15 came up after a couple of Google
searches.
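
For what it's worth, a hedged sketch of that CP15 read (the user
read-only thread ID register, TPIDRURO), using GCC inline assembly;
as noted later in the thread, not every core implements it:

    /* May trap on cores without the register; Linux can emulate the
       trap, which is slow, and some Android builds don't. */
    static inline void *arm_thread_pointer(void)
    {
        void *tp;
        __asm__ __volatile__ ("mrc p15, 0, %0, c13, c0, 3" : "=r" (tp));
        return tp;
    }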

IMHO you should target x86, x86_64, ARMv6 and ARMv7. ARMv7 is going to
be more important than x86 in the future. We are going to see more ARM
based servers.

Christian


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Antoine Pitrou
On Thu, 14 Mar 2013 13:21:09 +0100, Christian Heimes wrote:
> 
> IMHO you should target x86, x86_64, ARMv6 and ARMv7. ARMv7 is going to
> be more important than x86 in the future.  We are going to see more
> ARM based servers.

Well, we can't really see less of them, since there are hardly any ;-)

Related reading:
http://www.anandtech.com/show/6757/calxedas-arm-server-tested

Regards

Antoine.




Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread a . cavallo
By the way, on ARM (and any platform that can do cross-compiling)
I've created a Makefile-based build of Python 2.7.x:


  https://bitbucket.org/cavallo71/android

Please don't be fooled by the Android name; it really can take any
cross-compiler (provided it follows the GCC syntax).


It was born out of frustration with trying to adapt ./configure to
do cross-compiling. It is a slightly different take on the problem
than the one tried by the Kivy project, for example.


I hope this helps,
Antonio


On 2013-03-14 13:38, Antoine Pitrou wrote:

On Thu, 14 Mar 2013 13:21:09 +0100, Christian Heimes wrote:


IMHO you should target x86, x86_64, ARMv6 and ARMv7. ARMv7 is going to
be more important than x86 in the future.  We are going to see more
ARM based servers.


Well, we can't really see less of them, since there are hardly any ;-)

Related reading:
http://www.anandtech.com/show/6757/calxedas-arm-server-tested





Re: [Python-Dev] VC++ 2008 Express Edition now locked away?

2013-03-14 Thread Martin v. Löwis

On 07.03.13 09:53, Steve Dower wrote:

To use the SDK compiler, you need to do a few manual steps
first.

After starting a command window, you need to run a batch file
to configure your environment. Choose the appropriate option from

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\vcvars64.bat

or

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat

Then set two environment variables:

set MSSdk=1
set DISTUTILS_USE_SDK=1

After these steps, the standard "python setup.py install" should work.


This may be fine for building extensions, but it appears that more
instructions are needed for a novice to build Python itself.


I'm not even sure that these variables are necessary - certainly
without the compilers installed setup.py looks in the right place for
them. I'll try this as well.


Setting MSSdk shouldn't be necessary, as vcvars should already have set it
(unless that changed in recent SDKs). Setting DISTUTILS_USE_SDK is
necessary as a protection to avoid unintentionally picking up the wrong
build tools.

As for distutils finding them automatically: this only works for finding
VS installations. It is (AFAICT) not possible to automatically locate
SDK installations (other than by exhaustive search of the disk).


As for the documentation, I'd be happy to provide an update for this
section once I've checked out that everything works.


I think it should explain how to invoke msbuild, in addition to
explaining how to plug old compilers into new IDEs.


Regards,
Martin


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Thu, Mar 14, 2013 at 05:21:09AM -0700, Christian Heimes wrote:
> On 14.03.2013 03:05, Trent Nelson wrote:
> > Just posted the slides for those that didn't have the benefit of
> > attending the language summit today:
> > 
> > 
> > https://speakerdeck.com/trent/parallelizing-the-python-interpreter-an-alternate-approach-to-async
> 
> Wow, neat! Your idea with Py_PXCTX is ingenious.

Yeah, it's funny how the viability and performance of the whole
approach come down to a quirky little trick for quickly detecting
if we're in a parallel thread ;-)  I was very chuffed when it all
fell into place.  (And I hope the quirkiness of it doesn't detract
from the overall approach.)

> As far as I remember the FS and GS segment registers are used by most
> modern operating systems on x86 and x86_64 platforms nowadays to
> distinguish threads. TLS is implemented with FS and GS registers. I
> guess the __read[gf]sdword() intrinsics do exactly the same.

Yup, in fact, if I hadn't come up with the __read[gf]sdword() trick,
my only other option would have been TLS (or the GetCurrentThreadId
/pthread_self() approach in the presentation).  TLS is fantastic,
and it's definitely an intrinsic part of the solution (the "Y" part
of "if we're a parallel thread, do Y"), but it's definitely more
costly than a simple FS/GS register read.

> Reading
> registers is super fast and should have a negligible effect on code.

Yeah, the actual instruction is practically free; the main thing you
pay for is the extra branch.  However, most of the code looks like
this:

    if (Py_PXCTX)
        something_small_and_inlineable();
    else
        Py_INCREF(op); /* also small and inlineable */

In the majority of the cases, all the code for both branches is
going to be in the same cache line, so a mispredicted branch is
only going to result in a pipeline stall, which is better than a
cache miss.

> ARM CPUs don't have segment registers because they have a simpler
> addressing model. The register CP15 came up after a couple of Google
> searches.

Noted, thanks!

> IMHO you should target x86, x86_64, ARMv6 and ARMv7. ARMv7 is going to
> be more important than x86 in the future. We are going to see more ARM
> based servers.

Yeah, that's my general sentiment too.  I'm definitely curious to see
if other ISAs offer similar facilities (SPARC, IA64, POWER, etc.), but
the hierarchy will be x86/x64 > ARM > * for the foreseeable future.

Porting the Py_PXCTX part is trivial compared to the work that is
going to be required to get this stuff working on POSIX where none
of the sublime Windows concurrency, synchronisation and async IO
primitives exist.

> Christian

Trent.



Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Wed, Mar 13, 2013 at 07:05:41PM -0700, Trent Nelson wrote:
> Just posted the slides for those that didn't have the benefit of
> attending the language summit today:
> 
> 
> https://speakerdeck.com/trent/parallelizing-the-python-interpreter-an-alternate-approach-to-async

Someone on /r/python asked if I could elaborate on the "do Y" part
of "if we're in a parallel thread, do Y, if not, do X", which I
(inadvertently) ended up replying to in detail.  I've included the
response below.  (I'll work on converting this into a TL;DR set of
slides soon.)

> Can you go into a bit of depth about "X" here?

That's a huge topic that I'm hoping to tackle ASAP.  The basic premise
is that parallel 'Context' objects (well, structs) are allocated for
each parallel thread callback.  The context persists for the lifetime of
the "parallel work".

The "lifetime of the parallel work" depends on what you're doing.  For a
simple ``async.submit_work(foo)``, the context is considered complete
once ``foo()`` has been called (presuming no exceptions were raised).

For an async client/server, the context will persist for the entirety of
the connection.

The context is responsible for encapsulating all resources related to
the parallel thread.  So, it has its own heap, and all memory
allocations are taken from that heap.

For any given parallel thread, only one context can be executing at a
time, and this can be accessed via the ``__declspec(thread) Context
*ctx`` global (which is primed by some glue code as soon as the parallel
thread starts executing a callback).
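
A minimal sketch of such a thread-local global; the __declspec(thread)
spelling is from the text above, while the GCC/Clang fallback is my
addition for illustration:

    /* The active parallel context, one slot per thread: NULL on the
       main thread, primed by glue code before each callback runs. */
    #ifdef _MSC_VER
    #  define PX_THREAD_LOCAL __declspec(thread)
    #else
    #  define PX_THREAD_LOCAL __thread
    #endif

    typedef struct Context Context;
    PX_THREAD_LOCAL Context *ctx;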

No reference counting or garbage collection is done during parallel
thread execution.  Instead, once the context is finished, it is
scheduled to be released, which means it'll be "processed" by the main
thread as part of its housekeeping work during ``async.run()``
(technically, ``async.run_once()``).

The main thread simply destroys the entire heap in one fell swoop,
releasing all memory that was associated with that context.

There are a few side effects to this.  First, the heap allocator
(basically, the thing that answers ``malloc()`` calls) is incredibly
simple.  It allocates LARGE_PAGE_SIZE chunks of memory at a time (2MB on
x64), and simply returns pointers to that chunk for each memory request
(adjusting h->next and allocation stats as it goes along, obviously).
Once the 2MB has been exhausted, another 2MB is allocated.
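
A minimal sketch of such a bump allocator, with field names loosely
modelled on the Heap struct excerpted later in the thread (the real
allocator differs in detail):

    #include <stddef.h>

    typedef struct Heap Heap;
    struct Heap {
        Heap   *sle_next;    /* next chunk, once this one is exhausted */
        char   *base;        /* start of the 2MB chunk */
        char   *next;        /* bump pointer: next free byte */
        size_t  remaining;   /* bytes left in this chunk */
        size_t  allocated;   /* allocation stats */
    };

    static void *heap_malloc(Heap *h, size_t n)
    {
        void *p;
        n = (n + 15) & ~(size_t)15;   /* keep allocations aligned */
        if (n > h->remaining)
            return NULL;              /* real code chains a new 2MB chunk */
        p = h->next;
        h->next      += n;
        h->remaining -= n;
        h->allocated += n;
        return p;
    }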

That approach is fine for the ``submit_(work|timer|wait)`` callbacks,
which basically provide a way to run a presumably-finite-length function
in a parallel thread (and invoking callbacks/errbacks as required).

However, it breaks down when dealing with client/server stuff.  Each
invocation of a callback (say, ``data_received(...)``) may only consume,
say, 500 bytes, but it might be called a million times before the
connection is terminated.  You can't have cumulative memory usage with
possibly-infinite-length client/server-callbacks like you can with the
once-off ``submit_(work|wait|timer)`` stuff.

So, enter heap snapshots.  The logic that handles all client/server
connections is instrumented such that it takes a snapshot of the heap
(and all associated stats) prior to invoking a Python method (via
``PyObject_Call()``, for example, i.e. the invocation of
``data_received``).

When the method completes, we can simply roll back the snapshot.  The
heap's stats and next pointers et al all get reset back to what they
were before the callback was invoked.
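
Continuing the sketch above (same hypothetical Heap fields): a
snapshot is just a copy of the allocator's mutable cursor state, and
a rollback copies it back, forgetting everything allocated since in
O(1):

    typedef struct HeapSnapshot {
        char   *next;
        size_t  remaining;
        size_t  allocated;
    } HeapSnapshot;

    static void heap_snapshot(const Heap *h, HeapSnapshot *s)
    {
        s->next      = h->next;       /* record the cursor... */
        s->remaining = h->remaining;  /* ...and the stats */
        s->allocated = h->allocated;
    }

    static void heap_rollback(Heap *h, const HeapSnapshot *s)
    {
        h->next      = s->next;
        h->remaining = s->remaining;
        h->allocated = s->allocated;
    }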

That's how the chargen server is able to pump out endless streams of
data for every client whilst keeping memory usage static.  (Well, every
new client currently consumes a minimum of 2MB, but down the track that
can be tweaked back down to SMALL_PAGE_SIZE (4096 bytes) for servers
that need to handle hundreds of thousands of clients simultaneously.)

The only issue with this approach is detecting when the callback has
done the unthinkable (from a shared-nothing perspective) and persisted
some random object it created outside of the parallel context it was
created in.

That's actually a huge separate technical issue to tackle -- and it
applies just as much to the normal ``submit_(wait|work|timer)``
callbacks as well.  I've got a somewhat-temporary solution in place for
that currently:

    d = async.dict()
    def foo():
        # async.rdtsc() is a helper method
        # that basically wraps the result of
        # the assembly RDTSC (read time-
        # stamp counter) instruction into a
        # PyLong object.  So, it's handy when
        # I need to test the very functionality
        # being demonstrated here (creating
        # an object within a parallel context
        # and persisting it elsewhere).
        d['foo'] = async.rdtsc()

    def bar():
        d['bar'] = async.rdtsc()

    async.submit_work(foo)
    async.submit_work(bar)

That'll result in two contexts being created, one for each callback
invocation.  ``async.dict()`` is a "parallel safe" 

Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Stefan Ring
> Yup, in fact, if I hadn't come up with the __read[gf]sdword() trick,
> my only other option would have been TLS (or the GetCurrentThreadId
> /pthread_self() approach in the presentation).  TLS is fantastic,
> and it's definitely an intrinsic part of the solution (the "Y" part
> of "if we're a parallel thread, do Y"), but it's definitely more
> costly than a simple FS/GS register read.

I think you should be able to just take the address of a static
__thread variable to achieve the same thing in a more portable way.
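
A minimal sketch of that suggestion, assuming the main thread records
its marker address at startup (all names here are hypothetical):

    /* Each thread sees a distinct address for a __thread variable, so
       comparing that address with the one recorded by the main thread
       identifies parallel threads (where __thread is available). */
    static __thread char px_marker;
    static const char *main_thread_marker;

    void px_init_main_thread(void)   /* call once, from the main thread */
    {
        main_thread_marker = &px_marker;
    }

    int px_in_parallel_thread(void)
    {
        return &px_marker != main_thread_marker;
    }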


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
Cross-referenced to relevant bits of code where appropriate.

(And just a quick reminder regarding the code quality disclaimer:
I've been hacking away on this stuff relentlessly for a few months;
the aim has been to make continual forward progress without getting
bogged down in non-value-add busy work.  Lots of wildly inconsistent
naming conventions and dead code that'll be cleaned up down the
track.  And the relevance of any given struct will tend to be
proportional to how many unused members it has (homeless hoarder +
shopping cart analogy).)

On Thu, Mar 14, 2013 at 11:45:20AM -0700, Trent Nelson wrote:
> The basic premise is that parallel 'Context' objects (well, structs)
> are allocated for each parallel thread callback.

The 'Context' struct:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel_private.h#l546

Allocated via new_context():

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l4211

also relevant, new_context_for_socket() (encapsulates a
client/server instance within a context).

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l4300

Primary role of the context is to isolate the memory management.
This is achieved via 'Heap':

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel_private.h#l281

(Which I sort of half started refactoring to use the _HEAD_EXTRA
 approach when I thought I'd need to have a separate heap type for
 some TLS avenue I explored -- turns out that wasn't necessary).

> The context persists for the lifetime of the "parallel work".
> 
> The "lifetime of the parallel work" depends on what you're doing.  For
> a simple ``async.submit_work(foo)``, the context is considered
> complete once ``foo()`` has been called (presuming no exceptions were
> raised).

Managing context lifetime is one of the main responsibilities of
async.run_once():

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3841

> For an async client/server, the context will persist for the entirety
> of the connection.

Marking a socket context as 'finished' for servers is the job of
PxServerSocket_ClientClosed():

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l6885

> The context is responsible for encapsulating all resources related to
> the parallel thread.  So, it has its own heap, and all memory
> allocations are taken from that heap.

The heap is initialized in two steps during new_context().  First,
a handle is allocated for the underlying system heap (via
HeapCreate):

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l4224

The first "heap" is then initialized for use with our context via
the Heap_Init(Context *c, size_t n, int page_size) call:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l1921

Heaps are actually linked together via a doubly-linked list.  The
first heap is a value member (not a pointer) of Context; however,
the active heap is always accessed via the '*h' pointer which is
updated as necessary.

    struct Heap {
        Heap *sle_prev;      /* doubly-linked list pointers */
        Heap *sle_next;
        void *base;
        void *next;          /* bump allocation pointer */
        int   allocated;
        int   remaining;
        ...

    struct Context {
        Heap  heap;
        Heap *h;
        ...

> For any given parallel thread, only one context can be executing at a
> time, and this can be accessed via the ``__declspec(thread) Context
> *ctx`` global (which is primed by some glue code as soon as the
> parallel thread starts executing a callback).

Glue entry point for all callbacks is _PyParallel_EnteredCallback:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3047

On the topic of callbacks, the main workhorse for the
submit_(wait|work) callbacks is _PyParallel_WorkCallback:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3120

The interesting logic begins at the ``start:`` label:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3251

The most interesting part is the error handling.  If the callback
raises an exception, we check to see if an errback has been provided.
If so, we call the errback with the error details.

If the callback completes successfully (or it fails, but the errback
completes successfully), that is treated as successful callback or
errback completion, respectively:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3270

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3294

If the errback fails, or no errback was provided, the exception
percolates back to the main thread.  This is handled at error:

http://hg.python.org/sandbox/trent/file/7148209d5490/Python/pyparallel.c#l3300
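
A hedged sketch of that dispatch flow in CPython C API terms; the Ctx
fields and queue_error_for_main_thread() are hypothetical, purely to
illustrate the callback -> errback -> main-thread percolation:

    #include <Python.h>

    typedef struct {
        PyObject *func, *args, *errback;   /* errback may be NULL */
    } Ctx;

    /* Hypothetical helper: hands the error off to the main thread
       (and takes ownership of 'value'). */
    void queue_error_for_main_thread(Ctx *c, PyObject *value);

    static void run_callback(Ctx *c)
    {
        PyObject *type = NULL, *value = NULL, *tb = NULL;
        PyObject *r = PyObject_CallObject(c->func, c->args);
        if (r != NULL) {
            Py_DECREF(r);                 /* successful callback completion */
            return;
        }
        PyErr_Fetch(&type, &value, &tb);  /* capture the error details */
        PyErr_NormalizeException(&type, &value, &tb);
        if (c->errback != NULL) {
            r = PyObject_CallFunctionObjArgs(c->errback, value, NULL);
            if (r != NULL) {              /* successful errback completion */
                Py_DECREF(r);
                Py_XDECREF(type); Py_XDECREF(value); Py_XDECREF(tb);
                return;
            }
            PyErr_Clear();                /* errback itself failed */
        }
        queue_error_for_main_thread(c, value);
        Py_XDECREF(type);
        Py_XDECREF(tb);
    }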

This should make the behavior of async

Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Thu, Mar 14, 2013 at 02:30:14PM -0700, Trent Nelson wrote:
> Then it dawned on me to just add the snapshot/rollback stuff to
> normal Context objects.  In retrospect, it's silly I didn't think of
> this in the first place -- the biggest advantage of the Context
> abstraction is that it's thread-local, but not bindingly so (as in,
> it'll only ever run on one thread at a time, but it doesn't matter
> which one, which is essential, because the ).
> 
> Once I switched ...

$10 if you can guess when I took a break for lunch.

"but it doesn't matter which one, which is essential, because
there are no guarantees with regards to which thread runs which
context."

Is along the lines of what I was going to say.

Trent.


[Python-Dev] About issue 6560

2013-03-14 Thread Ani Sinha
Hi:

I was looking into a mechanism to get the aux fields from recvmsg() in
Python and I came across this issue. Looks like this feature was added
in Python 3.3. Is there any reason why this feature was not added for
Python 2.7? I am now trying to backport the patch to Python 2.7.

Any insight into this would be appreciated.

Thanks
ani


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Thu, Mar 14, 2013 at 12:59:57PM -0700, Stefan Ring wrote:
> > Yup, in fact, if I hadn't come up with the __read[gf]sdword() trick,
> > my only other option would have been TLS (or the GetCurrentThreadId
> > /pthread_self() approach in the presentation).  TLS is fantastic,
> > and it's definitely an intrinsic part of the solution (the "Y" part
> > of "if we're a parallel thread, do Y"), but it's definitely more
> > costly than a simple FS/GS register read.
> 
> I think you should be able to just take the address of a static
> __thread variable to achieve the same thing in a more portable way.

Sure, but, uh, that's kinda trivial in comparison to all the wildly
unportable Windows-only functionality I'm using to achieve all of
this at the moment :-)

For the record, here are all the Windows calls I'm using that have
no *direct* POSIX equivalent:

Interlocked singly-linked lists (see the usage sketch after this
list):
- InitializeSListHead()
- InterlockedFlushSList()
- QueryDepthSList()
- InterlockedPushEntrySList()
- InterlockedPushListSList()
- InterlockedPopEntrySList()

Synchronisation and concurrency primitives:
- Critical sections
- InitializeCriticalSectionAndSpinCount()
- EnterCriticalSection()
- LeaveCriticalSection()
- TryEnterCriticalSection()
- Slim read/writer locks (some pthread implementations have
  rwlocks)*:
- InitializeSRWLock()
- AcquireSRWLockShared()
- AcquireSRWLockExclusive()
- ReleaseSRWLockShared()
- ReleaseSRWLockExclusive()
- TryAcquireSRWLockExclusive()
- TryAcquireSRWLockShared()
- One-time initialization:
- InitOnceBeginInitialize()
- InitOnceComplete()
- Generic event, signalling and wait facilities:
- CreateEvent()
- SetEvent()
- WaitForSingleObject()
- WaitForMultipleObjects()
- SignalObjectAndWait()

Native thread pool facilities:
- TrySubmitThreadpoolCallback()
- StartThreadpoolIo()
- CloseThreadpoolIo()
- CancelThreadpoolIo()
- DisassociateCurrentThreadFromCallback()
- CallbackMayRunLong()
- CreateThreadpoolWait()
- SetThreadpoolWait()

Memory management:
- HeapCreate()
- HeapAlloc()
- HeapDestroy()

Structured Exception Handling (#ifdef Py_DEBUG):
- __try/__except

Sockets:
- ConnectEx()
- AcceptEx()
- WSAEventSelect(FD_ACCEPT)
- DisconnectEx(TF_REUSE_SOCKET)
- Overlapped WSASend()
- Overlapped WSARecv()
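
As promised above, a hedged usage sketch of the interlocked
singly-linked list APIs (Windows-only; note the alignment rules,
which are part of why portable replacements are non-trivial):

    #include <windows.h>

    /* SLIST_ENTRY must be the first member so the popped entry can be
       cast straight back to the containing item; items should come
       from _aligned_malloc(size, MEMORY_ALLOCATION_ALIGNMENT). */
    typedef struct WorkItem {
        SLIST_ENTRY entry;
        int         payload;
    } WorkItem;

    static SLIST_HEADER work_queue;

    void queue_init(void)
    {
        InitializeSListHead(&work_queue);
    }

    void queue_push(WorkItem *w)     /* lock-free, safe from any thread */
    {
        InterlockedPushEntrySList(&work_queue, &w->entry);
    }

    WorkItem *queue_pop(void)        /* returns NULL when empty */
    {
        return (WorkItem *)InterlockedPopEntrySList(&work_queue);
    }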


Don't get me wrong, I grew up with UNIX and love it as much as the
next guy, but you can't deny the usefulness of Windows' facilities
for writing high-performance, multi-threaded IO code.  It's decades
ahead of POSIX.  (Which is also why it bugs me when I see select()
being used on Windows, or IOCP being used as if it were a poll-type
"generic IO multiplexor" -- that's like having a Ferrari and speed
limiting it to 5mph!)

So, before any of this has a chance of working on Linux/BSD, a lot
more scaffolding will need to be written to provide the things we
get for free on Windows (threadpools being the biggest freebie).


Trent.


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Martin v. Löwis

On 14.03.13 11:23, Trent Nelson wrote:

ARM CPUs don't have segment registers because they have a simpler
addressing model. The register CP15 came up after a couple of Google
searches.


 Noted, thanks!

 Yeah that's my general sentiment too.  I'm definitely curious to see
 if other ISAs offer similar facilities (Sparc, IA64, POWER etc), but
 the hierarchy will be x86/x64 > ARM > * for the foreseeable future.


Most (in particular the RISC ones) do have a general-purpose register
reserved for TLS. For ARM, the interesting thing is that CP15 apparently
is not available on all ARM implementations, and Linux then emulates it
on processors that don't have it (by handling the trap), which is
costly. Additionally, it appears that Android fails to provide that
emulation (in some versions, on some processors), so that seems to be
tricky ground.


 Porting the Py_PXCTX part is trivial compared to the work that is
 going to be required to get this stuff working on POSIX where none
 of the sublime Windows concurrency, synchronisation and async IO
 primitives exist.


I couldn't understand from your presentation why this is essential
to your approach. IIUC, you are "just" relying on the OS providing
a thread pool (and the sublime concurrency and synchronization
routines are nothing more than that, ISTM). Implementing a thread
pool on top of select/poll/kqueue seems straightforward.

Regards,
Martin




Re: [Python-Dev] About issue 6560

2013-03-14 Thread Martin v. Löwis

On 14.03.13 15:15, Ani Sinha wrote:

I was looking into a mechanism to get the aux fields from recvmsg() in
Python and I came across this issue. Looks like this feature was added
in Python 3.3. Is there any reason why this feature was not added for
Python 2.7?


Most certainly: Python 2.7 (and thus Python 2) is feature-frozen; no
new features can be added to it. People wanting new features need to
port to Python 3.

Regards,
Martin



Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Thu, Mar 14, 2013 at 03:50:27PM -0700, "Martin v. Löwis" wrote:
> On 14.03.13 12:59, Stefan Ring wrote:
> > I think you should be able to just take the address of a static
> > __thread variable to achieve the same thing in a more portable way.
> 
> That assumes that the compiler supports __thread variables, which
> isn't that portable in the first place.

FWIW, I make extensive use of __declspec(thread).  I'm aware of GCC
and Clang's __thread alternative.  No idea what IBM xlC, Sun Studio
and others offer, if anything.

Trent.


Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Martin v. Löwis

On 14.03.13 12:59, Stefan Ring wrote:

I think you should be able to just take the address of a static
__thread variable to achieve the same thing in a more portable way.


That assumes that the compiler supports __thread variables, which
isn't that portable in the first place.

Regards,
Martin



Re: [Python-Dev] Slides from today's parallel/async Python talk

2013-03-14 Thread Trent Nelson
On Thu, Mar 14, 2013 at 03:56:33PM -0700, "Martin v. Löwis" wrote:
> On 14.03.13 11:23, Trent Nelson wrote:
> >  Porting the Py_PXCTX part is trivial compared to the work that is
> >  going to be required to get this stuff working on POSIX where none
> >  of the sublime Windows concurrency, synchronisation and async IO
> >  primitives exist.
> 
> I couldn't understand from your presentation why this is essential
> to your approach. IIUC, you are "just" relying on the OS providing
> a thread pool, (and the sublime concurrency and synchronization
> routines are nothing more than that, ISTM).

Right, there's nothing Windows* does that can't be achieved on
Linux/BSD; it'll just take more scaffolding (i.e. we'll need to
manage our own thread pool at the very least).

[*]: actually, the interlocked singly-linked list stuff concerns me;
the API seems straightforward enough but the implementation becomes
deceptively complex once you factor in the ABA problem.  (I'm not
aware of a portable open source alternative for that stuff.)

> Implementing a thread pool on top of select/poll/kqueue seems
> straight-forward.

Nod, that's exactly what I've got in mind.  Spin up a bunch of
threads that sit there and call poll/kqueue in an endless loop.
That'll work just fine for Linux/BSD/OSX.
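
A minimal sketch of that scaffolding (Linux flavor, using epoll;
kqueue would be analogous).  Everything here is illustrative: real
code would need work distribution, shutdown and error handling.

    #include <pthread.h>
    #include <sys/epoll.h>

    #define NEVENTS 64

    static int epfd;   /* shared epoll instance for the pool */

    static void *io_worker(void *arg)
    {
        struct epoll_event events[NEVENTS];
        for (;;) {
            int i, n = epoll_wait(epfd, events, NEVENTS, -1);
            for (i = 0; i < n; i++) {
                /* dispatch events[i].data.ptr to its owning context */
            }
        }
        return NULL;
    }

    int start_io_pool(int nthreads)   /* e.g. one thread per CPU */
    {
        int i;
        if ((epfd = epoll_create1(0)) < 0)
            return -1;
        for (i = 0; i < nthreads; i++) {
            pthread_t t;
            if (pthread_create(&t, NULL, io_worker, NULL) != 0)
                return -1;
            pthread_detach(t);
        }
        return 0;
    }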

Actually, what's really interesting is the new registered IO
facilities in Windows 8/2012.  The Microsoft recommendation for
achieving the ultimate performance (least amount of jitter, lowest
latency, highest throughput) is to do something like this:

    while (1) {
        if (!DequeueCompletionRequests(...)) {
            YieldProcessor();
            continue;
        } else {
            /* Handle requests */
        }
    }

That pattern looks a lot more like what you'd do on Linux/BSD (spin
up a thread per CPU and call epoll/kqueue endlessly) than any of the
previous Windows IO patterns.


Trent.


Re: [Python-Dev] About issue 6560

2013-03-14 Thread Terry Reedy

On 3/14/2013 6:48 PM, "Martin v. Löwis" wrote:

On 14.03.13 15:15, Ani Sinha wrote:

I was looking into a mechanism to get the aux fields from recvmsg() in
Python and I came across this issue. Looks like this feature was added
in Python 3.3. Is there any reason why this feature was not added for
Python 2.7?


Most certainly: Python 2.7 (and thus Python 2) is feature-frozen; no


As are 3.2 and now 3.3. Every version is feature-frozen when released.
Bugfix releases only contain bugfixes.



new features can be added to it. People wanting new features need to
port to Python 3.


In particular 3.3.

--
Terry Jan Reedy




[Python-Dev] Matching __all__ to doc: bugfix or enhancement?

2013-03-14 Thread Terry Reedy

The timeit doc describes four public attributes.
The current timeit.__all__ only lists one.
http://bugs.python.org/issue17414
proposes to expand __all__ to include all four:
-__all__ = ["Timer"]
+__all__ = ["Timer", "timeit", "repeat", "default_timer"]

The effect of the change is:
a) help(timeit) will mention the three functions as well as the class;
b) IDLE's attribute completion box* will list all four instead of just
Timer;
c) unknown other users of .__all__ will see the expanded list, for better
or worse.


* Typing 'xxx.' and either waiting or typing Ctrl-space brings up a
listbox of attributes to select from.


Is the code change an all-version bugfix or a default-only enhancement?
I can see it both ways, but a decision is required to act.

PS: I think the devguide should gain a new 'Behavior versus Enhancement' 
section after the current "11.1.2. Type" to clarify issues like this.


--
Terry Jan Reedy



Re: [Python-Dev] Matching __all__ to doc: bugfix or enhancement?

2013-03-14 Thread Fred Drake
On Thu, Mar 14, 2013 at 9:33 PM, Terry Reedy  wrote:
> Is the code change an all-version bugfix or a default-only enhancement?
> I can see it both ways, but a decision is required to act.

This is actually backward-incompatible, so should not be considered a
simple bugfix.  If determined to be desirable, it should not be applied
to any version before 3.4.


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] Matching __all__ to doc: bugfix or enhancement?

2013-03-14 Thread Eli Bendersky
On Thu, Mar 14, 2013 at 6:33 PM, Terry Reedy  wrote:

> The timeit doc describes four public attributes.
> The current timeit.__all__ only lists one.
> http://bugs.python.org/issue17414
> proposes to expand __all__ to include all four:
> -__all__ = ["Timer"]
> +__all__ = ["Timer", "timeit", "repeat", "default_timer"]
>
> The effect of the change is:
> a) help(timeit) will mention the three functions as well as the class;
> b) IDLE's attribute completion box* will list all four instead of just
> Timer;
> c) unknown other users of .__all__ will see the expanded list, for
> better or worse.
>
>
Another effect is that existing code that does:

from timeit import *

May break. The above may not be the recommended best practice in Python,
but it's perfectly valid and widely used.

Eli


Re: [Python-Dev] Matching __all__ to doc: bugfix or enhancement?

2013-03-14 Thread Guido van Rossum
So it's a new feature, albeit a small one. I do see that it shouldn't
be backported, but I don't see any worries about doing it in 3.4.
Adding new functions/classes/constants to modules happens all the
time, and we never give a second thought to users of import *. :-)

On Thu, Mar 14, 2013 at 6:54 PM, Eli Bendersky  wrote:
>
>
>
> On Thu, Mar 14, 2013 at 6:33 PM, Terry Reedy  wrote:
>>
>> The timeit doc describes four public attributes.
>> The current timeit.__all__ only lists one.
>> http://bugs.python.org/issue17414
>> proposes to expand __all__ to include all four:
>> -__all__ = ["Timer"]
>> +__all__ = ["Timer", "timeit", "repeat", "default_timer"]
>>
>> The effect of the change is:
>> a) help(timeit) will mention the three functions as well as the class;
>> b) IDLE's attribute completion box* will list all four instead of just
>> Timer;
>> c) unknown other users of .__all__ will see the expanded list, for
>> better or worse.
>>
>
> Another effect is that existing code that does:
>
> from timeit import *
>
> May break. The above may not be the recommended best practice in Python, but
> it's perfectly valid and widely used.
>
> Eli
>
>



--
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Matching __all__ to doc: bugfix or enhancement?

2013-03-14 Thread Eli Bendersky
On Thu, Mar 14, 2013 at 9:15 PM, Guido van Rossum  wrote:

> So it's a new feature, albeit a small one. I do see that it shouldn't
> be backported, but I don't see any worries about doing it in 3.4.
> Adding new functions/classes/constants to modules happens all the
> time, and we never give a second thought to users of import *. :-)
>
>
Oh yes, I agree there should be no problem in the default branch. My
comment was mainly aimed at backporting it; I should've made it clearer.

Eli



> On Thu, Mar 14, 2013 at 6:54 PM, Eli Bendersky  wrote:
> >
> >
> >
> > On Thu, Mar 14, 2013 at 6:33 PM, Terry Reedy  wrote:
> >>
> >> The timeit doc describes four public attributes.
> >> The current timeit.__all__ only lists one.
> >> http://bugs.python.org/issue17414
> >> proposes to expand __all__ to include all four:
> >> -__all__ = ["Timer"]
> >> +__all__ = ["Timer", "timeit", "repeat", "default_timer"]
> >>
> >> The effect of the change is:
> >> a) help(timeit) will mention the three functions as well as the class;
> >> b) IDLE's attribute completion box* will list all four instead of
> >> just Timer;
> >> c) unknown other users of .__all__ will see the expanded list, for
> >> better or worse.
> >>
> >
> > Another effect is that existing code that does:
> >
> > from timeit import *
> >
> > May break. The above may not be the recommended best practice in Python,
> but
> > it's perfectly valid and widely used.
> >
> > Eli
> >
> >
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>


[Python-Dev] Followup - Re: Bad python 2.5 build on OSX 10.8 mountain lion

2013-03-14 Thread Ned Deily
Way back on 2012-10-05 23:45:11 GMT, I wrote:

> Ned Deily wrote:
> > Stefan Krah wrote:
> > > Ned Deily wrote:
> > > > > Forgot the link...
> > > > > http://code.google.com/p/googleappengine/issues/detail?id=7885
> > > > > On Monday, October 1, 2012, Guido van Rossum wrote:
> > > > > > As discussed here, the python 2.5 binary distributed by Apple on
> > > > > > mountain lion is broken. Could someone file an official complaint?
> > > > I've filed a bug against 10.8 python2.5.  The 10.8 versions of Apple's
> > > > pythons are compiled with clang and we did see some sign extension
> > > > issues with ctypes.  The 10.7 version of Apple's python2.5 is compiled
> > > > with llvm-gcc and handles 2**31 correctly.
> > > Yes, this looks like http://bugs.python.org/issue11149 . 
> > Ah, right, thanks.  I've updated the Apple issue accordingly.
> 
> Update: the bug I filed has been closed as a duplicate of #11932488 
> which apparently at the moment is still open.  No other information is 
> available.

FYI, today Apple finally released OS X 10.8.3, the next maintenance 
release of Mountain Lion, and it does include a recompiled version of 
Python 2.5.6 that appears to solve the sign-extension problem:
2**31-1 is now 2147483647L.
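
For context, issue 11149 is (IIUC) about clang builds needing -fwrapv:
CPython's integer code relies on signed wraparound, which is undefined
behavior that clang optimizes on.  A hedged, generic illustration of
that class of bug, not Apple's exact failure:

    /* Overflow checks in this style assume wraparound, which is
       undefined for signed ints; without -fwrapv a compiler may
       legally fold the check to "false" and break the caller. */
    int add_would_overflow(int a, int b)
    {
        int sum = a + b;               /* UB if it overflows */
        return (b > 0) ? (sum < a) : (sum > a);
    }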

-- 
 Ned Deily,
 [email protected]
