Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread cr0hn
Thank you for your responses.

The scenario (which I forgot to mention in my first post): I'm trying to improve
I/O access (disk/network...).

So, if a Python thread maps 1:1 to an OS thread, and the main problem (as I
understood it) is the cost of context switching between threads/coroutines...
this raises a new question for me:

If I only run a process with 1 thread (the default state), will the GIL still
switch context once the thread's ticks/switch interval are spent? Or does it
behave like a plain run until the program ends?

Thinking about that, I suppose that if the situation is 1 process <-> 1 thread,
with no context switching, then obviously the best approach for
high-performance network I/O is to create coroutines rather than threads,
right?

Am I wrong?
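(For reference: in Python 3 the old per-thread "ticks" were replaced by a
time-based GIL switch interval, which you can inspect. A minimal sketch; with
a single thread there is nothing to switch to, so the program just runs:)

    import sys

    # CPython 3 releases the GIL every "switch interval" (5 ms by
    # default), but only if another thread is waiting for it; with a
    # single thread the program runs straight through.
    print(sys.getswitchinterval())   # 0.005 by default
    # sys.setswitchinterval(0.01)    # tunable, but rarely worth touching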


On 19 April 2016 at 0:54:28, Guido van Rossum (gu...@python.org) wrote:

On Mon, Apr 18, 2016 at 1:26 PM, Imran Geriskovan wrote:
A) Python threads are not real threads. It multiplexes "Python Threads"
on a single OS thread. (Guido, can you correct me if I'm wrong,
and can you provide some info on multiplexing/context switching of
"Python Threads"?)

Sorry, you are wrong. Python threads map 1:1 to OS threads. They are as real as 
threads come (the GIL notwithstanding).

--
--Guido van Rossum (python.org/~guido)
---
Daniel García (cr0hn)
Security researcher and ethical hacker

Personal site: http://cr0hn.com
Linkedin: https://www.linkedin.com/in/garciagarciadaniel 
Company: http://abirtone.com 
Twitter: @ggdaniel 



Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread Guido van Rossum
On Mon, Apr 18, 2016 at 1:26 PM, Imran Geriskovan <imran.gerisko...@gmail.com> wrote:

> A) Python threads are not real threads. It multiplexes "Python Threads"
> on a single OS thread. (Guido, can you correct me if I'm wrong,
> and can you provide some info on multiplexing/context switching of
> "Python Threads"?)
>

Sorry, you are wrong. Python threads map 1:1 to OS threads. They are as
real as threads come (the GIL notwithstanding).

-- 
--Guido van Rossum (python.org/~guido)


Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread Imran Geriskovan
>>> I don't think you need the threads.
>>> 1. If your tasks are I/O bound, coroutines are a safer way to do things,
>>> and probably even have better performance;
>>
>> Thread vs Coroutine context switching is an interesting topic.
>> Do you have any data for comparison?

> My 2cts:
> OS native (= non-green) threads are an OS scheduler driven, preemptive
> multitasking approach, necessarily with context switching overhead that
> is higher than a cooperative multitasking approach like asyncio event loop.
> Note: that is Twisted, not asyncio, but the latter should behave the
> same qualitatively.
> /Tobias

Linux OS threads come with an 8 MB (default, mostly virtual) stack per
thread, plus the switching costs you mentioned.
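(That 8 MB is just the default soft limit; a minimal sketch of shrinking it
from Python, using the standard threading module:)

    import threading

    # Applies to threads created *after* this call; the minimum and
    # granularity are platform-dependent (>= 32 KiB on most systems).
    threading.stack_size(512 * 1024)   # 512 KiB instead of ~8 MB

    t = threading.Thread(target=lambda: None)
    t.start()
    t.join()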

A) Python threads are not real threads. It multiplexes "Python Threads"
on a single OS thread. (Guido, can you correct me if I'm wrong,
and can you provide some info on multiplexing/context switching of
"Python Threads"?)

B) Whereas asyncio multiplexes coroutines on a "Python Thread"?

The question is "Which one is more effective?". The answer is of
course dependent on the use case.

However, as a heavy user of coroutines, I'm beginning to think about going
back to "Python Threads". Anyway, that's a personal choice.

Now let's clarify the advantages and disadvantages of A and B.

Regards,
Imran


Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread Tobias Oberstein

On 18.04.2016 at 21:33, Imran Geriskovan wrote:

On 4/18/16, Gustavo Carneiro  wrote:

I don't think you need the threads.
1. If your tasks are I/O bound, coroutines are a safer way to do things,
and probably even have better performance;


Thread vs Coroutine context switching is an interesting topic.
Do you have any data for comparison?


My 2cts:

OS native (= non-green) threads are an OS scheduler driven, preemptive 
multitasking approach, necessarily with context switching overhead that 
is higher than a cooperative multitasking approach like asyncio event loop.


E.g. context switching with threads involves saving and restoring the
whole CPU core register set. OS native threads also involve bouncing
back and forth between kernel- and userspace.


Practical evidence: name one high-performance network server that is
using threads (and only threads), and not some event loop thing ;)


You want N threads/processes, where N is related to the number of cores
and/or the effective IO concurrency, _and_ each thread/process runs an event
loop thing. And because of the GIL, you want processes, not threads, on
(C)Python.


The effective IO concurrency depends on the number of IO queues your
hardware supports (the NICs or the storage devices). The IO queues should
also have affinity to the (nearest) CPU core on an SMP system.
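A minimal sketch of that layout (one worker process per core, each running
its own asyncio event loop; everything here is illustrative, not from a
particular framework):

    import asyncio
    import multiprocessing
    import os

    async def serve():
        # Placeholder for the real per-process work: accept
        # connections, run protocol handlers, etc.
        await asyncio.sleep(1)
        print("event loop finished in pid", os.getpid())

    def worker():
        # One event loop per process: no GIL contention between
        # workers, cooperative multitasking within each.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        loop.run_until_complete(serve())
        loop.close()

    if __name__ == "__main__":
        procs = [multiprocessing.Process(target=worker)
                 for _ in range(os.cpu_count() or 1)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

In practice each worker would also bind its listening socket with
SO_REUSEPORT (or inherit a pre-bound socket) so the kernel spreads incoming
connections across the workers.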


For networking, I once did some experiments on how far Python can go. Here
is Python (PyPy) doing 630k HTTP requests/sec (12.6 GB/sec) using 40 cores:


https://github.com/crossbario/crossbarexamples/tree/master/benchmark/web

Note: that is Twisted, not asyncio, but the latter should behave the 
same qualitatively.


Cheers,
/Tobias



Regards,
Imran





Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread Imran Geriskovan
On 4/18/16, Gustavo Carneiro  wrote:
> I don't think you need the threads.
> 1. If your tasks are I/O bound, coroutines are a safer way to do things,
> and probably even have better performance;

Thread vs Coroutine context switching is an interesting topic.
Do you have any data for comparison?
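(In case anyone wants rough numbers, here is a minimal, admittedly crude
sketch: a two-thread ping-pong versus bare coroutine yields. The two sides
are not perfectly comparable, but the order-of-magnitude gap shows up:)

    import asyncio
    import threading
    import time

    N = 100000

    def thread_pingpong(n):
        # Two threads forced to alternate via events: every iteration
        # pays for OS-level context switches.
        a, b = threading.Event(), threading.Event()
        def ping():
            for _ in range(n):
                a.wait(); a.clear(); b.set()
        t = threading.Thread(target=ping)
        t.start()
        start = time.perf_counter()
        for _ in range(n):
            a.set(); b.wait(); b.clear()
        t.join()
        return time.perf_counter() - start

    async def coro_yields(n):
        # asyncio.sleep(0) yields to the event loop: one coroutine
        # switch per call, entirely in userspace.
        start = time.perf_counter()
        for _ in range(n):
            await asyncio.sleep(0)
        return time.perf_counter() - start

    loop = asyncio.get_event_loop()
    print("threads:    %.3f s" % thread_pingpong(N))
    print("coroutines: %.3f s" % loop.run_until_complete(coro_yields(N)))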

Regards,
Imran


Re: [python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread Gustavo Carneiro
I don't think you need the threads.

1. If your tasks are I/O bound, coroutines are a safer way to do things,
and probably even have better performance;

2. If your tasks are CPU bound, only multiple processes will help; multiple
(Python) threads do not help at all.  Only in the special case where the
CPU work is mostly done via a C library[*] do threads help.

I would recommend using multiple threads only if interacting with 3rd-party
code that is I/O bound but is not written with an asynchronous API, such as
the requests library, selenium, etc.  But in this case, probably using
loop.run_in_executor() is a simpler solution.

[*] and a C API wrapped in such a way that it does a lot of work with few
Python calls, plus it releases the GIL, so don't go thinking that a simple
scalar math function call can take advantage of multithreading.
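A minimal sketch of that last suggestion, wrapping blocking requests calls
(the URLs are just placeholders):

    import asyncio
    import requests  # blocking, synchronous HTTP library

    async def fetch(loop, url):
        # The blocking call runs in the loop's default thread pool;
        # the coroutine suspends without blocking the event loop.
        response = await loop.run_in_executor(None, requests.get, url)
        return response.status_code

    loop = asyncio.get_event_loop()
    print(loop.run_until_complete(asyncio.gather(
        fetch(loop, "http://example.com"),
        fetch(loop, "http://example.org"))))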


On 18 April 2016 at 19:33, cr0hn cr0hn  wrote:

> Hi all,
>
> It's the first time I write to this list. Sorry if it's not the best place
> for this question.
>
> After reading the asyncio documentation, PEPs, Guido/Jesse/David Beazley
> articles/talks, etc., I developed a PoC library that mixes Process +
> Threads + Asyncio Tasks, following a scheme like this diagram:
>
> main -> Process 1 -> Thread 1.1 -> Task 1.1.1
>                                 -> Task 1.1.2
>                                 -> Task 1.1.3
>
>                   -> Thread 1.2 -> Task 1.2.1
>                                 -> Task 1.2.2
>                                 -> Task 1.2.3
>
>      -> Process 2 -> Thread 2.1 -> Task 2.1.1
>                                 -> Task 2.1.2
>                                 -> Task 2.1.3
>
>                   -> Thread 2.2 -> Task 2.2.1
>                                 -> Task 2.2.2
>                                 -> Task 2.2.3
>
> In my local tests, this approach appears to improve (and simplify)
> concurrency/parallelism for some tasks but, before releasing the library on
> GitHub, I don't know whether my approach is wrong, and I would appreciate
> your opinion.
>
> Thank you very much for your time.
>
> Regards!
>



-- 
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert


[python-tulip] Process + Threads + asyncio... has sense?

2016-04-18 Thread cr0hn cr0hn
Hi all,

It's the first time I write to this list. Sorry if it's not the best place
for this question.

After reading the asyncio documentation, PEPs, Guido/Jesse/David Beazley
articles/talks, etc., I developed a PoC library that mixes Process +
Threads + Asyncio Tasks, following a scheme like this diagram:

main -> Process 1 -> Thread 1.1 -> Task 1.1.1
                                -> Task 1.1.2
                                -> Task 1.1.3

                  -> Thread 1.2 -> Task 1.2.1
                                -> Task 1.2.2
                                -> Task 1.2.3

     -> Process 2 -> Thread 2.1 -> Task 2.1.1
                                -> Task 2.1.2
                                -> Task 2.1.3

                  -> Thread 2.2 -> Task 2.2.1
                                -> Task 2.2.2
                                -> Task 2.2.3
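Roughly, the layering looks like this (a minimal sketch of the idea only,
not the PoC library itself; all names are illustrative):

    import asyncio
    import multiprocessing
    import threading

    async def task(name):
        await asyncio.sleep(0.1)          # stand-in for real I/O work
        print("done:", name)

    def thread_worker(pid, tid):
        # Each thread owns a private event loop that runs its tasks.
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        coros = [task("%d.%d.%d" % (pid, tid, i)) for i in (1, 2, 3)]
        loop.run_until_complete(asyncio.wait(coros))
        loop.close()

    def process_worker(pid):
        threads = [threading.Thread(target=thread_worker, args=(pid, t))
                   for t in (1, 2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    if __name__ == "__main__":
        procs = [multiprocessing.Process(target=process_worker, args=(p,))
                 for p in (1, 2)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()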

In my local tests, this approach appears to improve (and simplify)
concurrency/parallelism for some tasks but, before releasing the library on
GitHub, I don't know whether my approach is wrong, and I would appreciate
your opinion.

Thank you very much for your time.

Regards!


Re: [python-tulip] A mix of zip and select to iterate over multiple asynchronous iterator ?

2016-04-18 Thread Guido van Rossum
What people typically do when handling multiple events is to have separate
event handlers for each event type. You can do this using callbacks
(typically by using the Protocol/Transport convention), or you can have
separate loops that each use `await`, `async for` (or some 3.4-compatible
alternative spelling) to process the event streams. The different callbacks
or loops then share state via a shared object. The asyncio event loop
guarantees that the other callback doesn't run until the first callback
returns; with `await` it guarantees that it won't switch coroutines between
`await` calls.
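A minimal sketch of the second pattern: two consumer loops sharing state
through a plain object, where the queue-backed sources are stand-ins for
real event streams:

    import asyncio

    class QueueSource:
        # Stand-in event source: async-iterates over an asyncio.Queue
        # and stops when it sees None.
        def __init__(self, queue):
            self._queue = queue
        def __aiter__(self):
            return self
        async def __anext__(self):
            item = await self._queue.get()
            if item is None:
                raise StopAsyncIteration
            return item

    class State:
        def __init__(self):
            self.last_a = self.last_b = None

    async def consume(source, state, attr):
        # One loop per event type; both share `state`.  The event loop
        # only switches between them at await points, so no locks needed.
        async for event in source:
            setattr(state, attr, event)
            print(attr, "=", event)

    loop = asyncio.get_event_loop()
    qa, qb = asyncio.Queue(), asyncio.Queue()
    for item in (1, 2, None):
        qa.put_nowait(item)
    for item in ("x", None):
        qb.put_nowait(item)
    state = State()
    loop.run_until_complete(asyncio.gather(
        consume(QueueSource(qa), state, "last_a"),
        consume(QueueSource(qb), state, "last_b")))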

On Mon, Apr 18, 2016 at 5:59 AM, Julien Palard wrote:

> Hi there,
>
> For a pet project of mine (https://github.com/julienpalard/theodoreserver)
> I encountered a problem:
>
> I had two asynchronous iterables (easily iterable via a simple
> "async for ..."); without knowledge of "theodore" you can simply imagine
> listening for events from different sources.
>
> So my need is what "select", "poll", "epoll", "kqueue", and "libevent"
> solve for sockets: I need to listen to multiple sources and be notified
> when one has data.
>
> So I wrote a "zip":
> https://github.com/JulienPalard/TheodoreServer/blob/master/asynczip.py
> which has two modes: one named "SOON", behaving more like select, and one
> named "PAIRED", behaving more like "zip".
>
> Given n asynchronous iterables, you can iterate over them using:
>
>
>     async for results in asynczip.AsyncZip(*iterables)
>
> It's really like asyncio.wait but for an iterable, with `wait`'s flag
> `FIRST_COMPLETED` being my "SOON" flag and its flag `ALL_COMPLETED` being
> my "PAIRED" flag (should I rename mine?). The question is: am I walking
> the wrong way, and is there a simple way to do this that I completely
> missed? Or is my tool really useful and should it be shared? I'm too new
> to asyncio to judge that, so I'll listen to you! Bests,
>
> --
> Julien Palard
>



-- 
--Guido van Rossum (python.org/~guido)


[python-tulip] aiomas 1.0.0 released

2016-04-18 Thread Stefan Scherfke
Hi all,

I just released version 1.0.0 of aiomas (https://aiomas.readthedocs.org/)
– a library for networking, RPC and multi-agent systems based on
asyncio.

Its basic set of features:

- Three layers of abstraction around raw TCP / Unix domain sockets:

  - Request-reply channels

  - Remote-procedure calls (RPC)

  - Agents and containers

- TLS support for authorization and encrypted communication.

- Interchangeable and extensible codecs: JSON and MsgPack (the latter
  optionally compressed with Blosc) are built-in.  You can add custom
  codecs or write (de)serializers for your own objects to extend
  a codec.

- Deterministic, emulated sockets: A LocalQueue transport lets you send
  and receive messages in a deterministic and reproducible order within
  a single process. This helps testing and debugging distributed
  algorithms.

The package is released under the MIT license.  It requires Python 3.4
and above and runs on Linux, OS X, and Windows.

Cheers,
Stefan



[python-tulip] A mix of zip and select to iterate over multiple asynchronous iterator ?

2016-04-18 Thread Julien Palard

Hi there,

For a pet project of mine (https://github.com/julienpalard/theodoreserver) I
encountered a problem:

I had two asynchronous iterables (easily iterable via a simple
"async for ..."); without knowledge of "theodore" you can simply imagine
listening for events from different sources.

So my need is what "select", "poll", "epoll", "kqueue", and "libevent" solve
for sockets: I need to listen to multiple sources and be notified when one
has data.

So I wrote a "zip":
https://github.com/JulienPalard/TheodoreServer/blob/master/asynczip.py
which has two modes: one named "SOON", behaving more like select, and one
named "PAIRED", behaving more like "zip".

Given n asynchronous iterables, you can iterate over them using:


    async for results in asynczip.AsyncZip(*iterables)

It's really like asyncio.wait but for an iterable, with `wait`'s flag
`FIRST_COMPLETED` being my "SOON" flag and its flag `ALL_COMPLETED` being my
"PAIRED" flag (should I rename mine?). The question is: am I walking the
wrong way, and is there a simple way to do this that I completely missed? Or
is my tool really useful and should it be shared? I'm too new to asyncio to
judge that, so I'll listen to you! Bests,
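(For comparison, a minimal sketch of the "SOON" idea built directly on
asyncio.wait with FIRST_COMPLETED; this is not the actual asynczip code, and
as a plain async generator it needs Python 3.6+, while on 3.5 it would be
written as a class with __aiter__/__anext__:)

    import asyncio

    async def merge_soon(*iterables):
        # Yield items from several async iterables as soon as any one
        # of them produces something; exhausted iterables are dropped.
        iterators = [it.__aiter__() for it in iterables]
        pending = {asyncio.ensure_future(it.__anext__()): it
                   for it in iterators}
        while pending:
            done, _ = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED)
            for fut in done:
                it = pending.pop(fut)
                try:
                    item = fut.result()
                except StopAsyncIteration:
                    continue            # this source is exhausted
                pending[asyncio.ensure_future(it.__anext__())] = it
                yield item

    # usage:
    #     async for item in merge_soon(source_a, source_b):
    #         ...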

-- 
Julien Palard