Re: [python-tulip] Process + Threads + asyncio... has sense?
Thank you for your responses. The scenario (I forgot to mention it in my first post): I'm trying to improve I/O access (disk/network...). So, if Python threads map 1:1 to OS threads, and the main problem (as I understand it) is the cost of context switching between threads/coroutines, this raises a new question for me: if I run a process with only 1 thread (the default state), will the GIL still switch context after the thread's interval is spent? Or does it behave like a plain run until the program ends?

Thinking about that, I suppose that if the situation is 1 process <-> 1 thread, with no context switching, then obviously the best approach for high-performance network I/O is to create coroutines and not threads, right? Or am I wrong?

On 19 April 2016 at 0:54:28, Guido van Rossum (gu...@python.org) wrote:

> On Mon, Apr 18, 2016 at 1:26 PM, Imran Geriskovan wrote:
>> A) Python threads are not real threads. It multiplexes "Python Threads"
>> on a single OS thread. (Guido, can you correct me if I'm wrong,
>> and can you provide some info on multiplexing/context switching of
>> "Python Threads"?)
>
> Sorry, you are wrong. Python threads map 1:1 to OS threads. They are as
> real as threads come (the GIL notwithstanding).
>
> --
> --Guido van Rossum (python.org/~guido)

---
Daniel García (cr0hn)
Security researcher and ethical hacker
Personal site: http://cr0hn.com
Linkedin: https://www.linkedin.com/in/garciagarciadaniel
Company: http://abirtone.com
Twitter: @ggdaniel
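To the question above: with a single thread the interpreter still performs its periodic GIL check, but there is no other thread to hand the GIL to, so execution proceeds essentially as a plain run. A minimal sketch (not from the thread) to inspect and tune the preemption interval, assuming CPython 3.2+ where the interval is time-based rather than tick-based:

```python
# Inspect CPython's GIL switch interval. With one thread, the
# interpreter merely *offers* to release the GIL every interval;
# with nothing to switch to, the same thread just keeps running.
import sys

interval = sys.getswitchinterval()  # default is 0.005 s (5 ms) in CPython
print(f"GIL switch interval: {interval} s")

# The interval is tunable; a single-threaded program is unaffected either way.
sys.setswitchinterval(0.001)
print(f"new interval: {sys.getswitchinterval()} s")
```

Note that before Python 3.2 the check was expressed in bytecode "ticks" (`sys.getcheckinterval()`), which may be what "thread ticks" above refers to.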
Re: [python-tulip] Process + Threads + asyncio... has sense?
On Mon, Apr 18, 2016 at 1:26 PM, Imran Geriskovan <imran.gerisko...@gmail.com> wrote:

> A) Python threads are not real threads. It multiplexes "Python Threads"
> on a single OS thread. (Guido, can you correct me if I'm wrong,
> and can you provide some info on multiplexing/context switching of
> "Python Threads"?)

Sorry, you are wrong. Python threads map 1:1 to OS threads. They are as real as threads come (the GIL notwithstanding).

--
--Guido van Rossum (python.org/~guido)
Re: [python-tulip] Process + Threads + asyncio... has sense?
>>> I don't think you need the threads.
>>> 1. If your tasks are I/O bound, coroutines are a safer way to do things,
>>> and probably even have better performance;
>>
>> Thread vs Coroutine context switching is an interesting topic.
>> Do you have any data for comparison?
>
> My 2cts:
> OS native (= non-green) threads are an OS-scheduler-driven, preemptive
> multitasking approach, necessarily with context switching overhead that
> is higher than a cooperative multitasking approach like the asyncio event loop.
> Note: that is Twisted, not asyncio, but the latter should behave the
> same qualitatively.
> /Tobias

Linux OS threads come with an 8MB stack per thread, plus the switching costs you mentioned.

A) Python threads are not real threads. It multiplexes "Python Threads" on a single OS thread. (Guido, can you correct me if I'm wrong, and can you provide some info on multiplexing/context switching of "Python Threads"?)

B) Whereas asyncio multiplexes coroutines on a "Python Thread"?

The question is "Which one is more effective?". The answer is of course dependent on the use case. However, as a heavy user of coroutines, I'm beginning to think about going back to "Python Threads". Anyway, that's a personal choice. Now let's clarify the advantages and disadvantages between A and B.

Regards, Imran
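One rough way to get comparison data for the question above is a ping-pong micro-benchmark: force N coroutine switches and N thread handoffs and time both. This is a sketch of my own (not from the thread); absolute numbers vary wildly by machine, but the coroutine side is typically much cheaper:

```python
# Compare the cost of N coroutine switches vs N thread handoffs.
import asyncio
import threading
import time

N = 10_000

# Coroutine switching: each `await asyncio.sleep(0)` yields to the loop.
async def coro_switches():
    for _ in range(N):
        await asyncio.sleep(0)

t0 = time.perf_counter()
asyncio.run(coro_switches())
coro_time = time.perf_counter() - t0

# Thread switching: two threads ping-pong via Events, forcing N handoffs.
ping, pong = threading.Event(), threading.Event()

def player(my_turn, their_turn):
    for _ in range(N // 2):
        my_turn.wait()
        my_turn.clear()
        their_turn.set()

a = threading.Thread(target=player, args=(ping, pong))
b = threading.Thread(target=player, args=(pong, ping))
t0 = time.perf_counter()
a.start(); b.start(); ping.set()
a.join(); b.join()
thread_time = time.perf_counter() - t0

print(f"coroutines: {coro_time:.4f}s, threads: {thread_time:.4f}s")
```

Note this measures switch overhead only; it says nothing about per-thread memory (the 8MB default stack is virtual address space, mostly unmapped until touched).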
Re: [python-tulip] Process + Threads + asyncio... has sense?
On 18.04.2016 at 21:33, Imran Geriskovan wrote:

> On 4/18/16, Gustavo Carneiro wrote:
>> I don't think you need the threads.
>> 1. If your tasks are I/O bound, coroutines are a safer way to do things,
>> and probably even have better performance;
>
> Thread vs Coroutine context switching is an interesting topic.
> Do you have any data for comparison?
>
> Regards, Imran

My 2cts:

OS native (= non-green) threads are an OS-scheduler-driven, preemptive multitasking approach, necessarily with context switching overhead that is higher than a cooperative multitasking approach like the asyncio event loop. E.g. context switching with threads involves saving and restoring the whole CPU core register set. OS native threads also involve bouncing back and forth between kernel- and userspace.

Practical evidence: name one high-performance network server that is using threads (and only threads), and not some event loop thing ;)

You want N threads/processes, where N is related to the number of cores and/or the effective IO concurrency, _and_ each thread/process runs an event loop thing. And because of the GIL, you want processes, not threads, on (C)Python. The effective IO concurrency depends on the number of IO queues your hardware supports (the NICs or the storage devices). The IO queues should also have affinity to the (nearest) CPU core on an SMP system.

For networking, I once did some experiments on how far Python can go. Here is Python (PyPy) doing 630k HTTP requests/sec (12.6 GB/sec) using 40 cores: https://github.com/crossbario/crossbarexamples/tree/master/benchmark/web

Note: that is Twisted, not asyncio, but the latter should behave the same qualitatively.

Cheers,
/Tobias
Re: [python-tulip] Process + Threads + asyncio... has sense?
On 4/18/16, Gustavo Carneiro wrote:

> I don't think you need the threads.
> 1. If your tasks are I/O bound, coroutines are a safer way to do things,
> and probably even have better performance;

Thread vs Coroutine context switching is an interesting topic. Do you have any data for comparison?

Regards, Imran
Re: [python-tulip] Process + Threads + asyncio... has sense?
I don't think you need the threads.

1. If your tasks are I/O bound, coroutines are a safer way to do things, and probably even have better performance;

2. If your tasks are CPU bound, only multiple processes will help; multiple (Python) threads do not help at all. Only in the special case where the CPU work is mostly done via a C library[*] do threads help.

I would recommend using multiple threads only if interacting with 3rd-party code that is I/O bound but is not written with an asynchronous API, such as the requests library, selenium, etc. But in this case, probably using asyncio.Loop.run_in_executor() is a simpler solution.

[*] and a C API wrapped in such a way that it does a lot of work with few Python calls, plus it releases the GIL; so don't go thinking that a simple scalar math function call can take advantage of multithreading.

On 18 April 2016 at 19:33, cr0hn wrote:

> Hi all,
>
> It's the first time I write to this list. Sorry if it's not the best place
> for this question.
>
> After reading the asyncio documentation, PEPs, Guido/Jesse/David Beazley
> articles/talks, etc., I developed a PoC library that mixes Processes +
> Threads + Asyncio Tasks, following a scheme like this diagram:
>
> main -> Process 1 -> Thread 1.1 -> Task 1.1.1
>                                 -> Task 1.1.2
>                                 -> Task 1.1.3
>
>                   -> Thread 1.2 -> Task 1.2.1
>                                 -> Task 1.2.2
>                                 -> Task 1.2.3
>
>         Process 2 -> Thread 2.1 -> Task 2.1.1
>                                 -> Task 2.1.2
>                                 -> Task 2.1.3
>
>                   -> Thread 2.2 -> Task 2.2.1
>                                 -> Task 2.2.2
>                                 -> Task 2.2.3
>
> In my local tests, this approach appears to improve (and simplify)
> concurrency/parallelism for some tasks but, before releasing the library
> on GitHub, I don't know if my approach is wrong and I would appreciate
> your opinion.
>
> Thank you very much for your time.
>
> Regards!

--
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert
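The run_in_executor() suggestion above can be sketched as follows. `blocking_fetch` is a hypothetical stand-in for a synchronous library call (e.g. requests); the point is that the blocking work runs in the default thread pool while the event loop stays free:

```python
# Wrap a blocking, synchronous call with loop.run_in_executor so the
# event loop can run other coroutines while it is in flight.
import asyncio
import time

def blocking_fetch(url):
    # Hypothetical placeholder for a blocking call such as requests.get().
    time.sleep(0.1)
    return f"response from {url}"

async def main():
    loop = asyncio.get_running_loop()
    # Both calls run concurrently in the default ThreadPoolExecutor.
    results = await asyncio.gather(
        loop.run_in_executor(None, blocking_fetch, "http://a.example"),
        loop.run_in_executor(None, blocking_fetch, "http://b.example"),
    )
    return results

results = asyncio.run(main())
print(results)
```

gather() preserves argument order, so the results come back in a predictable order even though the two blocking calls overlap in time.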
[python-tulip] Process + Threads + asyncio... has sense?
Hi all,

It's the first time I write to this list. Sorry if it's not the best place for this question.

After reading the asyncio documentation, PEPs, Guido/Jesse/David Beazley articles/talks, etc., I developed a PoC library that mixes Processes + Threads + Asyncio Tasks, following a scheme like this diagram:

main -> Process 1 -> Thread 1.1 -> Task 1.1.1
                                -> Task 1.1.2
                                -> Task 1.1.3

                  -> Thread 1.2 -> Task 1.2.1
                                -> Task 1.2.2
                                -> Task 1.2.3

        Process 2 -> Thread 2.1 -> Task 2.1.1
                                -> Task 2.1.2
                                -> Task 2.1.3

                  -> Thread 2.2 -> Task 2.2.1
                                -> Task 2.2.2
                                -> Task 2.2.3

In my local tests, this approach appears to improve (and simplify) concurrency/parallelism for some tasks but, before releasing the library on GitHub, I don't know if my approach is wrong and I would appreciate your opinion.

Thank you very much for your time.

Regards!
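For reference, the Thread -> Task layers of the diagram above can be sketched like this (not the PoC library itself; the Process layer would wrap the same pattern in multiprocessing). Event loops in asyncio are per-thread, so each thread can run its own loop and fan out into tasks:

```python
# Each thread runs its own independent event loop, which in turn
# schedules several asyncio tasks: the Thread -> Task layers.
import asyncio
import threading

results = {}

async def task(thread_id, task_id):
    await asyncio.sleep(0)  # stand-in for real async I/O
    return f"thread {thread_id} task {task_id}"

def thread_main(thread_id):
    async def run_tasks():
        return await asyncio.gather(*(task(thread_id, t) for t in range(3)))
    # asyncio.run creates a fresh event loop in this thread.
    results[thread_id] = asyncio.run(run_tasks())

threads = [threading.Thread(target=thread_main, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

As the replies in this thread point out, though, the middle (thread) layer usually buys nothing for I/O-bound work under the GIL; one loop per process is the common shape.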
Re: [python-tulip] A mix of zip and select to iterate over multiple asynchronous iterator ?
What people typically do when handling multiple events is to have separate event handlers for each event type. You can do this using callbacks (typically by using the Protocol/Transport convention), or you can have separate loops that each use `await`, `async for` (or some 3.4-compatible alternative spelling) to process the event streams. The different callbacks or loops then share state via a shared object. The asyncio event loop guarantees that the other callback doesn't run until the first callback returns; with `await` it guarantees that it won't switch coroutines between `await` calls.

On Mon, Apr 18, 2016 at 5:59 AM, Julien Palard wrote:

> Hi there,
>
> For a pet project of mine (https://github.com/julienpalard/theodoreserver)
> I encountered a problem:
>
> I have two asynchronous iterables (easily iterable via a simple
> "async for ..."); without "theodore" knowledge you can simply imagine
> listening for events from different sources.
>
> So my need is what is solved by "select", "poll", "epoll", "kqueue",
> "libevent" on sockets: I need to listen to multiple sources and be
> notified when one has data.
>
> So I wrote a "zip":
> https://github.com/JulienPalard/TheodoreServer/blob/master/asynczip.py
> with two modes, one named "SOON", behaving more like select, and one
> named "PAIRED", behaving more like "zip".
>
> Given n asynchronous iterables, you can iterate over them using:
>
> async for results in asynczip.AsyncZip(*iterables)
>
> It's really like asyncio.wait but for an iterable, with `wait`'s flag
> `FIRST_COMPLETED` being my "SOON" flag, and `wait`'s flag "ALL_COMPLETED"
> being my "PAIRED" flag (should I rename mine?). Question is: am I walking
> the wrong way, and is there a simple way to do it I completely missed?
> Or is my tool really useful and should be shared? I'm too young in
> asyncio to judge that, so I'll listen to you! Bests,
>
> --
> Julien Palard

--
--Guido van Rossum (python.org/~guido)
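The "separate loops sharing a plain object" pattern above can be sketched as follows (my own example, not from the thread). No lock is needed around the shared dict because the event loop only switches coroutines at `await` points:

```python
# Two independent coroutine loops, each consuming its own event
# stream, sharing state through a plain object. The event loop's
# cooperative scheduling makes the read-modify-write safe.
import asyncio

shared = {"clicks": 0, "keys": 0}

async def click_stream():
    for _ in range(3):
        await asyncio.sleep(0)   # stand-in for awaiting a real event source
        shared["clicks"] += 1    # no await between read and write: atomic here

async def key_stream():
    for _ in range(5):
        await asyncio.sleep(0)
        shared["keys"] += 1

async def main():
    # Run both consumer loops concurrently on the same event loop.
    await asyncio.gather(click_stream(), key_stream())

asyncio.run(main())
print(shared)
```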
[python-tulip] aiomas 1.0.0 released
Hi all,

I just released version 1.0.0 of aiomas (https://aiomas.readthedocs.org/) – a library for networking, RPC and multi-agent systems based on asyncio.

Its basic set of features:

- Three layers of abstraction around raw TCP / Unix domain sockets:
  - Request-reply channels
  - Remote-procedure calls (RPC)
  - Agents and containers
- TLS support for authorization and encrypted communication.
- Interchangeable and extensible codecs: JSON and MsgPack (the latter optionally compressed with Blosc) are built-in. You can add custom codecs or write (de)serializers for your own objects to extend a codec.
- Deterministic, emulated sockets: A LocalQueue transport lets you send and receive messages in a deterministic and reproducible order within a single process. This helps testing and debugging distributed algorithms.

The package is released under the MIT license. It requires Python 3.4 and above and runs on Linux, OS X, and Windows.

Cheers,
Stefan
[python-tulip] A mix of zip and select to iterate over multiple asynchronous iterator ?
Hi there,

For a pet project of mine (https://github.com/julienpalard/theodoreserver) I encountered a problem:

I have two asynchronous iterables (easily iterable via a simple "async for ..."); without "theodore" knowledge, you can simply imagine listening for events from different sources.

So my need is what is solved by "select", "poll", "epoll", "kqueue", "libevent" on sockets: I need to listen to multiple sources and be notified when one has data.

So I wrote a "zip": https://github.com/JulienPalard/TheodoreServer/blob/master/asynczip.py with two modes, one named "SOON", behaving more like select, and one named "PAIRED", behaving more like "zip".

Given n asynchronous iterables, you can iterate over them using:

    async for results in asynczip.AsyncZip(*iterables)

It's really like asyncio.wait but for an iterable, with `wait`'s flag `FIRST_COMPLETED` being my "SOON" flag, and `wait`'s flag "ALL_COMPLETED" being my "PAIRED" flag (should I rename mine?).

Question is: am I walking the wrong way, and is there a simple way to do it I completely missed? Or is my tool really useful and should be shared? I'm too young in asyncio to judge that, so I'll listen to you!

Bests,

--
Julien Palard
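For readers following along: the select-like "SOON" mode described above can be reimplemented in a few lines with asyncio.wait(..., return_when=FIRST_COMPLETED). This is an independent sketch, not the asynczip code; `merge` and `gen` are names of my own:

```python
# Yield items from several async iterators as soon as any one of
# them produces an item (select-like "merge" semantics).
import asyncio

async def merge(*aiterables):
    # Map each in-flight "fetch next item" task to its iterator.
    tasks = {}
    for ait in aiterables:
        it = ait.__aiter__()
        tasks[asyncio.ensure_future(it.__anext__())] = it
    while tasks:
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        for fut in done:
            it = tasks.pop(fut)
            try:
                item = fut.result()
            except StopAsyncIteration:
                continue  # this iterator is exhausted; drop it
            yield item
            # Immediately schedule the next fetch for this iterator.
            tasks[asyncio.ensure_future(it.__anext__())] = it

async def gen(name, n, delay):
    for i in range(n):
        await asyncio.sleep(delay)
        yield f"{name}{i}"

async def main():
    out = []
    async for item in merge(gen("a", 2, 0.01), gen("b", 2, 0.03)):
        out.append(item)
    return out

result = asyncio.run(main())
print(result)
```

The "PAIRED"/zip-like mode would instead gather one `__anext__()` from every iterator before yielding a tuple, i.e. ALL_COMPLETED semantics.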