Re: Kill GIL
Aahz wrote:
> In article [EMAIL PROTECTED], Frans Englich [EMAIL PROTECTED] wrote:
>> Personally I need a solution which touches this discussion. I need to run
>> multiple processes, which I communicate with via stdin/out, simultaneously,
>> and my plan was to do this with threads. Any favorite document pointers,
>> common traps, or something else which could be good to know?
>
> Threads and forks tend to be problematic. This is one case I'd recommend
> against threads.

Multiple threads interacting with stdin/stdout? I've done it with 2 queues: one for feeding the threads input and one for them to use for output. In fact, using queues takes care of the serialization problems generally associated with many threads trying to access a single resource (e.g. stdout). Python Queues are thread-safe, so you don't have to worry about such issues.
--
http://mail.python.org/mailman/listinfo/python-list
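A minimal sketch of the two-queue pattern described above, using today's module names (`queue` rather than the era's `Queue` module); the names `worker`, `in_q` and `out_q` are illustrative, not from the post. `Queue` does all the locking, so the workers need no explicit synchronization:

```python
import threading
import queue

def worker(in_q, out_q):
    """Pull work items from in_q, push results to out_q."""
    while True:
        item = in_q.get()
        if item is None:          # sentinel: shut this worker down
            break
        out_q.put(item.upper())   # stand-in for real processing

in_q, out_q = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(in_q, out_q)) for _ in range(4)]
for t in threads:
    t.start()

for word in ["gil", "queue", "thread"]:
    in_q.put(word)
for _ in threads:
    in_q.put(None)                # one sentinel per worker
for t in threads:
    t.join()

results = sorted(out_q.get() for _ in range(3))
```

Because only one thread ever writes a given result to `out_q` at a time, the serialization problem with shared resources like stdout disappears: a single consumer can drain `out_q` and do all the printing.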
Re: Kill GIL
In article [EMAIL PROTECTED], Adrian Casey [EMAIL PROTECTED] wrote:
> Aahz wrote:
>> In article [EMAIL PROTECTED], Frans Englich [EMAIL PROTECTED] wrote:
>>> Personally I need a solution which touches this discussion. I need to run
>>> multiple processes, which I communicate with via stdin/out,
>>> simultaneously, and my plan was to do this with threads. Any favorite
>>> document pointers, common traps, or something else which could be good
>>> to know?
>>
>> Threads and forks tend to be problematic. This is one case I'd recommend
>> against threads.
>
> Multiple threads interacting with stdin/stdout? I've done it with 2 queues.
> One for feeding the threads input and one for them to use for output. In
> fact, using queues takes care of the serialization problems generally
> associated with many threads trying to access a single resource (e.g.
> stdout). Python Queues are thread-safe so you don't have to worry about
> such issues.

The problem is that each sub-process really needs its own stdin/stdout. Also, to repeat, forking tends to be problematic with threads. Finally, as Peter implied, I'm well-known on c.l.py for responding to thread problems with, "Really? Are you using Queue? Why not?" However, this is one case where Queue can't help.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/

The joy of coding Python should be in seeing short, concise, readable
classes that express a lot of action in a small amount of clear code --
not in reams of trivial code that bores the reader to death.  --GvR
Re: Kill GIL
Donn Cave wrote:
> Quoth Dave Brueck [EMAIL PROTECTED]:
> ...
> | Another related benefit is that a lot of application state is implicitly
> | and automatically managed by your local variables when the task is
> | running in a separate thread, whereas other approaches often end up
> | forcing you to think in terms of a state machine when you don't really
> | care* and as a by-product you have to [semi-]manually track the state and
> | state transitions - for some problems this is fine, for others it's
> | downright tedious.
>
> I don't know if the current Stackless implementation has regained any of
> this ground, but at least of historical interest here, the old one's
> ability to interrupt, store and resume a computation could be used to
> As you may know, it used to be, in Stackless Python, that you could have
> both. Your function would suspend itself, the select loop would resume it,
> for something like serialized threads. (The newer version of Stackless
> lost this continuation feature, but for all I know there may be new
> features that regain some of that ground.)

Yep, I follow Stackless development for this very reason. Last I heard, a more automatic scheduler was in the works, without which it can be a little confusing about when non-I/O tasks should get resumed (and by whom), but it's not insurmountable. Ideally with Stackless you'd avoid OS threads altogether since the interpreter takes a performance hit with them, but this can be tough if you're e.g. also talking to a database via a blocking API.

> I put that together with real OS threads once, where the I/O loop was a
> message queue instead of select. A message queueing multi-threaded
> architecture can end up just as much a state transition game.

Definitely, but for many cases it does not - having each thread represent a distinct worker that pops some item of work off one queue, processes it, and puts it on another queue can really simplify things. Often this maps to real-world objects quite well, additional steps can be inserted or removed easily (and dynamically), and each worker can be developed, tested, and debugged independently.

> I like threads when they're used in this way, as application components
> that manage some device-like thing like a socket or a graphic user
> interface window, interacting through messages. Even then, though, there
> tend to be a lot of undefined behaviors in events like termination of the
> main thread, receipt of signals, etc.

That's how I tend to like using threads too. In practice I haven't found the undefined behaviors to be too much trouble though, e.g. deciding on common shutdown semantics for all child threads and making them daemon threads pretty much takes care of both expected and unexpected shutdown of the main thread. Using threads and signals can be confusing and troublesome, but often in cases where I would use them I end up wanting a richer interface anyway, so something besides signals is a better fit.

-Dave
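The worker-per-stage pipeline described above can be sketched as two stages chained by queues; the stage functions here are made-up examples. Each stage can be tested on its own simply by feeding its input queue directly:

```python
import threading
import queue

def stage(func, in_q, out_q):
    """Generic pipeline stage: pop, process, push."""
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)       # propagate shutdown downstream
            break
        out_q.put(func(item))

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=stage, args=(lambda x: x * 2, q1, q2), daemon=True),
    threading.Thread(target=stage, args=(lambda x: x + 1, q2, q3), daemon=True),
]
for t in stages:
    t.start()

for n in (1, 2, 3):
    q1.put(n)
q1.put(None)                      # one sentinel flows through the whole chain

pipeline_out = []
while True:
    item = q3.get()
    if item is None:
        break
    pipeline_out.append(item)
```

Inserting a new step is just a matter of splicing another `stage` thread between two queues, which is the "added or removed easily (and dynamically)" property Dave describes.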
Re: Kill GIL
In article [EMAIL PROTECTED], Dave Brueck [EMAIL PROTECTED] wrote:
> Donn Cave wrote:
> [... re stackless inside-out event loop ]
>> I put that together with real OS threads once, where the I/O loop was a
>> message queue instead of select. A message queueing multi-threaded
>> architecture can end up just as much a state transition game.
>
> Definitely, but for many cases it does not - having each thread represent
> a distinct worker that pops some item of work off one queue, processes it,
> and puts it on another queue can really simplify things. Often this maps
> to real-world objects quite well, additional steps can be inserted or
> removed easily (and dynamically), and each worker can be developed,
> tested, and debugged independently.

Well, one of the things that makes the world interesting is how many different universes we seem to be coming from, but in mine, when I have divided an application into several thread components, about the second time I need to send a message from one thread to another, the sender needs something back in return, as in T2 = from_thread_B(T1). At this point, our conventional procedural model breaks up along a state fault, so to speak, like

        ...
        to_thread_B(T1)
        return

    def continue_from_T1(T1, T2):
        ...

So, yeah, now I have a model where each thread pops, processes and pushes messages, but only because my program spent the night in Procrustes' inn, not because it was a natural way to write the computation. In a procedural language, anyway - there are interesting alternatives, in particular a functional language called O'Haskell that models threads in a reactive object construct, an odd but elegant mix of state machine and pure functional programming, but it's kind of a research project and I know of nothing along these lines that's really supported today.

Donn Cave, [EMAIL PROTECTED]
Re: Kill GIL
[EMAIL PROTECTED] (Aahz) writes:
> (Have you actually written any threaded applications in Python?)

Yes. Have you ever asked a polite question?

mike
--
Mike Meyer [EMAIL PROTECTED] http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Re: Kill GIL
In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] (Aahz) writes:
>> (Have you actually written any threaded applications in Python?)
>
> Yes. Have you ever asked a polite question?

Yes. I just get a bit irritated with some of the standard lines that people use.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/
Re: Kill GIL (was Re: multi threading in multi processor (computer))
[EMAIL PROTECTED] (Aahz) writes:
> [phr]
>> The day is coming when even cheap computers have multiple cpu's. See
>> hyperthreading and the coming multi-core P4's, and the finally announced
>> Cell processor. Conclusion: the GIL must die.
>
> It's not clear to what extent these processors will perform well with
> shared memory space. One of the things I remember most about Bruce Eckel's
> discussions of Java and threading is just how broken Java's threading
> model is in certain respects when it comes to CPU caches failing to
> maintain cache coherency.

Um??? I'm not experienced with multiprocessors, but I thought that maintaining cache coherency was a requirement. What's the deal? If coherency isn't maintained, is it really multiprocessing?

> It's always going to be true that getting fully scaled performance will
> require more CPUs with non-shared memory -- that's going to mean IPC with
> multiple processes instead of threads.

But unless you use shared memory, the context switch overhead from IPC becomes a bad bottleneck. See http://poshmodule.sourceforge.net/posh/html/node1.html for an interesting scheme of working around the GIL by spreading naturally multi-threaded applications into multiple processes (using shared memory). It would simplify things a lot if you could just use threads.
Re: Kill GIL
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Aahz) wrote:
> Yes. I just get a bit irritated with some of the standard lines that
> people use.

Hey, stop me if you've heard this one: "I used threads to solve my problem - and now I have two problems!"

Donn Cave, [EMAIL PROTECTED]
Re: Kill GIL
In article [EMAIL PROTECTED], Donn Cave [EMAIL PROTECTED] wrote:
>> Yes. I just get a bit irritated with some of the standard lines that
>> people use.
>
> Hey, stop me if you've heard this one: "I used threads to solve my
> problem - and now I have two problems!"

Point to you. ;-)
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/
Re: Kill GIL
> Actually, this is one of the cases I was talking about. I find it saner to
> convert to non-blocking I/O and use select() for synchronization. That
> solves the problem, without introducing any of the headaches related to
> shared access and locking that come with threads.

Threads aren't always the right entity for dealing with asynchronicity, one might say.

C//
Re: Kill GIL
Mike Meyer [EMAIL PROTECTED] writes:
>> Threads are also good for handling blocking I/O.
>
> Actually, this is one of the cases I was talking about. I find it saner to
> convert to non-blocking I/O and use select() for synchronization. That
> solves the problem, without introducing any of the headaches related to
> shared access and locking that come with threads.

It's just a different style with its own tricks and traps. Threading for blocking I/O is a well-accepted idiom, and if Python supports threads at all, people will want to use them that way.
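For concreteness, the select()-based style under discussion looks roughly like this sketch: a single loop multiplexes several non-blocking sockets, so no locks are needed because there is only one thread. Here `socket.socketpair()` stands in for real network connections, and the connection names are invented:

```python
import select
import socket

# Two "connections"; one end plays the remote peer, the other our server side.
a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
for s in (b1, b2):
    s.setblocking(False)          # never block on any single socket

a1.sendall(b"one")
a2.sendall(b"two")

received = {}
pending = {b1: "conn1", b2: "conn2"}
while pending:
    # Wait (up to 1s) until any monitored socket has data ready.
    readable, _, _ = select.select(list(pending), [], [], 1.0)
    for s in readable:
        data = s.recv(1024)
        received[pending.pop(s)] = data
```

Note Aahz's caveat elsewhere in the thread still applies: on Windows, select() works only on sockets, not on arbitrary file handles.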
Re: Kill GIL
In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] (Aahz) writes:
>> In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
>>> Hear, hear. I find that threading typically introduces worse problems
>>> than it purports to solve.
>>
>> Threads are also good for handling blocking I/O.
>
> Actually, this is one of the cases I was talking about. I find it saner to
> convert to non-blocking I/O and use select() for synchronization. That
> solves the problem, without introducing any of the headaches related to
> shared access and locking that come with threads.

It may be saner, but Windows doesn't support select() for file I/O, and Python's threading mechanisms make this very easy. If one's careful with application design, there should be no locking problems. (Have you actually written any threaded applications in Python?)
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/
Re: Kill GIL
On Monday 14 February 2005 00:53, Aahz wrote:
> In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
>> Actually, this is one of the cases I was talking about. I find it saner
>> to convert to non-blocking I/O and use select() for synchronization.
>> That solves the problem, without introducing any of the headaches
>> related to shared access and locking that come with threads.
>
> It may be saner, but Windows doesn't support select() for file I/O, and
> Python's threading mechanisms make this very easy. If one's careful with
> application design, there should be no locking problems. (Have you
> actually written any threaded applications in Python?)

Hehe.. the first thing a google search on "python non-blocking io threading" returns is "Threading is Evil".

Personally I need a solution which touches this discussion. I need to run multiple processes, which I communicate with via stdin/out, simultaneously, and my plan was to do this with threads. Any favorite document pointers, common traps, or something else which could be good to know?

Cheers,
Frans
Re: Kill GIL
In article [EMAIL PROTECTED], Frans Englich [EMAIL PROTECTED] wrote:
> Personally I need a solution which touches this discussion. I need to run
> multiple processes, which I communicate with via stdin/out, simultaneously,
> and my plan was to do this with threads. Any favorite document pointers,
> common traps, or something else which could be good to know?

Threads and forks tend to be problematic. This is one case I'd recommend against threads.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/
Re: Kill GIL
Mike Meyer wrote:
> [EMAIL PROTECTED] (Aahz) writes:
>> Threads are also good for handling blocking I/O.
>
> Actually, this is one of the cases I was talking about. I find it saner to
> convert to non-blocking I/O and use select() for synchronization. That
> solves the problem, without introducing any of the headaches related to
> shared access and locking that come with threads.

Use a communicating sequential processes model for the threading and you don't have many data synchronisation problems, because you have barely any shared access - no application data is ever shared between threads; they only send messages to each other via message queues. Most threads simply block on their incoming message queue permanently. Those doing blocking I/O set an appropriate timeout on the I/O call so they can check for messages occasionally.

Conveniently, you end up with an architecture that supports switching to multiple processes, or even multiple machines, just by changing the transport mechanism used by the message system. (We did exactly this for a GUI application - detached the GUI so it talked to a server via CORBA instead of via direct DLL calls. This meant the server could be ported to a different platform without having to port the far more platform-specific GUI. This would have been much harder if we weren't already using a CSP model for communication between different parts of the system.)

Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
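A sketch of the CSP-style component Nick describes: a thread that owns all of its data, blocks on its inbox, and communicates only by message. The `timeout` on the queue read is the hook for the "check for messages occasionally" behavior - a thread doing blocking I/O would do that I/O where the `continue` branch is. The class and message names are illustrative:

```python
import threading
import queue

class Component(threading.Thread):
    """A thread that interacts with the rest of the app only via its inbox."""

    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
        self.handled = []         # owned by this thread once start() is called

    def run(self):
        while True:
            try:
                msg = self.inbox.get(timeout=0.1)
            except queue.Empty:
                continue          # place for periodic work / timed-out I/O
            if msg == "stop":
                break
            self.handled.append(msg)

comp = Component()
comp.start()
comp.inbox.put("ping")
comp.inbox.put("pong")
comp.inbox.put("stop")
comp.join()
```

Since nothing outside the thread touches `handled` until after `join()`, there is no shared mutable state to lock - which is the point of the model. Swapping `queue.Queue` for a socket- or CORBA-backed transport changes the plumbing, not the component.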
Re: Kill GIL
Mike Meyer wrote:
> [EMAIL PROTECTED] (Aahz) writes:
>> In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
>>> Hear, hear. I find that threading typically introduces worse problems
>>> than it purports to solve.
>>
>> Threads are also good for handling blocking I/O.
>
> Actually, this is one of the cases I was talking about. I find it saner to
> convert to non-blocking I/O and use select() for synchronization. That
> solves the problem, without introducing any of the headaches related to
> shared access and locking that come with threads.

This whole tangent to the original thread intrigues me - I've found that if you're going to use threads in any language, Python is the one to use, because the GIL reduces so many of the problems common to multithreaded programming (I'm not a huge fan of the GIL, but its presence effectively prevents a pure Python multithreaded app from corrupting the interpreter, which is especially handy for those just learning Python or programming).

I've done a lot of networking applications using select/poll (usually for performance reasons) and found that going that route *can* in some cases simplify things, but it requires looking at the problem differently, often from perspectives that seem unnatural to me - it's not just an implementation detail but one you have to consider during design. One nice thing about using threads is that components of your application that are logically separate can remain separate in the code as well - the implementations don't have to be tied together at some common dispatch loop, and a failure to be completely non-blocking in one component doesn't necessarily spell disaster for the entire app (I've had apps in production where one thread would die or get hung, but I was relieved to find out that the main service remained available).

Another related benefit is that a lot of application state is implicitly and automatically managed by your local variables when the task is running in a separate thread, whereas other approaches often end up forcing you to think in terms of a state machine when you don't really care* and as a by-product you have to [semi-]manually track the state and state transitions - for some problems this is fine, for others it's downright tedious.

Anyway, if someone doesn't know about alternatives to threads, then that's a shame, as other approaches have their advantages (often including a certain elegance that is just darn *cool*), but I wouldn't shy away from threads too much either - especially in Python.

-Dave

* Simple case in point: a non-blocking logging facility. In Python you can just start up a thread that pops strings off a Queue object and writes them to an open file. A non-threaded version is more complicated to implement, debug, and maintain.
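The logging facility from the footnote, sketched out: callers put strings on a Queue and return immediately, while one background thread drains the queue and writes. An in-memory `StringIO` stands in for the open file on disk, and the names are invented for illustration:

```python
import threading
import queue
import io

log_q = queue.Queue()
log_file = io.StringIO()          # stands in for an open file on disk

def log_writer():
    """Drain the queue; the only code that ever touches log_file."""
    while True:
        line = log_q.get()
        if line is None:          # sentinel used to stop the writer
            break
        log_file.write(line + "\n")

writer = threading.Thread(target=log_writer, daemon=True)
writer.start()

# Callers never block on the disk - put() returns immediately.
log_q.put("service started")
log_q.put("request handled")
log_q.put(None)
writer.join()

contents = log_file.getvalue()
```

Because exactly one thread writes to the file, log lines from many producers can never interleave mid-line - the same serialization-by-queue point made earlier in the thread.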
Re: Kill GIL
Quoth Dave Brueck [EMAIL PROTECTED]:
...
| Another related benefit is that a lot of application state is implicitly and
| automatically managed by your local variables when the task is running in a
| separate thread, whereas other approaches often end up forcing you to think in
| terms of a state machine when you don't really care* and as a by-product you
| have to [semi-]manually track the state and state transitions - for some
| problems this is fine, for others it's downright tedious.

I don't know if the current Stackless implementation has regained any of this ground, but at least of historical interest here, the old one's ability to interrupt, store and resume a computation could be used to As you may know, it used to be, in Stackless Python, that you could have both. Your function would suspend itself, the select loop would resume it, for something like serialized threads. (The newer version of Stackless lost this continuation feature, but for all I know there may be new features that regain some of that ground.)

I put that together with real OS threads once, where the I/O loop was a message queue instead of select. A message queueing multi-threaded architecture can end up just as much a state transition game.

I like threads when they're used in this way, as application components that manage some device-like thing like a socket or a graphic user interface window, interacting through messages. Even then, though, there tend to be a lot of undefined behaviors in events like termination of the main thread, receipt of signals, etc.

Donn Cave, [EMAIL PROTECTED]
Kill GIL (was Re: multi threading in multi processor (computer))
In article [EMAIL PROTECTED], Paul Rubin http://[EMAIL PROTECTED] wrote:
> The day is coming when even cheap computers have multiple cpu's. See
> hyperthreading and the coming multi-core P4's, and the finally announced
> Cell processor. Conclusion: the GIL must die.

It's not clear to what extent these processors will perform well with shared memory space. One of the things I remember most about Bruce Eckel's discussions of Java and threading is just how broken Java's threading model is in certain respects when it comes to CPU caches failing to maintain cache coherency.

It's always going to be true that getting fully scaled performance will require more CPUs with non-shared memory -- that's going to mean IPC with multiple processes instead of threads.

Don't get me wrong -- I'm probably one of the bigger boosters of threads. But it bugs me when people think that getting rid of the GIL will be the Holy Grail of Python performance. No way. No how. No time.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/
Re: Kill GIL (was Re: multi threading in multi processor (computer))
On Sat, Feb 12, 2005 at 07:13:17PM -0500, Aahz wrote:
> In article [EMAIL PROTECTED], Paul Rubin http://[EMAIL PROTECTED] wrote:
>> The day is coming when even cheap computers have multiple cpu's. See
>> hyperthreading and the coming multi-core P4's, and the finally announced
>> Cell processor.

I'm looking forward to multi-core P4s (and Opterons). The Cell is a non-starter for general purpose computing. Ars Technica has a couple of good pieces on it; the upshot is that it is one normal processor with eight strange floating point co-processors hanging off it.

>> Conclusion: the GIL must die.
>
> It's not clear to what extent these processors will perform well with
> shared memory space. One of the things I remember most about Bruce
> Eckel's discussions of Java and threading is just how broken Java's
> threading model is in certain respects when it comes to CPU caches
> failing to maintain cache coherency.
>
> It's always going to be true that getting fully scaled performance will
> require more CPUs with non-shared memory -- that's going to mean IPC with
> multiple processes instead of threads.
>
> Don't get me wrong -- I'm probably one of the bigger boosters of threads.
> But it bugs me when people think that getting rid of the GIL will be the
> Holy Grail of Python performance. No way. No how. No time.

Me too! For a small number of processors (four) it is easier (and usually sufficient) to pipeline functional parts into different processes than it is to thread the whole monkey. As a bonus this usually gives you scaling across machines (and not just CPUs) for cheap. I'm aware there are some problems where this isn't true.

From reading this thread every couple of months on c.l.py for the last few years, it is my opinion that the number of people who think threading is the only solution to their problem greatly outnumbers the number of people who actually have such a problem (like, nearly all of them).

Killing the GIL is proposing a silver bullet where there is no werewolf-ly,

-Jack
Re: Kill GIL (was Re: multi threading in multi processor (computer))
> Killing the GIL is proposing a silver bullet where there is no
> werewolf-ly,

About the only reason for killing the GIL is /us/. We, purists, pythonistas, language nuts, or what not, who for some reason or other simply hate the idea of the GIL. I'd view it as an artistic desire, unurgent, something to plan for the future canvas upon which our painting is painted...

C//
Re: Kill GIL
Jack Diederich [EMAIL PROTECTED] writes:
> From reading this thread every couple months on c.l.py for the last few
> years it is my opinion that the number of people who think threading is
> the only solution to their problem greatly outnumber the number of people
> who actually have such a problem (like, nearly all of them).

Hear, hear. I find that threading typically introduces worse problems than it purports to solve.

mike
--
Mike Meyer [EMAIL PROTECTED] http://www.mired.org/home/mwm/
Re: Kill GIL
> Hear, hear. I find that threading typically introduces worse problems
> than it purports to solve.

I recently worked on a software effort, arguably one of the most important software efforts in existence, in which individuals responsible for critical performance of the application threw arbitrarily large numbers of threads at a problem, on a multiprocessor machine, on a problem that was intrinsically I/O-bound. The ease with which one can get into problems with threads (and these days, also with network comms) leads to many problems if the engineers aren't sufficiently acquainted with the theory. Don't get me started on the big clusterfucks I've seen evolve from CORBA...

C//
Re: Kill GIL
Mike Meyer wrote:
> Jack Diederich [EMAIL PROTECTED] writes:
>> From reading this thread every couple months on c.l.py for the last few
>> years it is my opinion that the number of people who think threading is
>> the only solution to their problem greatly outnumber the number of
>> people who actually have such a problem (like, nearly all of them).
>
> Hear, hear. I find that threading typically introduces worse problems
> than it purports to solve.

In my experience, threads should mainly be used if you need asynchronous access to a synchronous operation. You spawn the thread to make the call, it blocks on the relevant API, then notifies the main thread when it's done. Since any sane code will release the GIL before making the blocking call, this scales to multiple CPU's just fine.

Another justification for threads is when you have a multi-CPU machine and a processor-intensive operation you'd like to farm off to a separate CPU. In that case, you can treat the long-running operation like any other synchronous call, and farm off a thread that releases the GIL before starting the time-consuming operation. The only time the GIL gets in the way is if the long-running operation you want to farm off is itself implemented in Python.

However, consider this: threads run on a CPU, so if you want to run multiple threads concurrently, you either need multiple CPU's or a time-slicing scheduler that fakes it. Here's the trick: PYTHON THREADS DO NOT RUN DIRECTLY ON THE CPU. Instead, they run on a Python Virtual Machine (or the JVM/CLR Runtime/whatever), which then runs on the CPU. So, if you want to run multiple Python threads concurrently, you need multiple PVM's or a time-slicing scheduler. The GIL represents the latter.

Now, Python *could* try to provide the ability to have multiple virtual machines in a single process in order to more effectively exploit multiple CPU's. I have no idea if Java or the CLR work that way - my guess is that they do (or something that looks the same from a programmer's POV). But then, they have Sun/Microsoft directly financing the development teams.

A much simpler suggestion is that if you want a new PVM, just create a new OS process to run another copy of the Python interpreter. The effectiveness of your multi-CPU utilisation will then be governed by your OS's ability to correctly schedule multiple processes, rather than by the PVM's ability to fake multiple processes using threads (hint: the former is likely to be much better than the latter).

Additionally, schemes for inter-process communication are often far more scaleable than those for inter-thread communication, since the former generally can't rely on shared memory (although good versions may utilise it for optimisation purposes). This means they can usually be applied to clustered computing rather effectively.

I would *far* prefer to see effort expended on making the idiom mentioned in the last couple of paragraphs simple and easy to use, rather than on a misguided effort to "Kill the GIL".

Cheers,
Nick.

P.S. If the GIL *really* bothers you, check out Stackless Python. As I understand it, it does its best to avoid the C stack (and hence threads) altogether.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://boredomandlaziness.skystorm.net
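The "one interpreter per PVM, one PVM per OS process" idiom described above can be sketched with nothing but `subprocess`: each child is a full Python interpreter with its own GIL, so pure-Python CPU-bound work in the children runs in parallel under the OS scheduler. (The stdlib later packaged exactly this idiom, with pickled IPC on top, as `multiprocessing`; this hand-rolled version just illustrates the mechanism.)

```python
import subprocess
import sys

# Pure-Python CPU-bound work, one job per child interpreter.
jobs = [10, 100]
cmds = [f"print(sum(i * i for i in range({n})))" for n in jobs]

# Launch all children first so they run concurrently, each with its own GIL.
procs = [subprocess.Popen([sys.executable, "-c", cmd],
                          stdout=subprocess.PIPE, text=True)
         for cmd in cmds]

# Collect results over the stdout pipes (the "IPC" of this sketch).
totals = [int(p.communicate()[0]) for p in procs]
```

The scaling then depends on the OS scheduling the processes across CPUs, as Nick says, rather than on the interpreter faking concurrency with threads; the cost is that results must cross a pipe instead of shared memory.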
Re: Kill GIL
In article [EMAIL PROTECTED], Mike Meyer [EMAIL PROTECTED] wrote:
> Hear, hear. I find that threading typically introduces worse problems
> than it purports to solve.

Depends what you're trying to do with threads. Threads are definitely a good technique for managing long-running work in a GUI application. Threads are also good for handling blocking I/O. Threads can in theory be useful for computational processing, but Python provides almost no support for that.
--
Aahz ([EMAIL PROTECTED]) * http://www.pythoncraft.com/