Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Hi! > >>How does that work? Switching between kernel threads requires going into > >>the kernel, user level thread switches are all done in user mode. > >> > >>Do you have some way to change o/s threads w/o going into the kernel? > >> > > > >But going into kernel is not very expensive on Linux. > > > >On the other side, the overhead you need to add for every single syscall > >that might block for the M:N threads and the associated complications > >which make it far harder to conform to POSIX IMHO far outweight the costs > >of going into the kernel for a context switch. > > That really wasn't my question, Arjan said that switching real threads > wasn't a context switch in the hardware sense, and I was asking if I > missed something. It may be cheap, but it would seem to be a context > switch none-the-less. It is not reloading %cr3. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
> > On the other side, the overhead you need to add for every single syscall > > that might block for the M:N threads and the associated complications > > which make it far harder to conform to POSIX IMHO far outweight the costs > > of going into the kernel for a context switch. > > That really wasn't my question, Arjan said that switching real threads > wasn't a context switch in the hardware sense, and I was asking if I > missed something. a hardware context switch is basically a CR3 change with associated tlb flush. That is the part that is the most expensive of a context switch. Just going into the kernel and getting out with a different EIP/ESP is really cheap, in the order of "a few hundred cycles"; not a heck of a lot more expensive than a simple getpid or other simple system call. > It may be cheap, but it would seem to be a context > switch none-the-less. it includes a privilege level switch, not so much a full context switch... -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Sun, 4 Feb 2007, Jakub Jelinek wrote: > On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote: > > Arjan van de Ven wrote: > > >>Because user threading can avoid context switches, there will always be > > >>cases where it will outperform o/s threads for hardware reasons. > > > > > >actually.. switching from one "real" thread to another in Linux is not > > >an actual context switch in the hardware sense... at least this part of > > >your argument seems to be incorrect ;) > > > > > How does that work? Switching between kernel threads requires going into > > the kernel, user level thread switches are all done in user mode. > > > > Do you have some way to change o/s threads w/o going into the kernel? > > But going into kernel is not very expensive on Linux. > > On the other side, the overhead you need to add for every single syscall > that might block for the M:N threads and the associated complications > which make it far harder to conform to POSIX IMHO far outweight the costs > of going into the kernel for a context switch. Agreed, definitely. A libpcl (using swapcontext(3)) cobench is about 50 times faster than an context switch measured by lmbench (although I have serious doubts about about the ability of lat_ctx to measure it - but that's another story) on an Opteron 254. One may say "Wow! Really?!?". The point is, who cares. We are talking about differences between super-fast (~2us) and ultra-fast (~0.04us). The time (and code) that you'll have to drop in the syscall path to handle M:N is very likely to make you lose way more of what you gain by avoiding an OS context switch (a "soft" context switch you still have to do it). Either use N:N (requires locking, but spread over multiple CPUs) or 1:N (I/O driven state machines or coroutines - no locking, once-CPU bound). - Davide - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Jakub Jelinek wrote: On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote: Arjan van de Ven wrote: Because user threading can avoid context switches, there will always be cases where it will outperform o/s threads for hardware reasons. actually.. switching from one "real" thread to another in Linux is not an actual context switch in the hardware sense... at least this part of your argument seems to be incorrect ;) How does that work? Switching between kernel threads requires going into the kernel, user level thread switches are all done in user mode. Do you have some way to change o/s threads w/o going into the kernel? But going into kernel is not very expensive on Linux. On the other side, the overhead you need to add for every single syscall that might block for the M:N threads and the associated complications which make it far harder to conform to POSIX IMHO far outweight the costs of going into the kernel for a context switch. That really wasn't my question, Arjan said that switching real threads wasn't a context switch in the hardware sense, and I was asking if I missed something. It may be cheap, but it would seem to be a context switch none-the-less. -- bill davidsen <[EMAIL PROTECTED]> CTO TMR Associates, Inc Doing interesting things with small computers since 1979 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote: > Arjan van de Ven wrote: > >>Because user threading can avoid context switches, there will always be > >>cases where it will outperform o/s threads for hardware reasons. > > > >actually.. switching from one "real" thread to another in Linux is not > >an actual context switch in the hardware sense... at least this part of > >your argument seems to be incorrect ;) > > > How does that work? Switching between kernel threads requires going into > the kernel, user level thread switches are all done in user mode. > > Do you have some way to change o/s threads w/o going into the kernel? But going into kernel is not very expensive on Linux. On the other side, the overhead you need to add for every single syscall that might block for the M:N threads and the associated complications which make it far harder to conform to POSIX IMHO far outweight the costs of going into the kernel for a context switch. Jakub - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Arjan van de Ven wrote: Because user threading can avoid context switches, there will always be cases where it will outperform o/s threads for hardware reasons. actually.. switching from one "real" thread to another in Linux is not an actual context switch in the hardware sense... at least this part of your argument seems to be incorrect ;) How does that work? Switching between kernel threads requires going into the kernel, user level thread switches are all done in user mode. Do you have some way to change o/s threads w/o going into the kernel? -- Bill Davidsen <[EMAIL PROTECTED]> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
> Because user threading can avoid context switches, there will always be > cases where it will outperform o/s threads for hardware reasons. actually.. switching from one "real" thread to another in Linux is not an actual context switch in the hardware sense... at least this part of your argument seems to be incorrect ;) > -- if you want to mail me at work (you don't), use arjan (at) linux.intel.com Test the interaction between Linux and your BIOS via http://www.linuxfirmwarekit.org - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Kaz Kylheku wrote: Evgeniy Polyakov wrote: I described in details why and how M:N model better, and its drawbacks include all issues mentioned by Ulrich Drepper, but nevertheless its advantages are far too superiour than those which can be provided by 1:1 model. M:N threading is an unnecessary performance hack that's needed by people who are living in a C or C++ exile away from some language that has lexical closures, generators or first-class continuations. Yes, that's called the "real world." Arguments of the "I don't need it, in a perfect world you wouldn't either, therefore it's a bad idea" type simply contribute nothing. Because user threading can avoid context switches, there will always be cases where it will outperform o/s threads for hardware reasons. -- Bill Davidsen <[EMAIL PROTECTED]> "We have more to fear from the bungling of the incompetent than from the machinations of the wicked." - from Slashdot - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Wed, Jan 31, 2007 at 11:28:07PM -0500, Lee Revell ([EMAIL PROTECTED]) wrote: > On 1/29/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote: > >1. Scheduling fairness. > >Since kernel does not know about multiple threads behind given process, > >it can not add it appropriate number of timeslices for execution. > >Can be solved either by more tight collaboarion of the userspace and > >kernelspace schedulers or simply by increasing process' nice value. > > nice value is only meaningful for SCHED_OTHER. How will you handle a > multithreaded realtime application that uses SCHED_OTHER as well as > SCHED_FIFO threads? Threads created inside one process obviously can not compete with RT threads created by other process. Instead process, which threads have RT priority itself should change its priority to RT to compete with other RT process. (By RT I mean any cases except SCHED_OTHER which is default). > Lee -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On 1/29/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote: 1. Scheduling fairness. Since kernel does not know about multiple threads behind given process, it can not add it appropriate number of timeslices for execution. Can be solved either by more tight collaboarion of the userspace and kernelspace schedulers or simply by increasing process' nice value. nice value is only meaningful for SCHED_OTHER. How will you handle a multithreaded realtime application that uses SCHED_OTHER as well as SCHED_FIFO threads? Lee - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Tue, Jan 30, 2007 at 01:16:22PM -0800, Kaz Kylheku ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov wrote: > > I described in details why and how M:N model better, and its drawbacks > > include all issues mentioned by Ulrich Drepper, but nevertheless its > > advantages are far too superiour than those which can be > > provided by 1:1 > > model. > > M:N threading is an unnecessary performance hack that's needed by people > who are living in a C or C++ exile away from some language that has > lexical closures, generators or first-class continuations. Not having > these niceties, they resort to emulating them with threads. The proper > thing to do is to rewrite the code to use state machines which can be > driven by any available thread. Or else, write yourself a > source-to-source transformer that will give C the lexical closure, > generator, or continuation features that you need to express the > solution that way. > > There is no need to retain any vestiges of a user space threading > implementation when you have the real thing. > > Programs which appear to benefit from that model are badly optimized or > badly designed. A smartly written program uses an available thread to do > as much work as possible, until that thread happens to block or its time > slice burns up. Do not mix languages like Erlang, specialy designed for concurrent programming, with M:N threading model - they are completely different, but you do not want to see this. As you pointed, one thread can do as much as it need until it is blocked, and what next? Allocate new real thread? You may want to see how things like JVM work, I seriously doubt spwning new thread each time task blocks is a way to go. Even having epoll does not help in many cases. And you forgot the price of rescheduling in kernelspace and userspace - even with signals it differs two times, with more intellegent case it differs in 20 times! Virtual machine can have thousands of threads, actually it cant, since it will kill Linux in rescheduling. -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Evgeniy Polyakov wrote: > I described in details why and how M:N model better, and its drawbacks > include all issues mentioned by Ulrich Drepper, but nevertheless its > advantages are far too superiour than those which can be > provided by 1:1 > model. M:N threading is an unnecessary performance hack that's needed by people who are living in a C or C++ exile away from some language that has lexical closures, generators or first-class continuations. Not having these niceties, they resort to emulating them with threads. The proper thing to do is to rewrite the code to use state machines which can be driven by any available thread. Or else, write yourself a source-to-source transformer that will give C the lexical closure, generator, or continuation features that you need to express the solution that way. There is no need to retain any vestiges of a user space threading implementation when you have the real thing. Programs which appear to benefit from that model are badly optimized or badly designed. A smartly written program uses an available thread to do as much work as possible, until that thread happens to block or its time slice burns up. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Evgeniy Polyakov, le Tue 30 Jan 2007 12:53:16 +0300, a écrit : > > You may want to have a look at some existing implementations: > > I saw most of them. > As far as I recall, only PTL (is not shown here) has preemptible > scheduler. NTL has it too, but is based on different approach. Marcel has a preemptible scheduler too, based on signals and setjmp/longjmp. Samuel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Tue, Jan 30, 2007 at 11:24:51AM +0100, Samuel Thibault ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov, le Tue 30 Jan 2007 12:53:16 +0300, a écrit : > > > You may want to have a look at some existing implementations: > > > > I saw most of them. > > As far as I recall, only PTL (is not shown here) has preemptible > > scheduler. NTL has it too, but is based on different approach. > > Marcel has a preemptible scheduler too, based on signals and > setjmp/longjmp. Do some documentation and benchmarks exist for that library - site seems to only described environment is was created for? How does blocking problem solved? > Samuel -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Tue, Jan 30, 2007 at 02:18:17AM +0100, Samuel Thibault ([EMAIL PROTECTED]) wrote: > Hi, > > Evgenity, le Mon 29 Jan 2007 16:47:36 +0100, a écrit : > > Userspace M-on-N threading model is based on the idea, that when signal > > is delivered, kernel saves all information related to previous context > > in stack, so it is possible to find it and replace. > > You may want to have a look at some existing implementations: I saw most of them. As far as I recall, only PTL (is not shown here) has preemptible scheduler. NTL has it too, but is based on different approach. > - Good old `FSU Pthreads' http://moss.csc.ncsu.edu/~mueller/pthreads/ > - fully POSIX-compliant `GnuPth' http://www.gnu.org/software/pth/ > - server-targetted `Capriccio' > www.cs.berkeley.edu/~jcondit/capriccio-sosp-2003.pdf > - efficient `ELiTE/Erlangen' > http://www4.informatik.uni-erlangen.de/Projects/FORTWIHR/ELiTE/ > - and our portable, flexible, efficient `Marcel' > http://runtime.futurs.inria.fr/marcel/ > > Samuel -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
On Mon, Jan 29, 2007 at 10:40:42AM -0600, Chris Friesen ([EMAIL PROTECTED]) wrote: > Evgeniy Polyakov wrote: > >Hello. > > > >I'm pleased to announce initial userspace M-on-N threading model > >implementation (for hackers) called NTL. > > If you haven't already, I suggest you look into the story of NGPT and > also read the NPTL white paper > (http://people.redhat.com/drepper/nptl-design.pdf) especially section > 5.1 describing why they went with a 1:1 model. Of course I read this, but it does not change anything. NGPT had 2 times slower start time than NPTL, and NTL has 2-20 times faster than NPTL, so I think NGPT had too major problems to get it into comparison. I described in details why and how M:N model better, and its drawbacks include all issues mentioned by Ulrich Drepper, but nevertheless its advantages are far too superiour than those which can be provided by 1:1 model. > Chris -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Hi, Evgenity, le Mon 29 Jan 2007 16:47:36 +0100, a écrit : > Userspace M-on-N threading model is based on the idea, that when signal > is delivered, kernel saves all information related to previous context > in stack, so it is possible to find it and replace. You may want to have a look at some existing implementations: - Good old `FSU Pthreads' http://moss.csc.ncsu.edu/~mueller/pthreads/ - fully POSIX-compliant `GnuPth' http://www.gnu.org/software/pth/ - server-targetted `Capriccio' www.cs.berkeley.edu/~jcondit/capriccio-sosp-2003.pdf - efficient `ELiTE/Erlangen' http://www4.informatik.uni-erlangen.de/Projects/FORTWIHR/ELiTE/ - and our portable, flexible, efficient `Marcel' http://runtime.futurs.inria.fr/marcel/ Samuel - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
Evgeniy Polyakov wrote: Hello. I'm pleased to announce initial userspace M-on-N threading model implementation (for hackers) called NTL. If you haven't already, I suggest you look into the story of NGPT and also read the NPTL white paper (http://people.redhat.com/drepper/nptl-design.pdf) especially section 5.1 describing why they went with a 1:1 model. Chris - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.
P.S. I'm not subscribed to any of the above lists, please Cc: me in replies. -- Evgeniy Polyakov - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[ANN] Userspace M-on-N threading model implementation. Alpha release.
Hello. I'm pleased to announce initial userspace M-on-N threading model implementation (for hackers) called NTL. This is first alpha release, which indeed has bugs and limitations. Userspace M-on-N threading model is based on the idea, that when signal is delivered, kernel saves all information related to previous context in stack, so it is possible to find it and replace. M-on-N threading model compared to usual NPTL 1-on-1 model has following advantages and disadvantages: Benefits. 1. Fast scheduling. There is no need to cross userspace/kernelspace boundary to schedule new thread execution (just watch what happens with userspace network stack compared to kernel's one when there are a lot of syscalls performed for small packets receiving/sending). 2. Fast thread creation and destruction. It just becomes an allocation of the structure in the userspace, no need for full creation process which is performed in clone() syscall. 3. Smaller number of cache misses. Since there is only one process instead of several threads, cache locality is increased greatly with reduced number of misses. Drawbacks. 1. Scheduling fairness. Since kernel does not know about multiple threads behind given process, it can not add it appropriate number of timeslices for execution. Can be solved either by more tight collaboarion of the userspace and kernelspace schedulers or simply by increasing process' nice value. 2. All communications are performed through one kevent pipe. (TODO) All syscalls are going to be converted into non-blocking operations (including nanosleep() and the like), and keep a track of what each context performed. In practice glibc rewrite is not what I would like to do, but instead some layer on top of it will be implemented, which will convert syscalls into kevent operations, and become a rescheduling point. 3. Complex code for good SMP scalability and userspace scheduler. Not a problem. (TESTING) SMP scalability in M-on-N threading model. Since only kernel can schedule thread (actually not even thread or process, but its own kernel's representation, so called kernel's virtual process) to run on specified CPU, M-on-N threading model should have several real threads (for example several current POSIX threads), its number should be equal to number of real CPUs, and then library layer will schedule execution of context of different real threads, each of which in turn can run on separate CPU. So, userspace will create new real threads when pthread_create() is called until number of them is less than number of real CPUs, each real thread in turn is a context in the global set of contexts, where fake context will be added with all subsequent pthread_create() calls, and userspace scheduler (backed by real threads) will pick up several contexts from the tree and execute them on the real CPUs. I would be possible to use existing Linux clone() syscall, but due to complete absence of hte documentation (which is sometimes plain wrong) and ery strong encryption of glibc sources it is quite complex task. As NPTL, M-on-N threading library uses stack rlimit for thread stack allocation. Benchmarks. I only ran simple benchmark of empty thread creation (its function just exits). After I started to use atomic locks ("lock" prefix on x86) instead of semaphores, thread start/empty exec/stop was reduced down to 0.3 microseconds compared to 14 microsecods for POSIX NPTL case. But there are problems. First one is that I perform initial context setup through signal invokation, which is at least two syscalls. They are slow. Another one is that thread is really started only after rescheduling, which is another signal, so another two syscalls. Third on is that there must exist different locking primitives - for signal context and for process context, which must block signals, which in turn adds additional overhead of sigprocmask() syscall. After I fixed all above issues (actually not fixed, but confirmed that they must exist), performance reduced to 9 microseconds compared to 14 microsecods for POSIX NPTL case for empty thread creation/destruction. (Test machine is Core Duo 2.4 Ghz (run at 3.7) with 2 GB of ram). This can be fixed, if I would have created arch-specific getcontext()-like calls, which would be mutually transformable into signal context information (existing getcontext() and friends produces different data than signal context has at least on x86). But I can not right now, since I do not know enough x86 ABI (I learned a lot for past several days, as you can notice from this blog, but it is still even remotely not enough). Currently M-on-N threading model uses ugly arch-specific hacks to start new threads, which actually are something remotely similar to makecontext(). So, the solution, which will rock M-on-N threading implementation is to convert or create getcontext() and friends calls which can be used with signal context information. Another limitations are: * x86 only (I do not have different te