Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-14 Thread Pavel Machek

Hi!

> >>How does that work? Switching between kernel threads requires going into 
> >>the kernel, user level thread switches are all done in user mode.
> >>
> >>Do you have some way to change o/s threads w/o going into the kernel?
> >>
> >
> >But going into kernel is not very expensive on Linux.
> >
> >On the other side, the overhead you need to add for every single syscall
> >that might block for the M:N threads and the associated complications
> >which make it far harder to conform to POSIX IMHO far outweigh the costs
> >of going into the kernel for a context switch.
> 
> That really wasn't my question, Arjan said that switching real threads 
> wasn't a context switch in the hardware sense, and I was asking if I 
> missed something. It may be cheap, but it would seem to be a context 
> switch none-the-less.

It is not reloading %cr3.

Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-04 Thread Arjan van de Ven


> > On the other side, the overhead you need to add for every single syscall
> > that might block for the M:N threads and the associated complications
> > which make it far harder to conform to POSIX IMHO far outweigh the costs
> > of going into the kernel for a context switch.
> 
> That really wasn't my question, Arjan said that switching real threads 
> wasn't a context switch in the hardware sense, and I was asking if I 
> missed something.

A hardware context switch is basically a CR3 change with the associated TLB
flush. That is the most expensive part of a context switch.
Just going into the kernel and getting out with a different EIP/ESP is
really cheap, on the order of "a few hundred cycles"; not a heck of a
lot more expensive than a simple getpid or other simple system call.


>  It may be cheap, but it would seem to be a context 
> switch none-the-less.

it includes a privilege level switch, not so much a full context
switch...

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org


Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-04 Thread Davide Libenzi

On Sun, 4 Feb 2007, Jakub Jelinek wrote:

> On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote:
> > Arjan van de Ven wrote:
> > >>Because user threading can avoid context switches, there will always be 
> > >>cases where it will outperform o/s threads for hardware reasons.
> > >
> > >actually.. switching from one "real" thread to another in Linux is not
> > >an actual context switch in the hardware sense... at least this part of
> > >your argument seems to be incorrect ;)
> > >
> > How does that work? Switching between kernel threads requires going into 
> > the kernel, user level thread switches are all done in user mode.
> > 
> > Do you have some way to change o/s threads w/o going into the kernel?
> 
> But going into kernel is not very expensive on Linux.
> 
> On the other side, the overhead you need to add for every single syscall
> that might block for the M:N threads and the associated complications
> which make it far harder to conform to POSIX IMHO far outweigh the costs
> of going into the kernel for a context switch.

Agreed, definitely. A libpcl (using swapcontext(3)) cobench is about 50
times faster than a context switch measured by lmbench (although I have
serious doubts about the ability of lat_ctx to measure it - but
that's another story) on an Opteron 254. One may say "Wow! Really?!?".
The point is, who cares. We are talking about differences between
super-fast (~2us) and ultra-fast (~0.04us).
The time (and code) that you'll have to drop in the syscall path to handle
M:N is very likely to make you lose far more than you gain by avoiding
an OS context switch (you still have to do a "soft" context switch).
Either use N:N (requires locking, but spreads over multiple CPUs) or 1:N
(I/O driven state machines or coroutines - no locking, one-CPU bound).
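
For readers unfamiliar with swapcontext(3), the following is a minimal sketch of a user-level context switch of the kind a coroutine benchmark exercises. It is not libpcl's code; the names `ping_pong`, `coroutine`, and the 64 KiB stack size are assumptions made for illustration.

```c
#include <assert.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)   /* assumed stack size for the coroutine */

static ucontext_t main_ctx, co_ctx;
static char co_stack[STACK_SIZE];
static long co_runs;

/* Coroutine body: count one run, then switch back to the caller. */
static void coroutine(void)
{
    for (;;) {
        co_runs++;
        swapcontext(&co_ctx, &main_ctx);
    }
}

/* Switch into the coroutine `rounds` times; returns how often it ran. */
static long ping_pong(long rounds)
{
    assert(rounds >= 0);
    co_runs = 0;

    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, coroutine, 0);

    for (long i = 0; i < rounds; i++)
        swapcontext(&main_ctx, &co_ctx);
    return co_runs;
}
```

Each loop iteration performs two user-level switches (into the coroutine and back), so timing `ping_pong()` around a large round count gives a rough per-switch figure.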

- Davide


Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-04 Thread Bill Davidsen


Jakub Jelinek wrote:
> On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote:
> > Arjan van de Ven wrote:
> > >>Because user threading can avoid context switches, there will always be
> > >>cases where it will outperform o/s threads for hardware reasons.
> > >
> > >actually.. switching from one "real" thread to another in Linux is not
> > >an actual context switch in the hardware sense... at least this part of
> > >your argument seems to be incorrect ;)
> > >
> > How does that work? Switching between kernel threads requires going into
> > the kernel, user level thread switches are all done in user mode.
> >
> > Do you have some way to change o/s threads w/o going into the kernel?
>
> But going into kernel is not very expensive on Linux.
>
> On the other side, the overhead you need to add for every single syscall
> that might block for the M:N threads and the associated complications
> which make it far harder to conform to POSIX IMHO far outweigh the costs
> of going into the kernel for a context switch.

That really wasn't my question, Arjan said that switching real threads
wasn't a context switch in the hardware sense, and I was asking if I
missed something. It may be cheap, but it would seem to be a context
switch none-the-less.


--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979


Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-04 Thread Jakub Jelinek

On Sun, Feb 04, 2007 at 03:12:32PM -0500, Bill Davidsen wrote:
> Arjan van de Ven wrote:
> >>Because user threading can avoid context switches, there will always be 
> >>cases where it will outperform o/s threads for hardware reasons.
> >
> >actually.. switching from one "real" thread to another in Linux is not
> >an actual context switch in the hardware sense... at least this part of
> >your argument seems to be incorrect ;)
> >
> How does that work? Switching between kernel threads requires going into 
> the kernel, user level thread switches are all done in user mode.
> 
> Do you have some way to change o/s threads w/o going into the kernel?

But going into kernel is not very expensive on Linux.

On the other side, the overhead you need to add for every single syscall
that might block for the M:N threads and the associated complications
which make it far harder to conform to POSIX IMHO far outweigh the costs
of going into the kernel for a context switch.

Jakub

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-04 Thread Bill Davidsen


Arjan van de Ven wrote:
> > Because user threading can avoid context switches, there will always be
> > cases where it will outperform o/s threads for hardware reasons.
>
> actually.. switching from one "real" thread to another in Linux is not
> an actual context switch in the hardware sense... at least this part of
> your argument seems to be incorrect ;)

How does that work? Switching between kernel threads requires going into
the kernel, user level thread switches are all done in user mode.

Do you have some way to change o/s threads w/o going into the kernel?

--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-03 Thread Arjan van de Ven


> Because user threading can avoid context switches, there will always be 
> cases where it will outperform o/s threads for hardware reasons.

actually.. switching from one "real" thread to another in Linux is not
an actual context switch in the hardware sense... at least this part of
your argument seems to be incorrect ;)

> 
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org


Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-02-02 Thread Bill Davidsen


Kaz Kylheku wrote:
> Evgeniy Polyakov wrote:
> > I described in detail why and how the M:N model is better, and its
> > drawbacks include all the issues mentioned by Ulrich Drepper, but
> > nevertheless its advantages are far superior to those which can be
> > provided by the 1:1 model.
>
> M:N threading is an unnecessary performance hack that's needed by people
> who are living in a C or C++ exile away from some language that has
> lexical closures, generators or first-class continuations.

Yes, that's called the "real world." Arguments of the "I don't need it,
in a perfect world you wouldn't either, therefore it's a bad idea" type
simply contribute nothing.

Because user threading can avoid context switches, there will always be
cases where it will outperform o/s threads for hardware reasons.


--
Bill Davidsen <[EMAIL PROTECTED]>
  "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-31 Thread Evgeniy Polyakov

On Wed, Jan 31, 2007 at 11:28:07PM -0500, Lee Revell ([EMAIL PROTECTED]) wrote:
> On 1/29/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:
> >1. Scheduling fairness.
> >Since the kernel does not know about the multiple threads behind a given
> >process, it cannot give it an appropriate number of timeslices for execution.
> >This can be solved either by tighter collaboration between the userspace and
> >kernelspace schedulers or simply by increasing the process' nice value.
> 
> nice value is only meaningful for SCHED_OTHER.  How will you handle a
> multithreaded realtime application that uses SCHED_OTHER as well as
> SCHED_FIFO threads?

Threads created inside one process obviously cannot compete with RT
threads created by another process. Instead, a process whose threads have
RT priority should itself change its priority to RT to compete with other
RT processes. (By RT I mean any policy except SCHED_OTHER, which is the
default.)

> Lee

-- 
Evgeniy Polyakov

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-31 Thread Lee Revell


On 1/29/07, Evgeniy Polyakov <[EMAIL PROTECTED]> wrote:

> 1. Scheduling fairness.
> Since the kernel does not know about the multiple threads behind a given
> process, it cannot give it an appropriate number of timeslices for execution.
> This can be solved either by tighter collaboration between the userspace and
> kernelspace schedulers or simply by increasing the process' nice value.


nice value is only meaningful for SCHED_OTHER.  How will you handle a
multithreaded realtime application that uses SCHED_OTHER as well as
SCHED_FIFO threads?
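
One way to see the distinction between the policies in code (added for illustration, not part of the original message): on Linux, SCHED_OTHER has a single static priority of 0 and is tuned through nice, while SCHED_FIFO has a real static priority range. The helper name below is hypothetical, and the result for SCHED_OTHER reflects Linux behaviour specifically.

```c
#include <assert.h>
#include <sched.h>

/* Returns 1 if `policy` offers a range of static priorities to choose
 * from, 0 otherwise. Hypothetical helper for illustration only. */
static int policy_has_static_priorities(int policy)
{
    assert(policy >= 0);

    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    return lo >= 0 && hi > lo;
}
```

On Linux this reports a usable range for SCHED_FIFO and none for SCHED_OTHER, which is why a userspace scheduler cannot express realtime priorities through nice alone.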

Lee

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-31 Thread Evgeniy Polyakov

On Tue, Jan 30, 2007 at 01:16:22PM -0800, Kaz Kylheku ([EMAIL PROTECTED]) wrote:
> Evgeniy Polyakov wrote:
> > I described in detail why and how the M:N model is better, and its
> > drawbacks include all the issues mentioned by Ulrich Drepper, but
> > nevertheless its advantages are far superior to those which can be
> > provided by the 1:1 model.
> 
> M:N threading is an unnecessary performance hack that's needed by people
> who are living in a C or C++ exile away from some language that has
> lexical closures, generators or first-class continuations. Not having
> these niceties, they resort to emulating them with threads. The proper
> thing to do is to rewrite the code to use state machines which can be
> driven by any available thread. Or else, write yourself a
> source-to-source transformer that will give C the lexical closure,
> generator, or continuation features that you need to express the
> solution that way.
> 
> There is no need to retain any vestiges of a user space threading
> implementation when you have the real thing.
> 
> Programs which appear to benefit from that model are badly optimized or
> badly designed. A smartly written program uses an available thread to do
> as much work as possible, until that thread happens to block or its time
> slice burns up.

Do not mix languages like Erlang, specially designed for concurrent
programming, with the M:N threading model - they are completely different,
but you do not want to see this. As you pointed out, one thread can do as
much as it needs until it is blocked, and what next? Allocate a new real
thread? You may want to see how things like the JVM work; I seriously doubt
that spawning a new thread each time a task blocks is the way to go. Even
having epoll does not help in many cases. And you forgot the price of
rescheduling in kernelspace versus userspace - even with signals it differs
by a factor of two, and in a more intelligent case by a factor of 20!
A virtual machine can have thousands of threads; actually it can't, since
that would kill Linux in rescheduling.

-- 
Evgeniy Polyakov

RE: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-30 Thread Kaz Kylheku

Evgeniy Polyakov wrote:
> I described in detail why and how the M:N model is better, and its
> drawbacks include all the issues mentioned by Ulrich Drepper, but
> nevertheless its advantages are far superior to those which can be
> provided by the 1:1 model.

M:N threading is an unnecessary performance hack that's needed by people
who are living in a C or C++ exile away from some language that has
lexical closures, generators or first-class continuations. Not having
these niceties, they resort to emulating them with threads. The proper
thing to do is to rewrite the code to use state machines which can be
driven by any available thread. Or else, write yourself a
source-to-source transformer that will give C the lexical closure,
generator, or continuation features that you need to express the
solution that way.

There is no need to retain any vestiges of a user space threading
implementation when you have the real thing.

Programs which appear to benefit from that model are badly optimized or
badly designed. A smartly written program uses an available thread to do
as much work as possible, until that thread happens to block or its time
slice burns up.
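
To make the state-machine suggestion concrete, here is a minimal sketch (added for illustration; it is not from the original message). It assumes a made-up protocol in which one length byte is followed by that many body bytes. The names `conn`, `conn_step`, and the states are hypothetical. Because all progress lives in the struct, any available thread can call the step function whenever more input arrives.

```c
#include <assert.h>
#include <stddef.h>

enum conn_state { READ_HEADER, READ_BODY, DONE };

struct conn {
    enum conn_state state;
    size_t body_left;          /* body bytes still expected */
};

/* Feed one chunk of input to the machine; returns the bytes consumed.
 * No thread-local or stack state survives between calls. */
static size_t conn_step(struct conn *c, const unsigned char *buf, size_t len)
{
    size_t used = 0;

    assert(c != NULL);
    while (used < len && c->state != DONE) {
        if (c->state == READ_HEADER) {
            c->body_left = buf[used++];
            c->state = c->body_left ? READ_BODY : DONE;
        } else {               /* READ_BODY */
            size_t take = len - used;
            if (take > c->body_left)
                take = c->body_left;
            used += take;
            c->body_left -= take;
            if (c->body_left == 0)
                c->state = DONE;
        }
    }
    return used;
}
```

A worker thread would call `conn_step()` after each successful non-blocking read and move on to another connection when no input is ready.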

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-30 Thread Samuel Thibault

Evgeniy Polyakov, on Tue 30 Jan 2007 12:53:16 +0300, wrote:
> > You may want to have a look at some existing implementations:
> 
> I saw most of them.
> As far as I recall, only PTL (not listed here) has a preemptible
> scheduler. NTL has one too, but it is based on a different approach.

Marcel has a preemptible scheduler too, based on signals and
setjmp/longjmp.

Samuel

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-30 Thread Evgeniy Polyakov

On Tue, Jan 30, 2007 at 11:24:51AM +0100, Samuel Thibault ([EMAIL PROTECTED]) 
wrote:
> Evgeniy Polyakov, on Tue 30 Jan 2007 12:53:16 +0300, wrote:
> > > You may want to have a look at some existing implementations:
> > 
> > I saw most of them.
> > As far as I recall, only PTL (not listed here) has a preemptible
> > scheduler. NTL has one too, but it is based on a different approach.
> 
> Marcel has a preemptible scheduler too, based on signals and
> setjmp/longjmp.

Do any documentation and benchmarks exist for that library? The site seems
to describe only the environment it was created for. How is the blocking
problem solved?

> Samuel

-- 
Evgeniy Polyakov

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-30 Thread Evgeniy Polyakov

On Tue, Jan 30, 2007 at 02:18:17AM +0100, Samuel Thibault ([EMAIL PROTECTED]) 
wrote:
> Hi,
> 
> Evgeniy, on Mon 29 Jan 2007 16:47:36 +0100, wrote:
> > Userspace M-on-N threading model is based on the idea that when a signal
> > is delivered, the kernel saves all information related to the previous
> > context on the stack, so it is possible to find and replace it.
> 
> You may want to have a look at some existing implementations:

I saw most of them.
As far as I recall, only PTL (not listed here) has a preemptible
scheduler. NTL has one too, but it is based on a different approach.

> - Good old `FSU Pthreads' http://moss.csc.ncsu.edu/~mueller/pthreads/
> - fully POSIX-compliant `GnuPth' http://www.gnu.org/software/pth/
> - server-targeted `Capriccio'
> www.cs.berkeley.edu/~jcondit/capriccio-sosp-2003.pdf
> - efficient `ELiTE/Erlangen'
> http://www4.informatik.uni-erlangen.de/Projects/FORTWIHR/ELiTE/
> - and our portable, flexible, efficient `Marcel'
> http://runtime.futurs.inria.fr/marcel/
> 
> Samuel

-- 
Evgeniy Polyakov

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-30 Thread Evgeniy Polyakov

On Mon, Jan 29, 2007 at 10:40:42AM -0600, Chris Friesen ([EMAIL PROTECTED]) 
wrote:
> Evgeniy Polyakov wrote:
> >Hello.
> >
> >I'm pleased to announce initial userspace M-on-N threading model
> >implementation (for hackers) called NTL. 
> 
> If you haven't already, I suggest you look into the story of NGPT and 
> also read the NPTL white paper 
> (http://people.redhat.com/drepper/nptl-design.pdf) especially section 
> 5.1 describing why they went with a 1:1 model.

Of course I read this, but it does not change anything.

NGPT had a 2 times slower start time than NPTL, while NTL is 2-20 times
faster than NPTL, so I think NGPT had problems too serious for it to be
a fair comparison.

I described in detail why and how the M:N model is better, and its
drawbacks include all the issues mentioned by Ulrich Drepper, but
nevertheless its advantages are far superior to those which can be
provided by the 1:1 model.

> Chris

-- 
Evgeniy Polyakov

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-29 Thread Samuel Thibault

Hi,

Evgeniy, on Mon 29 Jan 2007 16:47:36 +0100, wrote:
> Userspace M-on-N threading model is based on the idea that when a signal
> is delivered, the kernel saves all information related to the previous
> context on the stack, so it is possible to find and replace it.

You may want to have a look at some existing implementations:

- Good old `FSU Pthreads' http://moss.csc.ncsu.edu/~mueller/pthreads/
- fully POSIX-compliant `GnuPth' http://www.gnu.org/software/pth/
- server-targeted `Capriccio'
www.cs.berkeley.edu/~jcondit/capriccio-sosp-2003.pdf
- efficient `ELiTE/Erlangen'
http://www4.informatik.uni-erlangen.de/Projects/FORTWIHR/ELiTE/
- and our portable, flexible, efficient `Marcel'
http://runtime.futurs.inria.fr/marcel/

Samuel

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-29 Thread Chris Friesen


Evgeniy Polyakov wrote:
> Hello.
>
> I'm pleased to announce initial userspace M-on-N threading model
> implementation (for hackers) called NTL.

If you haven't already, I suggest you look into the story of NGPT and
also read the NPTL white paper
(http://people.redhat.com/drepper/nptl-design.pdf) especially section
5.1 describing why they went with a 1:1 model.


Chris

Re: [ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-29 Thread Evgeniy Polyakov

P.S. I'm not subscribed to any of the above lists, please Cc: me in
replies.


-- 
Evgeniy Polyakov

[ANN] Userspace M-on-N threading model implementation. Alpha release.

2007-01-29 Thread Evgeniy Polyakov

Hello.

I'm pleased to announce an initial userspace M-on-N threading model
implementation (for hackers) called NTL.
This is the first alpha release, which indeed has bugs and limitations.

Userspace M-on-N threading model is based on the idea that when a signal
is delivered, the kernel saves all information related to the previous
context on the stack, so it is possible to find and replace it.

Compared to the usual NPTL 1-on-1 model, the M-on-N threading model has the
following advantages and disadvantages:

Benefits. 

1. Fast scheduling. 
There is no need to cross userspace/kernelspace boundary to schedule new 
thread execution (just watch what happens with userspace network stack 
compared to kernel's one when there are a lot of syscalls performed for 
small packets receiving/sending).

2. Fast thread creation and destruction.
It just becomes an allocation of the structure in the userspace, no need
for full creation process which is performed in clone() syscall.

3. Smaller number of cache misses.
Since there is only one process instead of several threads, cache
locality is increased greatly with reduced number of misses.

Drawbacks.

1. Scheduling fairness.
Since the kernel does not know about the multiple threads behind a given
process, it cannot give it an appropriate number of timeslices for execution.
This can be solved either by tighter collaboration between the userspace and
kernelspace schedulers or simply by increasing the process' nice value.

2. All communications are performed through one kevent pipe. (TODO)
All syscalls are going to be converted into non-blocking operations
(including nanosleep() and the like), and the library will keep track of
what each context performed. In practice, a glibc rewrite is not what I
would like to do; instead some layer on top of it will be implemented, which
will convert syscalls into kevent operations and become a rescheduling
point.
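
A rough sketch of what such a layer might look like for read(), added for illustration only. It uses plain O_NONBLOCK rather than kevent, and `read_or_yield`, `yield_fn`, and the demo helpers are hypothetical names; the yield hook stands in for the userspace scheduler.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

typedef void (*yield_fn)(void);   /* hypothetical scheduler hook */

/* read() that never blocks the real thread: when no data is ready it
 * calls `yield` so another userspace context can run, then retries. */
static ssize_t read_or_yield(int fd, void *buf, size_t len, yield_fn yield)
{
    int flags = fcntl(fd, F_GETFL);

    assert(yield != NULL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, len);
        if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
            return n;
        yield();               /* rescheduling point */
    }
}

/* Demo: the "other context" is simulated by a hook that writes a byte. */
static int demo_write_fd;
static int demo_yields;

static void demo_yield(void)
{
    demo_yields++;
    if (write(demo_write_fd, "x", 1) != 1)
        demo_yields = -1;
}

/* Returns how many times the reader had to yield before data arrived. */
static int demo_read_or_yield(void)
{
    int p[2];
    char c = 0;

    if (pipe(p) != 0)
        return -1;
    demo_write_fd = p[1];
    demo_yields = 0;
    ssize_t n = read_or_yield(p[0], &c, 1, demo_yield);
    close(p[0]);
    close(p[1]);
    return (n == 1 && c == 'x') ? demo_yields : -1;
}
```

A real layer would register the descriptor with an event mechanism instead of retrying, so that the context is only rescheduled once data is actually available.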

3. Complex code for good SMP scalability and userspace scheduler.
Not a problem. (TESTING)

SMP scalability in M-on-N threading model.

Since only the kernel can schedule a thread (actually not even a thread or
process, but its own representation of one, the so-called kernel virtual
process) to run on a specified CPU, the M-on-N threading model should have
several real threads (for example, current POSIX threads), equal in
number to the real CPUs. The library layer will then schedule execution
of contexts on the different real threads, each of which in turn can run
on a separate CPU.

So, userspace will create new real threads when pthread_create() is
called while their number is less than the number of real CPUs. Each real
thread is in turn a context in the global set of contexts, to which a fake
context is added by every subsequent pthread_create() call, and the
userspace scheduler (backed by real threads) will pick up several
contexts from the tree and execute them on the real CPUs.
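
The creation policy described above can be sketched as follows. This is an illustration under assumptions, not NTL's code: `ctx_kind`, `choose_kind`, and `online_cpus` are hypothetical names, and the CPU count comes from `sysconf(_SC_NPROCESSORS_ONLN)`, a common glibc extension.

```c
#include <assert.h>
#include <unistd.h>

enum ctx_kind { CTX_REAL_THREAD, CTX_USER_CONTEXT };

/* Number of CPUs currently online, never less than 1. */
static long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Decide what a pthread_create()-like call should create: a real
 * kernel thread while there are fewer of them than CPUs, otherwise a
 * userspace-only ("fake") context. */
static enum ctx_kind choose_kind(long real_threads, long cpus)
{
    assert(real_threads >= 0 && cpus >= 1);
    return real_threads < cpus ? CTX_REAL_THREAD : CTX_USER_CONTEXT;
}
```

With two CPUs, the first two creations would yield real threads and every later one a userspace context queued for the scheduler.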

It would be possible to use the existing Linux clone() syscall, but due to
the complete absence of documentation (which is sometimes plain wrong)
and the very strong encryption of the glibc sources it is quite a complex task.

Like NPTL, the M-on-N threading library uses the stack rlimit for thread
stack allocation.
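
For reference, the stack rlimit can be queried as in the sketch below (added for illustration; the helper name and the fallback parameter are assumptions, not part of NTL).

```c
#include <assert.h>
#include <sys/resource.h>

/* Soft RLIMIT_STACK in bytes, or `fallback` when the limit is
 * unlimited or cannot be read. Hypothetical helper. */
static rlim_t default_stack_size(rlim_t fallback)
{
    struct rlimit rl;

    assert(fallback > 0);
    if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
        return fallback;
    return rl.rlim_cur;
}
```

The fallback matters because the soft limit may be set to unlimited, in which case a threading library still has to pick a finite stack size.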

Benchmarks.

I only ran a simple benchmark of empty thread creation (its function just
exits).
After I started to use atomic locks ("lock" prefix on x86) instead of
semaphores, thread start/empty exec/stop was reduced to 0.3
microseconds compared to 14 microseconds for the POSIX NPTL case.

But there are problems.
The first one is that I perform initial context setup through signal
invocation, which is at least two syscalls. They are slow.
Another one is that a thread is really started only after rescheduling,
which is another signal, so another two syscalls.
The third one is that there must exist different locking primitives - for
signal context and for process context - which must block signals, which
in turn adds the overhead of a sigprocmask() syscall.

After I fixed all the above issues (actually not fixed, but confirmed that
they must exist), performance dropped to 9 microseconds compared to 14
microseconds for the POSIX NPTL case for empty thread creation/destruction.

(Test machine is a Core Duo 2.4 GHz (run at 3.7) with 2 GB of RAM).

This could be fixed if I created arch-specific
getcontext()-like calls, which would be mutually transformable into
signal context information (the existing getcontext() and friends produce
different data than the signal context has, at least on x86). But I cannot
right now, since I do not know the x86 ABI well enough (I learned a lot over
the past several days, as you can notice from this blog, but it is still not
even remotely enough).

Currently M-on-N threading model uses ugly arch-specific hacks to start
new threads, which actually are something remotely similar to
makecontext().
So, the solution, which will rock M-on-N threading implementation is to
convert or create getcontext() and friends calls which can be used with
signal context information.

Other limitations are:
* x86 only (I do not have different te
