Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Scott Lamb
Christopher Layne wrote:
> On Sun, Nov 04, 2007 at 04:23:01PM -0800, Scott Lamb wrote:
>>> It wasn't what I expected; I was fully confident at first that the
>>> thread-pool, work-queue model would be the way to go, since it's one
>>> I've implemented in many applications in the past. But the numbers said
>>> otherwise.
>> Thanks for the case study. To rephrase (hopefully correctly), you tried
>> these two models:
>>
>> 1) one thread polls and puts events on a queue; a bunch of other threads
>> pull from the queue. (resulted in high latency, and I'm not too
>> surprised...an extra context switch before handling any events.)
> 
> So back to this..
> 
>> 2) a bunch of threads read and handle events independently. (your
>> current model.)
> 
> BTW: How does this model somehow exempt itself from said context switching
> issue of the former?

Hmm, William Ahern says that at least on Linux, they only wake one
thread per event. That would explain it.

>> Did you also try the so-called "leader/follower" model, in which the
>> thread which does the polling handles the first event and puts the rest
>> on a queue; another thread takes over polling if otherwise idle while
>> the first thread is still working? My impression is this was a widely
>> favored model, though I don't know the details of where each performs best.
> 
> Something about this just seems like smoke and mirrors to me. At the end of
> the day we still only have a finite amount of CPU cores available to us and
> any amount of playing with the order of things is not going to extract any
> magical *more* throughput out of a given box. Yes, some of these methods
> influence recv/send buffers and have a cascading effect on overall throughput,
> but efficient code and algorithms are going to make the real difference - not
> goofy thread games.
> 
> (and this is coming from someone who *likes* comp.programming.threads)

Oh, I don't know, there is something to be said for not making a handoff
between threads if you can avoid it. You're not going to get more
throughput than n_cores times what you got with one processor, but I'd
expect avoiding context switches and cache bouncing to help you get
closer to that.


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Scott Lamb wrote:
>>> * What's your use case for ev_loop_new() and ev_loop_default()'s bitmask
>>> of allowed implementations?
>> libevent's unconditional use of getenv() raised concerns with me and
>> apparently some users on this list, too, so this is one way to disable
>> this (EVMETHOD_ANY instead of EVMETHOD_ALL). Also, I am sure some apps
>> want control over the allowed event loops, e.g. to rule out select because
>> it is known to be not working for them.
> 
> Ahh, I'd have to agree that getenv() seems sketchy. But with the
> interface you've supplied, you can't simply blacklist select() without

Hmm. I was going to check this out more and forgot before sending the email.

Realistically, I think unit tests and bug reports/workarounds are about
the only reason to blacklist specific event dispatchers. select() sucks,
but well, that's why it's at the bottom of the list, right?

If you are really concerned about syscall overhead (you mentioned it in
another context, though I don't think it should be a major concern), you
might want to prefer poll() to epoll() or kqueue() if you're going to
use a loop a couple times on a small number of fds then throw it away.
But that's a situation where you'd want a little more coarseness: low
startup overhead methods vs. low per-socket overhead ones. Resorting to
naming specific event mechanisms seems undesirable if older programs are
to take advantage of newer event facilities.


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Marc Lehmann wrote:
>> That seems rather undesirable for many applications.
> 
> Well, it's arguably better than libevent's behaviour, which is simply
> returning from event_loop, leaving the app unclear on what has happened
> and what to do.
> 
> In any case, you can get the same behaviour as libevent by calling unloop
> in case of an error, so the interface is strictly more powerful.

Another reason this is undesirable: to get the libevent behavior, I'd
have to put this logic in *every callback*.
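
For illustration, the boilerplate in question would look roughly like this. This is
a sketch only, assuming the watcher-callback signature and an ev_unloop()/EVUNLOOP_ALL
pair as discussed in this thread; the exact names are assumptions, not a confirmed API:

    #include <ev.h>   /* assumed header name */

    /* Sketch: to reproduce libevent's "return from the loop on error"
       behaviour, every callback would need a check like this. */
    static void my_read_cb (struct ev_loop *loop, struct ev_io *w, int revents)
    {
        if (revents & EV_ERROR)
        {
            ev_unloop (loop, EVUNLOOP_ALL); /* assumed name: bail out of the loop */
            return;
        }

        /* ... normal read handling ... */
    }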



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 04:09:08PM -0800, Scott Lamb <[EMAIL PROTECTED]> 
> wrote:
>> Have you seen the new Linux timerfd API? Where available, you can wait
>> for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
>> heuristics,
>
> How? I still need to detect time jumps. If my ev_periodic is to be scheduled
> every minute, on the minute, and somebody resets the time the timer needs to
> be rescheduled. With timerfd I would need to detect that and remove/insert
> the timer again.

My impression was you could create such a timer like this:

struct itimerspec it = {
    .it_value    = { .tv_sec = 0,  .tv_nsec = 0 },
    .it_interval = { .tv_sec = 60, .tv_nsec = 0 },
};

sts = timerfd(my_timerfd, CLOCK_REALTIME, TFD_TIMER_ABSTIME, &it);

and any adjustments are the kernel's responsibility. It is aware of the
adjtime() calls and such as they happen.

I have not tested this myself.
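
For reference, the timerfd interface that eventually shipped in Linux splits this
into timerfd_create()/timerfd_settime(). An equally untested sketch of an absolute,
once-a-minute CLOCK_REALTIME timer with that later API (how the kernel treats clock
jumps for such timers is not shown here):

    #include <sys/timerfd.h>
    #include <time.h>
    #include <unistd.h>

    int make_minute_timer (void)
    {
        int tfd = timerfd_create (CLOCK_REALTIME, 0);
        if (tfd < 0)
            return -1;

        time_t now = time (NULL);
        struct itimerspec its = {
            .it_value    = { .tv_sec = (now / 60 + 1) * 60, .tv_nsec = 0 }, /* next minute boundary */
            .it_interval = { .tv_sec = 60,                  .tv_nsec = 0 }, /* then every minute */
        };

        if (timerfd_settime (tfd, TFD_TIMER_ABSTIME, &its, NULL) < 0)
        {
            close (tfd);
            return -1;
        }

        return tfd; /* poll for readability; read() yields a uint64_t expiration count */
    }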

> (You might have no use for periodics for timeouts, but they are designed
> to solve this very problem :)
> 
> Besides, having a syscall per timer (or even timer change) would be an
> enormous overhead for many workloads.

Really? Syscalls aren't really *that* expensive in and of themselves,
and it'd be a rare application that makes an order of magnitude more
timerfd() calls than read()/write() calls.

> 
>> and detecting time jumps sounds like introducing a lot of extra
>> timeouts.
> 
> I don't quite see how that would happen with either timer event currently in
> libev, unless the user code forces it.
> 
>> I'd hate to see libev(ent)? show up on PowerTOP after just getting rid
>> of the 5-second timeout.
> 
> No idea what PowerTOP is, but sporadic servers might use a lot of CPU
> without the kernel ever realising it :)

PowerTOP is a Linux tool for seeing the top causes of timeouts. Recently
people are trying to make better use of processor idle states, so it's
important to not fire timeouts unnecessarily.


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Christopher Layne
On Sun, Nov 04, 2007 at 09:37:52PM -0800, Scott Lamb wrote:
> For this program, it's important to know more than that an error has
> occurred. EV_ERROR is totally inadequate. You're using it for several
> different cases. I spotted at least these three:
> 
> * malloc() failed in ev_once - transient runtime error.
> * select() failed with ENOMEM, so libev chose to kill this file
> descriptor and now is notifying userspace.
> * bad file descriptor - probably a logic error.
> 
> What is my program supposed to do? It can't distinguish them, and the
> correct behavior in each of these conditions is totally different. Also,
> in the program I'm thinking of, "libev chose to kill this file
> descriptor" probably means a network link just went down. Ergh.

Great point. It should back away and leave things alone - notifying
the caller (or whoever else is listening) "this is not acceptable to
me - I suggest you fix it - because I won't" (aka the Unix way).

> No, on error your library may have muddied the water already by screwing
> with the file descriptors. libevent also makes errors clearer by simply
> returning error from the failed function (I'm thinking of event_once()
> vs ev_once() here.)

Agreed. A library should touch as little as necessary to get the job done.

-cl


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Christopher Layne
On Sun, Nov 04, 2007 at 04:23:01PM -0800, Scott Lamb wrote:
> > It wasn't what I expected; I was fully confident at first that the
> > thread-pool, work-queue model would be the way to go, since it's one
> > I've implemented in many applications in the past. But the numbers said
> > otherwise.
> 
> Thanks for the case study. To rephrase (hopefully correctly), you tried
> these two models:
> 
> 1) one thread polls and puts events on a queue; a bunch of other threads
> pull from the queue. (resulted in high latency, and I'm not too
> surprised...an extra context switch before handling any events.)

So back to this..

> 2) a bunch of threads read and handle events independently. (your
> current model.)

BTW: How does this model somehow exempt itself from said context switching
issue of the former?

> Did you also try the so-called "leader/follower" model, in which the
> thread which does the polling handles the first event and puts the rest
> on a queue; another thread takes over polling if otherwise idle while
> the first thread is still working? My impression is this was a widely
> favored model, though I don't know the details of where each performs best.

Something about this just seems like smoke and mirrors to me. At the end of
the day we still only have a finite amount of CPU cores available to us and
any amount of playing with the order of things is not going to extract any
magical *more* throughput out of a given box. Yes, some of these methods
influence recv/send buffers and have a cascading effect on overall throughput,
but efficient code and algorithms are going to make the real difference - not
goofy thread games.

(and this is coming from someone who *likes* comp.programming.threads)

-cl


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb <[EMAIL PROTECTED]> 
> wrote:
>>> * multiple watchers can wait for the same event, there is no limitation
>>>   to one or two watchers for signals and io.
>> Could you give me an example of where that is important?
> 
> Mostly in environments using some form of garbage collection. For example,
> this idiom is common in Perl:
> 
>$event = EV::io $fd, ...
> 
> If $event happens to contain an old watcher and $fd happens to refer to
> the same fd as that old watcher, this will lead to problems as the
> watchers are both alive for a short time.
> 
> There is actually a lot of code that relies on this just working, and the
> only other event loop I know that has a problem with this is Tk.

Ugh, I'd argue that idiom is broken. But if the support's free, I guess
it doesn't matter.

>>> * there are two types of timers, based on real time differences and wall
>>>   clock time (cron-like). timers can also be repeating and be reset at
>>>   almost no cost (for idle timeouts used by many network servers). time 
>>> jumps
>>>   get detected reliably in both directions with or without a monotonic 
>>> clock.
>> (See my other mail about Linux's new timerfd facility.)
> 
> (timerfd unfortunately makes little sense for this, as it adds overhead
> but I can't see the compelling advantage, as one will still run into the
> same time jump problems with periodic timers).
> 
>> Nice; repeating and absolute timers have come up several times before, too.
> 
> This was something I always missed in event loops. That is, some event
> loops have one timer type, some the other, but never both.
> 
>>> * timers are managed by a priority queue (O(1) for important operations
>>>   as opposed to O(log n) in libevent, also resulting in much simpler code).
>> In terms of concrete data types, you appear to have used a binary heap?
>> So by "important operations" you mean removal, correct?
> 
> removal: O(log n)
> insertion: O(log n)
> find next: O(1)
> 
>> still O(log n)? The asymptotic behavior is no different, then, as
>> insertion happens at least as often as removal.
> 
> Yes, but:
> 
> a) finding the next timer is a constant time issue
> b) a red-black tree is more than three times as slow
> 
> (see the updated benchmark at http://libev.schmorp.de/bench.html,
> especially the difference between the first (no timers) and the second
> examples (timers in use))

Ahh, very nice benchmarks.

> 
>>> * I added idle watchers, pid watchers and hook watchers into the event loop,
>>>   as is required for integration of other event-based libraries, without
>>>   having to force the use of some construct around event_loop.
>> Pardon my ignorance, but what are hook watchers?
> 
> if you want to plug other event-based libraries into the event loop you
> need to be able to hook into the event loop. this is what those
> watcher types provide.
> 
> the alternative would be to write your own event_loop with EV_NONBLOCK, but
> that isn't modular, that is, if you have two different software modules
> having their own event_loop you *must* use, you lose. prepare/check watchers
> solve this problem nicely.
> 
> A number of event loops have them, and they are useful for other things,
> such as transparently integrating coroutine packages etc.
> 
> It's not a killer feature, just very very useful in some cases.
> 
>> pid watchers I assume to be a fancy SIGCHLD handler?
> 
> Yes.
> 
>> That's a potentially useful feature, but why would it require a
>> construct around event_loop?
> 
> I don't understand that, there is no construct around event_loop, it's handled
> completely separately.

My question was in response to your "I added idle watchers, pid watchers
and hook watchers into the event loop, as is required for integration of
other event-based libraries, without having to force the use of some
construct around event_loop."

> The reason it exists is to allow sharing this potentially unsharable
> resource. For example, poll and select let you do "everything" (with fds),
> but you can of course only have one component per (single-thread) process
> using them, as they are blocking.
> 
> The same thing is true for signals: you can't share them with sigaction, as
> sigaction only allows one user.
> 
> And the same thing is true for sigchld.

Yes, I could see why sharing SIGCHLD would be useful. I was thinking of
this when asking above about when you would want multiple watchers for the
same event, but this was the only example I could think of off-hand, so
it seemed like two features to address the same use case.

>> A few notes:
>>
>> * what is the purpose of EV_COMMON?
> 
> Allowing customised event watchers. If you are concerned, treat it as an
> internal symbol. Its use is documented here:
> http://cvs.schmorp.de/libev/README.embed
> 
>> From first glance, I'm concerned that it could not be used properly
>> unless libev.so and all callers are compiled with the same flags, which

Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread William Ahern
On Sun, Nov 04, 2007 at 07:56:36PM -0800, William Ahern wrote:
> On Mon, Nov 05, 2007 at 03:29:34AM +0100, Marc Lehmann wrote:
> > On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne <[EMAIL 
> > PROTECTED]> wrote:
> 
> > > Which isn't really that big a deal as similar time is spent in the 
> > > present RB
> > > implementation as it is.
> > 
> > No, I still maintain that the RB tree is slower because its rebalancing
> > operations are frequent and very complex. Heap code is trivial. Yes, they
> > have the same asymptotic growth behaviour, but the practical cases are
> > all very far away from infinity, and the hidden C in O(log n) is quite
> > important.
> > 
> 
> RB balancing isn't that complex. Maybe you're thinking of AVL trees?
> 
> The problem with using heaps in network software is you must be careful
> adversaries cannot dictate any of the parameters. Certainly when you're

Ignore this. I'm confusing heaps with hashes.




Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread William Ahern
On Mon, Nov 05, 2007 at 03:29:34AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne <[EMAIL 
> PROTECTED]> wrote:

> > Which isn't really that big a deal as similar time is spent in the present 
> > RB
> > implementation as it is.
> 
> No, I still maintain that the RB tree is slower because its rebalancing
> operations are frequent and very complex. Heap code is trivial. Yes, they
> have the same asymptotic growth behaviour, but the practical cases are
> all very far away from infinity, and the hidden C in O(log n) is quite
> important.
> 

RB balancing isn't that complex. Maybe you're thinking of AVL trees?

The problem with using heaps in network software is you must be careful
adversaries cannot dictate any of the parameters. Certainly when you're
dealing with timers triggered by I/O latencies you've at least theoretically
exposed yourself to complexity attacks. (This is a guess based on intuition;
I haven't yet looked at your code.)

Nice thing about RB trees is you get a cap on worst case performance, smooth
cost spreading (i.e. no rehashing; not that it would be needed here), and
don't have to worry about pathological or malicious scenarios.

Sure you pay a small price. But for peace of mind it's well worth it.



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Christopher Layne
On Mon, Nov 05, 2007 at 02:46:36AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 04:09:08PM -0800, Scott Lamb <[EMAIL PROTECTED]> 
> wrote:
> > Have you seen the new Linux timerfd API? Where available, you can wait
> > for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
> > heuristics,
>
> How? I still need to detect time jumps. If my ev_periodic is to be scheduled
> every minute, on the minute, and somebody resets the time the timer needs to
> be rescheduled. With timerfd I would need to detect that and remove/insert
> the timer again.
> 
> (You might have no use for periodics for timeouts, but they are designed
> to solve this very problem :)

timerfd() has good and redundant points... as far as I can tell, it's an
inversion of user<>kernel code that results in the same goal.

http://lwn.net/Articles/245688/

-cl


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne <[EMAIL PROTECTED]> 
wrote:
> My point was that an event_del() on an event which has been called before its
> timer has expired *or* an event_add() with a new timer will require
> reheapifying at least part of the timer heap.

Hmm, I do not see why this would ever be the case. Removing a timer that
hasn't expired yet might actually be much cheaper than removing the top
element, because it isn't at the root, so the n in the O(log n) is
potentially much smaller.

> Having an intrinsically sorted collection of elements and then altering
> a value within the middle of said collection before it has sifted to the
> top will require a reheap from that point on.

Not sure what you mean by "re-heap", but the operation you describe is the
same O(log n) operation as for removing elements elsewhere in the heap. And
given that you take the last timer and insert it at that point, removing a
timer that hasn't expired is usually faster.

> Which isn't really that big a deal as similar time is spent in the present RB
> implementation as it is.

No, I still maintain that the RB tree is slower because its rebalancing
operations are frequent and very complex. Heap code is trivial. Yes, they
have the same asymptotic growth behaviour, but the practical cases are
all very far away from infinity, and the hidden C in O(log n) is quite
important.

(Again, please refer to the benchmark at
http://libev.schmorp.de/bench.html which directly contrasts behaviour
of libevent and libev with timers and no timers, especially look at the
difference in runtime when timers are being used).

> I'm all for a binary-heap rather than an RB-tree personally. I think the
> performance will benefit primarily for heap_min() (which is done before every
> re-entry into the event backend to reset the max-wait timer for
> epoll/poll/select, etc).

I thought so, too, until recently but in fact the event loop is run pretty
rarely (except in benchmarks), and if you handle hundreds of clients
within each run (very typical of busy servers), then you can have hundreds
of timer updates, and these do show up in timing measurements.



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Mon, Nov 05, 2007 at 02:42:16AM +0100, Marc Lehmann <[EMAIL PROTECTED]> 
wrote:
> However, libev has an interface (ev_timer_again) that incrementally
> updates the heap. Also, for repeating timers in general, there is no
> removal/insertion but only an incremental update.

Oh, and sorry for always forgetting stuff... the rationale for supporting
this operation is that I think it's pretty important to support timers that
get reset all the time without usually firing (e.g. idle read timeouts on
a normally busy TCP connection).

The other rationale is that it's trivial to implement with a heap, because you
already have all the code to do it:

  /* incremental timer update in ev_timer_again: */

  w->at = mn_now + w->repeat;
  downheap (timers, timercnt, w->active - 1);

  /* compare with timer removal: */

  timers [w->active - 1] = timers [--timercnt];
  downheap (timers, timercnt, w->active - 1);

In such a case (updating a timer) the event will simply wander down from
its current place in the heap to its new place.
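
For readers who haven't seen it, the sift-down really is only a handful of lines.
A simplified sketch of the idea (a 0-based min-heap of pointers keyed on ->at;
not libev's actual layout):

    struct timer { double at; /* expiry time; other fields omitted */ };

    /* Move heap[i] down until the min-heap property holds again. */
    static void downheap (struct timer **heap, int n, int i)
    {
        struct timer *t = heap[i];

        for (;;)
        {
            int child = 2 * i + 1;                      /* left child */
            if (child >= n)
                break;
            if (child + 1 < n && heap[child + 1]->at < heap[child]->at)
                ++child;                                /* pick the smaller child */
            if (t->at <= heap[child]->at)
                break;                                  /* heap property restored */
            heap[i] = heap[child];
            i = child;
        }

        heap[i] = t;
    }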

I am not sure whether this can be done with an rb-tree (likely), but I am
sure that I do not want to have to maintain the code that does that :)

(In any case, see the timer benchmark for a good comparison of heap vs.
red-black-tree).



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Mon, Nov 05, 2007 at 02:36:27AM +0100, Marc Lehmann <[EMAIL PROTECTED]> 
wrote:
> > > * multiple watchers can wait for the same event, there is no limitation
> > >   to one or two watchers for signals and io.
> > 
> > Could you give me an example of where that is important?
> 
> There is actually a lot of code that relies on this just working, and the
> only other event loop I know that has a problem with this is Tk.

I forgot to mention that the resulting code is likely a tiny bit faster, and
certainly way less complex, than the multiple-case logic employed in some
libevent backends:

   /* detecting whether evread or evwrite are wanted is not shown */
   if (evread != NULL && !(evread->ev_events & EV_PERSIST))
       event_del(evread);
   if (evwrite != NULL && evwrite != evread &&
       !(evwrite->ev_events & EV_PERSIST))
       event_del(evwrite);

this juggling of two slots for read/write didn't feel right. The code to
check which watchers want which events and schedule them in ev basically
looks like this:

  /* deliver the event to every watcher on this fd that asked for it */
  for (w = anfd->head; w; w = ((WL)w)->next)
    if ((ev = w->events & events))
      event (EV_A_ (W)w, ev);

Also, some backends do reference counting in libevent, some don't, and I
don't like completely different semantics unless there is a good technical
reason for them (epoll cannot easily detect closed fds, for example, a big
problem, but at least it's something that cannot easily be improved upon).

The goal obviously wasn't to make this ultra-efficient (it's a
singly-linked list), but to make it possible on a small scale without
mysteriously failing on some backends.



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Christopher Layne
On Mon, Nov 05, 2007 at 02:42:16AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 05:19:36PM -0800, Christopher Layne <[EMAIL 
> PROTECTED]> wrote:
> > Not to put on my O-face, but binary heap insert is *average* O(1). There
> > should be a performance win for libevent, when it comes to timer checking,
> > as using a heap will also be O(1) for heap_min() - which will benefit 
> > pending
> > timer calculation code. However, early delete w/ pending timer will need 
> > some
> > rigging or tombstone games. I wouldn't be surprised that, in a case where 
> > one
> > is consistently resetting timers (think read/write before x time passes) and
> > re-adding said event, that in the end it will be the same amount of CPU 
> > time.
> 
> No, because a red-black tree is much more complex in management, so even if
> both operations are O(log n), the heap usually wins hands down.
> 
> Both insertion and removal are of the same complexity, on average, in a
> heap, if the data is random.
> 
> However, libev has an interface (ev_timer_again) that incrementally
> updates the heap. Also, for repeating timers in general, there is no
> removal/insertion but only an incremental update.

Right, which is not due to an inherent advantage of heap vs rbtree - but due
to our luck in time always going in one direction. I believe similar code was
present in libevent as it was. This in itself should be no different.

> Regarding pending events, this is solved in the same way for all events
> (not unlike how libevent does it): There is only one place where a pending
> event can be, and that is on its associated pending list.
> 
> When an event gets stopped, and is found pending, it will be removed from
> the pending queue by nulling out its pointer.

My point was that an event_del() on an event which has been called before its
timer has expired *or* an event_add() with a new timer will require reheapifying
at least part of the timer heap. Having an intrinsically sorted collection of
elements and then altering a value within the middle of said collection before
it has sifted to the top will require a reheap from that point on.

Which isn't really that big a deal as similar time is spent in the present RB
implementation as it is.

I'm all for a binary-heap rather than an RB-tree personally. I think the
performance will benefit primarily for heap_min() (which is done before every
re-entry into the event backend to reset the max-wait timer for
epoll/poll/select, etc).
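
To make the heap_min() point concrete: with a min-heap, the pending-timer
calculation before each backend call is just a peek at the root. A sketch with
illustrative names (neither libevent's nor libev's actual internals):

    #define MAX_BLOCKTIME 60.0   /* arbitrary cap when no timer is pending */

    /* expiry_heap[0] is the earliest expiry, so heap_min() is O(1). */
    static double next_wait (const double *expiry_heap, int n, double now)
    {
        if (n == 0)
            return MAX_BLOCKTIME;            /* nothing scheduled */

        double wait = expiry_heap[0] - now;
        return wait > 0.0 ? wait : 0.0;      /* already expired: don't block */
    }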

-cl


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Sun, Nov 04, 2007 at 04:09:08PM -0800, Scott Lamb <[EMAIL PROTECTED]> wrote:
> Have you seen the new Linux timerfd API? Where available, you can wait
> for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
> heuristics,
   
How? I still need to detect time jumps. If my ev_periodic is to be scheduled
every minute, on the minute, and somebody resets the time the timer needs to
be rescheduled. With timerfd I would need to detect that and remove/insert
the timer again.

(You might have no use for periodics for timeouts, but they are designed
to solve this very problem :)
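
As a concrete example, a cron-style "every minute, on the minute" periodic would be
set up roughly like this (a sketch based on the ev_periodic interface as later
published; treat the exact signatures as assumptions in the context of this thread):

    #include <ev.h>

    static void minute_cb (struct ev_loop *loop, struct ev_periodic *w, int revents)
    {
        /* runs at every full wall-clock minute; the periodic is rescheduled
           internally if the system time is changed */
    }

    static struct ev_periodic every_minute;

    static void start_minute_watcher (struct ev_loop *loop)
    {
        /* offset 0., interval 60.: fire whenever time() % 60 == 0 */
        ev_periodic_init (&every_minute, minute_cb, 0., 60., 0);
        ev_periodic_start (loop, &every_minute);
    }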

Besides, having a syscall per timer (or even timer change) would be an
enormous overhead for many workloads.

> and detecting time jumps sound like introducing a lot of extra
> timeouts.

I don't quite see how that would happen with either timer event currently in
libev, unless the user code forces it.

> I'd hate to see libev(ent)? show up on PowerTOP after just getting rid
> of the 5-second timeout.

No idea what PowerTOP is, but sporadic servers might use a lot of CPU
without the kernel ever realising it :)



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Sun, Nov 04, 2007 at 05:19:36PM -0800, Christopher Layne <[EMAIL PROTECTED]> 
wrote:
> Not to put on my O-face, but binary heap insert is *average* O(1). There
> should be a performance win for libevent, when it comes to timer checking,
> as using a heap will also be O(1) for heap_min() - which will benefit pending
> timer calculation code. However, early delete w/ pending timer will need some
> rigging or tombstone games. I wouldn't be surprised that, in a case where one
> is consistently resetting timers (think read/write before x time passes) and
> re-adding said event, that in the end it will be the same amount of CPU time.

No, because a red-black tree is much more complex in management, so even if
both operations are O(log n), the heap usually wins hands down.

Both insertion and removal are of the same complexity, on average, in a
heap, if the data is random.

However, libev has an interface (ev_timer_again) that incrementally
updates the heap. Also, for repeating timers in general, there is no
removal/insertion but only an incremental update.

Regarding pending events, this is solved in the same way for all events
(not unlike how libevent does it): There is only one place where a pending
event can be, and that is on its associated pending list.

When an event gets stopped, and is found pending, it will be removed from
the pending queue by nulling out its pointer.

The heap insertion/removal is trivial.

(Most of the benchmark differences are, in fact, due to the heap vs.
rb-tree).



Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Marc Lehmann
On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb <[EMAIL PROTECTED]> wrote:
> > * multiple watchers can wait for the same event, there is no limitation
> >   to one or two watchers for signals and io.
> 
> Could you give me an example of where that is important?

Mostly in environments using some form of garbage collection. For example,
this idiom is common in Perl:

   $event = EV::io $fd, ...

If $event happens to contain an old watcher and $fd happens to refer to
the same fd as that old watcher, this will lead to problems as the
watchers are both alive for a short time.

There is actually a lot of code that relies on this just working, and the
only other event loop I know that has a problem with this is Tk.
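
In C terms the feature is simply that several watchers may be active on one fd at
the same time. A sketch against the ev_io interface as later published (the exact
names are assumptions relative to this thread):

    #include <ev.h>

    static void read_cb  (struct ev_loop *loop, struct ev_io *w, int revents) { /* ... */ }
    static void write_cb (struct ev_loop *loop, struct ev_io *w, int revents) { /* ... */ }

    static struct ev_io read_watcher, write_watcher;

    static void watch_both (struct ev_loop *loop, int fd)
    {
        /* two independent watchers on the same fd; each sees only its own events */
        ev_io_init (&read_watcher,  read_cb,  fd, EV_READ);
        ev_io_init (&write_watcher, write_cb, fd, EV_WRITE);
        ev_io_start (loop, &read_watcher);
        ev_io_start (loop, &write_watcher);
    }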

> > * there are two types of timers, based on real time differences and wall
> >   clock time (cron-like). timers can also be repeating and be reset at
> >   almost no cost (for idle timeouts used by many network servers). time 
> > jumps
> >   get detected reliably in both directions with or without a monotonic 
> > clock.
> 
> (See my other mail about Linux's new timerfd facility.)

(timerfd unfortunately makes little sense for this, as it adds overhead
but I can't see the compelling advantage, as one will still run into the
same time jump problems with periodic timers).

> Nice; repeating and absolute timers have come up several times before, too.

This was something I always missed in event loops. That is, some event
loops have one timer type, some the other, but never both.

> > * timers are managed by a priority queue (O(1) for important operations
> >   as opposed to O(log n) in libevent, also resulting in much simpler code).
> 
> In terms of concrete data types, you appear to have used a binary heap?
> So by "important operations" you mean removal, correct?

removal: O(log n)
insertion: O(log n)
find next: O(1)

> still O(log n)? The asymptotic behavior is no different, then, as
> insertion happens at least as often as removal.

Yes, but:

a) finding the next timer is a constant time issue
b) a red-black tree is more than three times as slow

(see the updated benchmark at http://libev.schmorp.de/bench.html,
especially the difference between the first (no timers) and the second
examples (timers in use))

> > * I added idle watchers, pid watchers and hook watchers into the event loop,
> >   as is required for integration of other event-based libraries, without
> >   having to force the use of some construct around event_loop.
> 
> Pardon my ignorance, but what are hook watchers?

if you want to plug other event-based libraries into the event loop you
need to be able to hook into the event loop. this is what those
watcher types provide.

the alternative would be to write your own event_loop with EV_NONBLOCK, but
that isn't modular, that is, if you have two different software modules
having their own event_loop you *must* use, you lose. prepare/check watchers
solve this problem nicely.

A number of event loops have them, and they are useful for other things,
such as transparently integrating coroutine packages etc.

It's not a killer feature, just very very useful in some cases.

> pid watchers I assume to be a fancy SIGCHLD handler?

Yes.

> That's a potentially useful feature, but why would it require a
> construct around event_loop?

I don't understand that, there is no construct around event_loop, it's handled
completely separately.

The reason it exists is to allow sharing this potentially unsharable
resource. For example, poll and select let you do "everything" (with fds),
but you can of course only have one component per (single-thread) process
using them, as they are blocking.

The same thing is true for signals: you can't share them with sigaction, as
sigaction only allows one user.

And the same thing is true for sigchld.

If your event loop provides support for it, you are less likely to run into
a situation where two software packages in the same process need access to
it and stomp over each other.

> > * the backends use a much simpler design. unlike in libevent, the code to
> >   handle events is not duplicated for each backend, backends deal only
> >   with file descriptor events and a single timeout value, everything else
> >   is handled by the core, which also optimises state changes (the epoll
> >   backend is 100 lines in libev, as opposed to >350 lines in libevent,
> >   without suffering from its limitations).
> 
> Nice.

And while investigating the WIN32-Code/win32.c libevent backend, I found
out that it's just a glorified variant of the select backend, except it's
O(n) in registering and deregistering.

> > As for compatibility, the actual libev api is very different to the
> > libevent API (although the design is similar), but there is an emulation
> > layer with a corresponding event.h file that supports the event library
> > (but no evbuffer, evdns, evhttp etc.).
> 
> I think the API needs more hashing out. It is...different...but I'm not
> sure it's necessarily

Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Christopher Layne
On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb wrote:
> > * timers are managed by a priority queue (O(1) for important operations
> >   as opposed to O(log n) in libevent, also resulting in much simpler code).
> 
> In terms of concrete data types, you appear to have used a binary heap?
> So by "important operations" you mean removal, correct? Insertion is
> still O(log n)? The asymptotic behavior is no different, then, as
> insertion happens at least as often as removal.

Not to put on my O-face, but binary heap insert is *average* O(1). There
should be a performance win for libevent, when it comes to timer checking,
as using a heap will also be O(1) for heap_min() - which will benefit pending
timer calculation code. However, early delete w/ pending timer will need some
rigging or tombstone games. I wouldn't be surprised that, in a case where one
is consistently resetting timers (think read/write before x time passes) and
re-adding said event, that in the end it will be the same amount of CPU time.
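
For reference, the insert path being discussed is a plain sift-up, which is where
the average O(1) comes from: a random new key rarely travels far toward the root.
A generic sketch (not libevent's or libev's code; the caller is assumed to have
ensured capacity):

    /* 0-based min-heap insert by expiry time. */
    static void heap_insert (double *heap, int *n, double expiry)
    {
        int i = (*n)++;              /* append at the end, then sift up */

        while (i > 0)
        {
            int parent = (i - 1) / 2;
            if (heap[parent] <= expiry)
                break;               /* heap property already holds */
            heap[i] = heap[parent];
            i = parent;
        }

        heap[i] = expiry;
    }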

-cl


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Marc Lehmann wrote:
> Hi!
> 
> On tuesday, I sent mail about various problems with libevent and its
> current API as well as implementation. Unfortunately, the mail has not yet
> shown up, but fortunately, it has been superseded by this one :)
> 
> In that mail, I announced that I will work on the problems I encountered
> in libevent (many of which have been reported and discussed earlier on
> this list). After analyzing libevent I decided that it wasn't fixable
> except by rewriting the core parts of it (the inability to have multiple
> watchers for the same file descriptor event turned out to be blocking for
> my applications, otherwise I wouldn't have started the effort in the first
> place...).
> 
> The results look promising so far: I additionally implemented a libevent
> compatibility layer and benchmarked both libraries using the benchmark
> program provided by libevent: http://libev.schmorp.de/bench.html
> 
> Here is an incomplete list of what I changed and added (see the full
> list at http://cvs.schmorp.de/libev/README, or the cvs repository at
> http://cvs.schmorp.de/libev/):
> 
> fixed or improved:
> 
> * multiple watchers can wait for the same event, there is no limitation
>   to one or two watchers for signals and io.

Could you give me an example of where that is important?

> * there is full support for fork, you can continue to use the event loop
>   in the parent and child (or just one of them), even with quirky backends
>   such as epoll.

Nice; seems like that's come up on the list several times.

> * there are two types of timers, based on real time differences and wall
>   clock time (cron-like). timers can also be repeating and be reset at
>   almost no cost (for idle timeouts used by many network servers). time jumps
>   get detected reliably in both directions with or without a monotonic clock.

(See my other mail about Linux's new timerfd facility.)

Nice; repeating and absolute timers have come up several times before, too.

> * timers are managed by a priority queue (O(1) for important operations
>   as opposed to O(log n) in libevent, also resulting in much simpler code).

In terms of concrete data types, you appear to have used a binary heap?
So by "important operations" you mean removal, correct? Insertion is
still O(log n)? The asymptotic behavior is no different, then, as
insertion happens at least as often as removal.

> * event watchers can be added and removed at any time (in libevent,
>   removing events that are pending can lead to crashes).

(They don't lead to crashes, as someone mentioned.)

> * different types of events use different watchers, so you don't have
>   to use an i/o event watcher for timeouts, and you can reset timers
>   separately from other types of watchers. Also, watchers are much smaller
>   (even the libevent emulation watcher only has about 2/3 of the size of a
>   libevent watcher).

Nice; separate types can be nice for documentation purposes if nothing else.

> 
> * I added idle watchers, pid watchers and hook watchers into the event loop,
>   as is required for integration of other event-based libraries, without
>   having to force the use of some construct around event_loop.

Pardon my ignorance, but what are hook watchers?

pid watchers I assume to be a fancy SIGCHLD handler? That's a
potentially useful feature, but why would it require a construct around
event_loop?

> * the backends use a much simpler design. unlike in libevent, the code to
>   handle events is not duplicated for each backend, backends deal only
>   with file descriptor events and a single timeout value, everything else
>   is handled by the core, which also optimises state changes (the epoll
>   backend is 100 lines in libev, as opposed to >350 lines in libevent,
>   without suffering from its limitations).

Nice.

> As for compatibility, the actual libev api is very different to the
> libevent API (although the design is similar), but there is an emulation
> layer with a corresponding event.h file that supports the event library
> (but no evbuffer, evdns, evhttp etc.).

I think the API needs more hashing out. It is...different...but I'm not
sure it's necessarily better, and I don't like change for change's sake.
A few notes:

* what is the purpose of EV_COMMON? From first glance, I'm concerned
that it could not be used properly unless libev.so and all callers are
compiled with the same flags, which seems impractical if the library
ever gains wide use.

* on ev_once failure, you're calling the callback with EV_ERROR? Yuck.
That's quite surprising behavior, and I could see it leading to stack
overflows as each ev_once tries to issue another one.

* What's your use case for ev_loop_new() and ev_loop_default()'s bitmask
of allowed implementations?

* (again, just skimming) you're closing fds automatically on ENOMEM?
Ergh. That seems rather undesirable for many applications.

Cheers,
Scott

Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread William Ahern
On Sun, Nov 04, 2007 at 03:18:42PM -0800, Steven Grimm wrote:
> You've just pretty accurately described my initial implementation of  
> thread support in memcached. It worked, but it was both more CPU- 
> intensive and had higher response latency (yes, I actually measured  
> it) than the model I'm using now. The only practical downside of my  
> current implementation is that when there is only one UDP packet  
> waiting to be processed, some CPU time is wasted on the threads that  
> don't end up winning the race to read it. But those threads were idle  
> at that instant anyway (or they wouldn't have been in a position to  
> wake up) so, according to my benchmarking, there doesn't turn out to  
> be an impact on latency. And though I am wasting CPU cycles, my total  
> CPU consumption still ends up being lower than passing messages around  
> between threads.
> 

Is this on Linux? They addressed the stampeding herd problem years ago. If
you dig deep down in the kernel you'll see their waitq implementation for
non-blocking socket work (and lots of other stuff). Only one thread is ever
woken per event.


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Scott Lamb
Steven Grimm wrote:
> On Nov 4, 2007, at 3:07 PM, Christopher Layne wrote:
>> The issue in itself is having multiple threads monitor the *same* fd
>> via any
>> kind of wait mechanism. It's short circuiting application layers, so
>> that a
>> thread (*any* thread in that pool) can immediately process new data. I
>> think
>> it would be much more structured, less complex (i.e. better
>> performance in
>> the long run anyways), and a cleaner design to have a set number (or even
>> 1) thread handle the "controller" task of tending to new network events,
>> push them onto a per-connection PDU queue, or pre-process in some form or
>> fashion, condsig, and let previously mentioned thread pool handle it
>> in an
>> ordered fashion.
> 
> You've just pretty accurately described my initial implementation of
> thread support in memcached. It worked, but it was both more
> CPU-intensive and had higher response latency (yes, I actually measured
> it) than the model I'm using now. The only practical downside of my
> current implementation is that when there is only one UDP packet waiting
> to be processed, some CPU time is wasted on the threads that don't end
> up winning the race to read it. But those threads were idle at that
> instant anyway (or they wouldn't have been in a position to wake up) so,
> according to my benchmarking, there doesn't turn out to be an impact on
> latency. And though I am wasting CPU cycles, my total CPU consumption
> still ends up being lower than passing messages around between threads.
> 
> It wasn't what I expected; I was fully confident at first that the
> thread-pool, work-queue model would be the way to go, since it's one
> I've implemented in many applications in the past. But the numbers said
> otherwise.

Thanks for the case study. To rephrase (hopefully correctly), you tried
these two models:

1) one thread polls and puts events on a queue; a bunch of other threads
pull from the queue. (resulted in high latency, and I'm not too
surprised...an extra context switch before handling any events.)

2) a bunch of threads read and handle events independently. (your
current model.)

Did you also try the so-called "leader/follower" model, in which the
thread which does the polling handles the first event and puts the rest
on a queue; another thread takes over polling if otherwise idle while
the first thread is still working? My impression is this was a widely
favored model, though I don't know the details of where each performs best.


Re: [Libevent-users] announcing libev, towards a faster and more featureful libevent

2007-11-04 Thread Scott Lamb
Marc Lehmann wrote:
>>> * there are two types of timers, based on real time differences and wall
>>>   clock time (cron-like). timers can also be repeating and be reset at
>>>   almost no cost (for idle timeouts used by many network servers). time 
>>> jumps
>>>   get detected reliably in both directions with or without a monotonic 
>>> clock.
>> But then they're not truly "real-time", no?
> 
> Within the limits of technology, they are:
> 
> - timers (based on monotonic time) will time out after "n" seconds (whatever
>   was configured), even if the date resets in between (libevent can do that
>   only for backward time jumps).
> 
> - periodicals will simply be rescheduled, if a periodic timer is scheduled
>   to fire "at" some point then it will not be affected by the time jump,
>   it will still fire at that point (it's more complicated with periodic
>   timers scheduled to repeat: if you schedule a periodic timer to execute
>   on every minute then libev will try to schedule it to occur when time()
>   % 60 == 0, regardless of any time jumps).
> 
> Of course, detecting and correcting this cannot be done completely
> reliably with sub-second precision (there is no API in POSIX to do that),
> but with a monotonic clock, libev should manage quite fine to detect even
> very small time jumps caused by ntpd.
> 
> (With no monotonic clock it's a heuristic, obviously).

Have you seen the new Linux timerfd API? Where available, you can wait
for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
heuristics, and detecting time jumps sounds like introducing a lot of
extra timeouts. I'd hate to see libev(ent)? show up on PowerTOP after
just getting rid of the 5-second timeout.


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Steven Grimm

On Nov 4, 2007, at 3:07 PM, Christopher Layne wrote:
The issue in itself is having multiple threads monitor the *same* fd  
via any
kind of wait mechanism. It's short circuiting application layers, so  
that a
thread (*any* thread in that pool) can immediately process new data.  
I think
it would be much more structured, less complex (i.e. better  
performance in
the long run anyways), and a cleaner design to have a set number (or  
even
1) thread handle the "controller" task of tending to new network  
events,
push them onto a per-connection PDU queue, or pre-process in some  
form or
fashion, condsig, and let previously mentioned thread pool handle it  
in an

ordered fashion.


You've just pretty accurately described my initial implementation of  
thread support in memcached. It worked, but it was both more CPU- 
intensive and had higher response latency (yes, I actually measured  
it) than the model I'm using now. The only practical downside of my  
current implementation is that when there is only one UDP packet  
waiting to be processed, some CPU time is wasted on the threads that  
don't end up winning the race to read it. But those threads were idle  
at that instant anyway (or they wouldn't have been in a position to  
wake up) so, according to my benchmarking, there doesn't turn out to  
be an impact on latency. And though I am wasting CPU cycles, my total  
CPU consumption still ends up being lower than passing messages around  
between threads.


It wasn't what I expected; I was fully confident at first that the  
thread-pool, work-queue model would be the way to go, since it's one  
I've implemented in many applications in the past. But the numbers  
said otherwise.


-Steve


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Adrian Chadd
On Sun, Nov 04, 2007, Steven Grimm wrote:

> >Would this be for listen sockets, or for general read/write IO on an  
> >FD?
> 
> Specifically for a mixed TCP- and UDP-based protocol where any thread  
> is equally able to handle an incoming request on the UDP socket, but  
> TCP sockets are bound to particular threads.

Makes sense. Doesn't the Solaris event ports system handle this? I haven't
checked in depth.

It sounds like something that kqueue could be extended to do relatively
easily.

What about multiple threads blocking on the same UDP socket? Do multiple
threads wake up when IO arrives? Or just one?




Adrian



Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Steven Grimm


On Nov 4, 2007, at 3:03 PM, Adrian Chadd wrote:


On Sun, Nov 04, 2007, Steven Grimm wrote:


Now if only there were a way to wake just one thread up when input
arrives on a descriptor being monitored by multiple threads... But
that isn't supported by any of the underlying poll mechanisms as far
as I can tell.


Would this be for listen sockets, or for general read/write IO on an  
FD?


Specifically for a mixed TCP- and UDP-based protocol where any thread  
is equally able to handle an incoming request on the UDP socket, but  
TCP sockets are bound to particular threads.


Unfortunately the vast majority of incoming requests are on the UDP  
socket, too many to handle on one thread.


Before anyone suggests it: a message-passing architecture (one thread  
reads the UDP socket and queues up work for other threads) gave me  
measurably higher request-handling latency than the current setup,  
which works but eats lots of system CPU time because all the threads  
wake up on each UDP packet. It makes sense: the current scheme  
involves fewer context switches for a given request (at least, on the  
thread that ends up handling it), and context switches aren't free.


Ideally I'd love a mode where I could say, "Only trigger one of the  
waiting epoll instances when this descriptor has input available."  
Sort of pthread_cond_signal() semantics, as opposed to the current  
pthread_cond_broadcast() semantics. (Yes, I'm aware that  
pthread_cond_signal() is not *guaranteed* to wake up only one waiting  
thread -- but in practice that's what it does.)
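
Such a mode did not exist at the time of this thread, but later Linux kernels (4.5+)
added an EPOLLEXCLUSIVE flag with roughly these wake-one semantics. A sketch,
assuming each worker thread registers the shared socket in its own epoll instance:

    #include <sys/epoll.h>

    /* Add the shared UDP socket to this thread's epoll set so that the
       kernel wakes (at least) one waiter instead of all of them. */
    int watch_shared_udp (int epfd, int udp_fd)
    {
        struct epoll_event ev = {
            .events  = EPOLLIN | EPOLLEXCLUSIVE,
            .data.fd = udp_fd,
        };

        return epoll_ctl (epfd, EPOLL_CTL_ADD, udp_fd, &ev);
    }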


-Steve


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Christopher Layne
On Sun, Nov 04, 2007 at 12:15:56PM -0800, Steven Grimm wrote:
> On Nov 4, 2007, at 8:13 AM, Marc Lehmann wrote:
> >This would create additional loops (event_bases). The difference is  
> >that
> >these cannot handle signals (or child watchers) at all, with the  
> >default loop
> >being the only one to do signal handling.
> 
> This seems like a totally sane approach to me. Having multiple loops  
> is a big performance win for some applications (e.g., memcached in  
> multithreaded mode), so making the behavior a bit more consistent is a  
> good thing.

It's only a performance win when the context switches and cache stomping
that result from multiple threads cycling within their own contexts do not
outweigh the "latency" of a model using fewer or even just one thread.

Consider a room with 20 people in it and a single door. The goal is to
hand them a football as a new football is dropped off the assembly
line and have them exit the door. You could throw them all a new football
right as it comes off the line and have them immediately rush for the door -
resulting in a log jam that one has to stop tending the assembly line to
handle. You then head back to the line and begin the patterened task of
throwing footballs to workers as fast as you can - only to have the log jam
repeat itself.

The only way to solve this efficiently is to have less people try and exit
the door at once, or add more doors (CPUs).

> Now if only there were a way to wake just one thread up when input  
> arrives on a descriptor being monitored by multiple threads... But  
> that isn't supported by any of the underlying poll mechanisms as far  
> as I can tell.
> 
> -Steve

It isn't typically supported because it's not a particularly useful or
efficient path to head down in the first place.

Thread pools being what they are, incredibly useful and pretty much the de
facto in threaded code, do have their own abstraction limits as well.

Setting up a thread pool, an inherently asynchronous and unordered collection
of contexts, to asynchronously process an ordered stream of data (unless
your protocol has no "sequence", which I doubt), which I presume to somehow
be in the name of performance, is a way more complex and troublesome design
than it needs to be. It's anchored somewhat to the "every thread can do
anything" school of thought which has many hidden costs.

The issue in itself is having multiple threads monitor the *same* fd via any
kind of wait mechanism. It's short circuiting application layers, so that a
thread (*any* thread in that pool) can immediately process new data. I think
it would be much more structured, less complex (i.e. better performance in
the long run anyways), and a cleaner design to have a set number (or even
1) thread handle the "controller" task of tending to new network events,
push them onto a per-connection PDU queue, or pre-process in some form or
fashion, condsig, and let previously mentioned thread pool handle it in an
ordered fashion. Having a group of threads listening to the same fd has now
just thrown our football manager out entirely and become a smash-and-grab
for new footballs. There's still the door to get through.
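
A minimal sketch of that controller/worker split in plain pthreads (a single global
queue is used for brevity where the text suggests per-connection queues; the pdu
type and its processing are placeholders):

    #include <pthread.h>

    struct pdu { struct pdu *next; /* parsed protocol unit ... */ };

    static struct pdu     *queue_head, *queue_tail;
    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  queue_cond = PTHREAD_COND_INITIALIZER;

    /* controller thread: the only place that tends to new network events */
    static void enqueue_pdu (struct pdu *p)
    {
        pthread_mutex_lock (&queue_lock);
        p->next = NULL;
        if (queue_tail) queue_tail->next = p; else queue_head = p;
        queue_tail = p;
        pthread_cond_signal (&queue_cond);   /* wake one waiting worker */
        pthread_mutex_unlock (&queue_lock);
    }

    /* worker pool: handles PDUs in the order the controller queued them */
    static void *worker (void *arg)
    {
        for (;;)
        {
            pthread_mutex_lock (&queue_lock);
            while (!queue_head)
                pthread_cond_wait (&queue_cond, &queue_lock);

            struct pdu *p = queue_head;
            queue_head = p->next;
            if (!queue_head)
                queue_tail = NULL;
            pthread_mutex_unlock (&queue_lock);

            /* process_pdu (p);  -- application-specific */
        }

        return NULL;
    }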

-cl


Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Adrian Chadd
On Sun, Nov 04, 2007, Steven Grimm wrote:

> Now if only there were a way to wake just one thread up when input  
> arrives on a descriptor being monitored by multiple threads... But  
> that isn't supported by any of the underlying poll mechanisms as far  
> as I can tell.

Would this be for listen sockets, or for general read/write IO on an FD?




Adrian



Re: [Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Steven Grimm

On Nov 4, 2007, at 8:13 AM, Marc Lehmann wrote:
This would create additional loops (event_bases). The difference is  
that
these cannot handle signals (or child watchers) at all, with the  
default loop

being the only one to do signal handling.


This seems like a totally sane approach to me. Having multiple loops  
is a big performance win for some applications (e.g., memcached in  
multithreaded mode), so making the behavior a bit more consistent is a  
good thing.


Now if only there were a way to wake just one thread up when input  
arrives on a descriptor being monitored by multiple threads... But  
that isn't supported by any of the underlying poll mechanisms as far  
as I can tell.


-Steve



[Libevent-users] sensible thread-safe signal handling proposal

2007-11-04 Thread Marc Lehmann
> On Sat, Nov 03, 2007 at 03:45:39PM -0700, William Ahern <[EMAIL PROTECTED]> 
> wrote:
> > Curious how you managed to do this. Are you checking the process PID on each
> > loop?
> 
> I considered that, but I think it's too slow (one also needs to be careful
> that watchers don't change e.g. epoll state until the getpid check is
> done), or at least I think I don't want that speed hit, no matter what.

After giving signal handling and threads a lot of thought, I came to these
conclusions:

- requiring pthreads or Windows mutexes by default is not acceptable,
  but that's the only way to distribute signal events among event loops
  properly, or globally among many threads if signal handling were global.
- the only way to do it without locking is to only allow a single
  loop to handle events.

This is the interface I came up with to manage multiple loops (which I
think makes more sense than the interface currently in libevent):

   struct ev_loop *ev_default_loop (int methods);
   void ev_default_destroy (void);
   void ev_default_fork (void);

this would create "the default" loop (event_base). ev_default_loop
would always create the same loop, and it would be the one to use for
third-party libraries in general, too. The fork method can be called in
the parent or child (or even in both, or without forking), and it would
destroy and recreate the kernel state but keep all the watchers for the
default loop.

   struct ev_loop *ev_loop_new (int methods);
   void ev_loop_destroy (EV_P);
   void ev_loop_fork (EV_P);

This would create additional loops (event_bases). The difference is that
these cannot handle signals (or child watchers) at all, with the default loop
being the only one to do signal handling.

This would be consistent with how signals are usually handled in a pthreads
environment: block signals in all threads and in one thread handle them all
(sigwait, or using the default mainloop).
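
A sketch of how an application would sit on top of this split, using the names
proposed above (the ev_loop() run call, the header name and the watcher registration
are assumptions about the eventual API rather than part of this proposal):

    #include <ev.h>        /* assumed header name */
    #include <pthread.h>
    #include <signal.h>

    /* worker threads: extra loops, never any signal or child handling */
    static void *worker_thread (void *arg)
    {
        struct ev_loop *loop = ev_loop_new (EVMETHOD_ANY);

        /* register io/timer watchers on this loop here ... */
        ev_loop (loop, 0);            /* assumed "run the loop" entry point */

        ev_loop_destroy (loop);
        return NULL;
    }

    int main (void)
    {
        sigset_t all;
        sigfillset (&all);
        pthread_sigmask (SIG_BLOCK, &all, NULL);    /* workers inherit a fully blocked mask */

        pthread_t tid;
        pthread_create (&tid, NULL, worker_thread, NULL);

        pthread_sigmask (SIG_UNBLOCK, &all, NULL);  /* only this thread sees signals */

        /* the default loop is the one place that handles signals/children */
        struct ev_loop *loop = ev_default_loop (EVMETHOD_ANY);
        /* register signal/child watchers here ... */
        ev_loop (loop, 0);

        return 0;
    }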

No locking inside libevent would be required this way.

I'll implement this in my libev replacement code, unless somebody else comes
up with a better idea.

One such idea that isn't better, but different, would be to require the
user to provide mutex support, such as in ev_init_locking (size, init_cb,
lock_cb, unlock_cb, free_cb) or similar, then use locking and let any
event loop handle the signals and distribute signal events to the relevant
loops. But I am not sure how much locking would be required and I assume
it would be a lot, as one would need to handle the case where one thread
handles a signal for an event_base currently in use by another thread.

Looking at the code in libevent, it seems that signals get handled by
whatever loop was started last, so signal handling is not reliable at all
unless one registers the signal handlers in all threads, which is hard to
do in a thread-safe manner (for the user code).

Having a deterministic model where one loop handles all that would definitely
be an improvement over this.
