On 10 September 2014 11:17, Pekka Paalanen <[email protected]> wrote:
> On Tue, 9 Sep 2014 13:10:48 +0200 > Marek Chalupa <[email protected]> wrote: > > > On 4 September 2014 15:17, Pekka Paalanen <[email protected]> wrote: > > > > > On Fri, 29 Aug 2014 11:21:30 +0200 > > > Marek Chalupa <[email protected]> wrote: > > > > > > > When wl_display_read_events() returns with errno == EAGAIN, we > > > > naturally try to call it again. But this next call results in > > > > deadlock. > > > > > > > > Signed-off-by: Marek Chalupa <[email protected]> > > > > --- > > > > tests/display-test.c | 88 > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++ > > > > 1 file changed, 88 insertions(+) > > > > > > > > diff --git a/tests/display-test.c b/tests/display-test.c > > > > index 1289866..e1cf830 100644 > > > > --- a/tests/display-test.c > > > > +++ b/tests/display-test.c > > > > @@ -32,6 +32,7 @@ > > > > #include <sys/types.h> > > > > #include <sys/stat.h> > > > > #include <pthread.h> > > > > +#include <poll.h> > > > > > > > > #include "wayland-private.h" > > > > #include "wayland-server.h" > > > > @@ -480,6 +481,93 @@ TEST(threading_cancel_read_tst) > > > > display_destroy(d); > > > > } > > > > > > > > +/* set this to 1 when it is right that > > > > + * the thread is woken up */ > > > > +static int threads_can_be_woken_up = 0; > > > > + > > > > +static void * > > > > +thread_eagain_func(void *data) { > > > > + struct client *c = data; > > > > + > > > > + register_reading(c->wl_display); > > > > + > > > > + c->display_stopped = 1; > > > > + assert(wl_display_read_events(c->wl_display) == 0 > > > > + && errno != EAGAIN); > > > > + /* if thread was sleeping in read_events, this assert > > > > + * must pass now */ > > > > + assert(threads_can_be_woken_up && "Thread woke up too > > > > early"); + > > > > + pthread_exit(NULL); > > > > +} > > > > + > > > > +static void > > > > +threading_read_eagain(void) > > > > +{ > > > > + struct client *c = client_connect(); > > > > + pthread_t th1, th2, th3; > > > > + struct pollfd fds; > > > > + int i; > > > > + > > > > + register_reading(c->wl_display); > > > > + > > > > + th1 = create_thread(c, thread_eagain_func); > > > > + th2 = create_thread(c, thread_eagain_func); > > > > + th3 = create_thread(c, thread_eagain_func); > > > > + > > > > + /* All the threads are sleeping, waiting until read or > > > > cancel > > > > + * is called. Since we have no data on socket waiting, > > > > + * the wl_connection_read should end up with error and set > > > > errno > > > > + * to EAGAIN => wl_display_read_events() will return 0 and > > > > set > > > errno > > > > + * to EAGAIN. It should not wake up the sleeping threads, > > > > so that > > > > + * this thread can try to read again */ > > > > + alarm(3); > > > > + assert(wl_display_read_events(c->wl_display) == 0); > > > > + assert(errno == EAGAIN); > > > > > > I think there is a problem here. wl_display_read_events() returning > > > 0 means success, and in that case errno may not be set if it was a > > > real success. There is no EAGAIN to be handled, it's already done > > > inside read_events(). > > > > > > In this test it won't be a real success, but you cannot code > > > clients to check for errno after getting 0 from > > > wl_display_read_events(). > > > > > > > > > + > > > > + /* the same should happen next times - but there was a bug > > > > when > > > > + * the main thread got blocked! */ > > > > + for (i = 0; i < 5; ++i) { > > > > + assert(wl_display_read_events(c->wl_display) == 0); > > > > > > Therefore you are here calling wl_display_read_events() again after > > > it succeeded! And that IMO is a developer error. > > > > > > > yes > > > > > > > Besides, I cannot see anywhere where it says that it is ok to call > > > wl_display_read_events() again even if it failed, without calling > > > wl_display_prepare_read() first. > > > > > > > > Since original piece of code returned 0 on EAGAIN, then I suppose the > > author considered this as a success, so yes, wl_display_read_events() > > is not probably meant > > to be called again. But consider this: > > if the thread that got EAGAIN (the only thread that is not sleeping) > > calls wl_display_prepare_read() > > + wl_display_read_events() again, it works. Problem is that it does > > not have to call it > > right after the failed wl_display_read_events and the other threads > > are still sleeping. > > EAGAIN is not a failure here. It is only a spurious wakeup, and all > threads should go back sleeping on whatever they need to sleep on, > usually poll(). > > > Moreover, if there are not events on the socket for some time, this > > one thread is looping in cycle > > while the other threads are sleeping. > > It's not a busy-loop, because the thread would be sleeping in poll() or > equivalent. > > But yes, I see that the other threads sleeping inside their > read_events() are not woken up, so they do not do a cycle. I'm not sure > if that is a problem or not. I suppose waking everyone up would not > hurt, but not waking up would risk deadlocking if the thread with > EAGAIN never read again. > > > What I meant to do was to give the programmer the choice whether to > > loop or let > > all threads try it again (cancel read). But as you said before, 0 is > > wrong return value for that. > > See end of the e-mail for conclusions. > > Yeah, it seems to be wrong to require cancel_read() after calling > read_events(), no matter whether it succeeded or not. > > > Summing it up, in my point of view we have these options: > > > > * Let it as it is: > > this can lead to starving of other threads for some time. > > Moreover, if the thread that > > got EAGAIN decides to exit (or simply not to read again) instead > > of another trial to read, the other threads will block. > > Yup, I agree with the analysis here. > > > * return -1 and set EAGAIN > > programmer can either try to read again, or cancel read and let > > all threads do another cycle. Good thing is > > that programmer knows that there was an error and can cancel read > > in the case that the thread is exiting > > instead of another cycle. > > EAGAIN is not an error, which would make this solution an ABI break. > > > * wake up threads on EAGAIN > > this solution will force all threads to do another cycle (or > > whatever they want) and should not have any side effects. It can lead > > to big overhead if there are no events on the socket for some time > > (waking up threads again and again in cycle) > > I think we should do this. I don't really see the overhead at all, > since every thread should have poll() in its loop, so they will not be > busy-looping. > If there is a poll, then the overhead is solved :) > > After all, the sleep inside read_events() is only for synchronizing > reading threads, not for waiting new data to appear. > > EAGAIN in read_events() is a success, even though there was nothing > received. On success (and on failure!) all threads sleeping inside > read_events() should be woken up. IOW, whatever thread hits the > reader_count==0 case in read_events() should *always* end up calling > display_wakeup_threads() before returning. > > Does this make sense? > It does and I agree this is the most safe solution. I'll rebase the old http://lists.freedesktop.org/archives/wayland-devel/2014-August/016932.html that was created exactly because of this reasoning and send it here. > > In fact, I think wl_display_read_event() failing on last_error should > be equivalent to calling wl_display_cancel_read(). It would be the same > if the app manually first checked for error and then canceled. > > Sounds reasonable, I'll add a patch for it to the previous one. > > Thanks, > pq > Thanks, Marek
_______________________________________________ wayland-devel mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/wayland-devel
