* Juan Quintela (quint...@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilb...@redhat.com> wrote:
> > * Juan Quintela (quint...@redhat.com) wrote:
> >> We create new channels for each new thread created. We only send through
> >> them a character to be sure that we are creating the channels in the
> >> right order.
> >
> > That text is out of date isn't it?
> 
> oops, fixed.
> 
> 
> >> +gboolean multifd_new_channel(QIOChannel *ioc)
> >> +{
> >> +    int thread_count = migrate_multifd_threads();
> >> +    MultiFDRecvParams *p = g_new0(MultiFDRecvParams, 1);
> >> +    MigrationState *s = migrate_get_current();
> >> +    char string[MULTIFD_UUID_MSG];
> >> +    char string_uuid[UUID_FMT_LEN];
> >> +    char *uuid;
> >> +    int id;
> >> +
> >> +    qio_channel_read(ioc, string, sizeof(string), &error_abort);
> >> +    sscanf(string, "%s multifd %03d", string_uuid, &id);
> >> +
> >> +    if (qemu_uuid_set) {
> >> +        uuid = qemu_uuid_unparse_strdup(&qemu_uuid);
> >> +    } else {
> >> +        uuid = g_strdup(multifd_uuid);
> >> +    }
> >> +    if (strcmp(string_uuid, uuid)) {
> >> +        error_report("multifd: received uuid '%s' and expected uuid '%s'",
> >> +                     string_uuid, uuid);
> >
> > probably worth adding the channel id as well so we can see
> > when it fails.
> 
> Done.
> 
> >> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> >> +                          MIGRATION_STATUS_FAILED);
> >> +        terminate_multifd_recv_threads();
> >> +        return FALSE;
> >> +    }
> >> +    g_free(uuid);
> >> +
> >> +    if (multifd_recv_state->params[id] != NULL) {
> >> +        error_report("multifd: received id '%d' is already setup'", id);
> >> +        migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
> >> +                          MIGRATION_STATUS_FAILED);
> >> +        terminate_multifd_recv_threads();
> >> +        return FALSE;
> >> +    }
> >> +    qemu_mutex_init(&p->mutex);
> >> +    qemu_sem_init(&p->sem, 0);
> >> +    p->quit = false;
> >> +    p->id = id;
> >> +    p->c = ioc;
> >> +    atomic_set(&multifd_recv_state->params[id], p);
> >
> > Can you explain why this is quite so careful about ordering ? Is there
> > something that could look at params or try and take the mutex before
> > the count is incremented?
> 
> what happened to me in the middle stages of the patches (yes, doing
> asynchronously was painful) was that:
> 
> I created the threads (at the beggining I did the
> multifd_recv_state->params[id] == p inside the thread, that makes things
> really, really racy.  I *think* that now we could probably do this
> as you state.
> 
> 
> 
> > I think it's safe to do:
> >  p->quit = false;
> >  p->id = id;
> >  p->c = ioc;
> >  &multifd_recv_state->params[id] = p;
> >  qemu_sem_init(&p->sem, 0);
> >  qemu_mutex_init(&p->mutex);
> >  qemu_thread_create(...)
> >  atomic_inc(&multifd_recv_state->count);    <-- I'm not sure if this
> >  needs to be atomic
> 
> We only change it on the main thread, so it should be enough.  The split
> that I want to do is:
> 
> we do the listen asynchronously
> when something arrives, we just read it (main thread)
> we then read <uuid> <string> <arguments>
> and then after checking that uuid is right, we call whatever function we
> have for "string", in our case "multifd", with <arguments> as one string
> parameters.
> 
> This should make it easier to create new "channels" for other purposes.
> So far so good.
> 
> But then it appears what are the responsabilities, At the beggining, I
> read the string on the reception thread for that channel, that created a
> race because I received the 1st message for that channel before the
> channel was fully created (yes, it only happened sometimes, easy to
> understand after debugging).  This is the main reason that I changed to
> an array of pointers to structs instead of one array of structs.
> 
> Then, I had to ve very careful to know when I had created all the
> channels threads, because otherwise I ended having races left and right.
> 
> I will try to test the ordering that you suggested.
> 
> >> +    qemu_thread_create(&p->thread, "multifd_recv", multifd_recv_thread, p,
> >> +                       QEMU_THREAD_JOINABLE);
> >
> > You've lost the nice numbered thread names you had created in the
> > previous version of this that you're removing.
> 
> I could get them back, but they really were not showing at gdb, where do
> they show? ps?

If you start qemu with -name debug-threads=on they show up in gdb's
info threads
also in top (hit H) and ps if you turn on the right optioa (H as well?)n.

> >> +    multifd_recv_state->count++;
> >> +
> >> +    /* We need to return FALSE for the last channel */
> >> +    if (multifd_recv_state->count == thread_count) {
> >> +        return FALSE;
> >> +    } else {
> >> +        return TRUE;
> >> +    }
> >
> > return multifd_recv_state->count != thread_count;   ?
> 
> For other reasons I change this functions and now they use a different
> way of setting/checking if we have finished.  Look at the new series.
> 
> I didn't do as you said because I feel it weird that we return a bool
> when we expert a gboolean, but .....

I hope & believe they're defined as compatible:
  https://people.gnome.org/~desrt/glib-docs/glib-Standard-Macros.html#TRUE:CAPS

Dave
> Thanks, Juan.
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Reply via email to