Re: [RFC PATCH v2 3/8] migration/multifd: Terminate the TLS connection
On Fri, Feb 07, 2025 at 03:15:48PM -0300, Fabiano Rosas wrote:
> >> +    for (i = 0; i < migrate_multifd_channels(); i++) {
> >> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> >> +
> >> +        /* thread_created implies the TLS handshake has succeeded */
> >> +        if (p->tls_thread_created && p->thread_created) {
> >> +            Error *local_err = NULL;
> >> +            /*
> >> +             * The destination expects the TLS session to always be
> >> +             * properly terminated. This helps to detect a premature
> >> +             * termination in the middle of the stream. Note that
> >> +             * older QEMUs always break the connection on the source
> >> +             * and the destination always sees
> >> +             * GNUTLS_E_PREMATURE_TERMINATION.
> >> +             */
> >> +            migration_tls_channel_end(p->c, &local_err);
> >> +
> >> +            if (local_err) {
> >> +                /*
> >> +                 * The above can fail with broken pipe due to a
> >> +                 * previous migration error, ignore the error.
> >> +                 */
> >> +                assert(migration_has_failed(migrate_get_current()));
> >
> > Considering this is still src, do we want to be softer on this by
> > error_report?
> >
> > Logically !migration_has_failed() means it succeeded, so we can throw src
> > qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
> > of comment.. Your call.
> >
>
> Maybe even a warning? If at this point migration succeeded, it's probably
> best to let cleanup carry on.

Yep, warning sounds good too.

--
Peter Xu
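For illustration only, and not part of the posted patch: a minimal standalone sketch of the "warn instead of assert" idea discussed above. MockMigration, mock_migration_has_failed() and report_tls_end_failure() are hypothetical stand-ins invented for this sketch; in QEMU the check would go through migration_has_failed() and the message would presumably go through one of the existing reporting helpers rather than fprintf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's migration state; not the real type. */
typedef struct {
    bool failed;
} MockMigration;

/* Hypothetical counterpart of migration_has_failed(). */
static bool mock_migration_has_failed(const MockMigration *s)
{
    assert(s != NULL);
    return s->failed;
}

/*
 * Called after ending the TLS session on a channel returned an error.
 * Returns true if a warning was emitted, i.e. the error was unexpected
 * because the migration itself had not failed.
 */
static bool report_tls_end_failure(const MockMigration *s, int channel,
                                   const char *errmsg)
{
    if (mock_migration_has_failed(s)) {
        /* Expected: an earlier error already broke the connection. */
        return false;
    }

    /* Unexpected, but not fatal: warn and let cleanup carry on. */
    fprintf(stderr,
            "warning: multifd channel %d: failed to end TLS session: %s\n",
            channel, errmsg);
    return true;
}
```

With this shape the source keeps running its cleanup path in both cases; only the unexpected case (error while the migration is considered successful) leaves a trace in the log.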
Re: [RFC PATCH v2 3/8] migration/multifd: Terminate the TLS connection
Peter Xu writes:

> On Fri, Feb 07, 2025 at 11:27:53AM -0300, Fabiano Rosas wrote:
>> The multifd recv side has been getting a TLS error of
>> GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
>> side closes the sockets without ending the TLS session. This has been
>> masked by the code not checking the migration error after loadvm.
>>
>> Start ending the TLS session at multifd_send_shutdown() so the recv
>> side always sees a clean termination (EOF) and we can start to
>> differentiate that from an actual premature termination that might
>> possibly happen in the middle of the migration.
>>
>> There's nothing to be done if a previous migration error has already
>> broken the connection, so add a comment explaining it and ignore any
>> errors coming from gnutls_bye().
>>
>> This doesn't break compat with older recv-side QEMUs because EOF has
>> always caused the recv thread to exit cleanly.
>>
>> Signed-off-by: Fabiano Rosas
>
> Reviewed-by: Peter Xu
>
> One trivial comment..
>
>> ---
>>  migration/multifd.c | 34 +-
>>  migration/tls.c     |  5 +
>>  migration/tls.h     |  2 +-
>>  3 files changed, 39 insertions(+), 2 deletions(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index ab73d6d984..b57cad3bb1 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -490,6 +490,32 @@ void multifd_send_shutdown(void)
>>          return;
>>      }
>>
>> +    for (i = 0; i < migrate_multifd_channels(); i++) {
>> +        MultiFDSendParams *p = &multifd_send_state->params[i];
>> +
>> +        /* thread_created implies the TLS handshake has succeeded */
>> +        if (p->tls_thread_created && p->thread_created) {
>> +            Error *local_err = NULL;
>> +            /*
>> +             * The destination expects the TLS session to always be
>> +             * properly terminated. This helps to detect a premature
>> +             * termination in the middle of the stream. Note that
>> +             * older QEMUs always break the connection on the source
>> +             * and the destination always sees
>> +             * GNUTLS_E_PREMATURE_TERMINATION.
>> +             */
>> +            migration_tls_channel_end(p->c, &local_err);
>> +
>> +            if (local_err) {
>> +                /*
>> +                 * The above can fail with broken pipe due to a
>> +                 * previous migration error, ignore the error.
>> +                 */
>> +                assert(migration_has_failed(migrate_get_current()));
>
> Considering this is still src, do we want to be softer on this by
> error_report?
>
> Logically !migration_has_failed() means it succeeded, so we can throw src
> qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
> of comment.. Your call.
>

Maybe even a warning? If at this point migration succeeded, it's probably
best to let cleanup carry on.

>> +            }
>> +        }
>> +    }
>> +
>>      multifd_send_terminate_threads();
>>
>>      for (i = 0; i < migrate_multifd_channels(); i++) {
>> @@ -1141,7 +1167,13 @@ static void *multifd_recv_thread(void *opaque)
>>
>>              ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
>>                                             p->packet_len, &local_err);
>> -            if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
>> +            if (!ret) {
>> +                /* EOF */
>> +                assert(!local_err);
>> +                break;
>> +            }
>> +
>> +            if (ret == -1) {
>>                  break;
>>              }
>>
>> diff --git a/migration/tls.c b/migration/tls.c
>> index fa03d9136c..5cbf952383 100644
>> --- a/migration/tls.c
>> +++ b/migration/tls.c
>> @@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
>>                                NULL);
>>  }
>>
>> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
>> +{
>> +    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
>> +}
>> +
>>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
>>  {
>>      if (!migrate_tls()) {
>> diff --git a/migration/tls.h b/migration/tls.h
>> index 5797d153cb..58b25e1228 100644
>> --- a/migration/tls.h
>> +++ b/migration/tls.h
>> @@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
>>                                     QIOChannel *ioc,
>>                                     const char *hostname,
>>                                     Error **errp);
>> -
>> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
>>  /* Whether the QIO channel requires further TLS handshake? */
>>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
>>
>> --
>> 2.35.3
>>
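As a side note for readers of the recv-side hunk, here is a small standalone sketch (not QEMU code) of the three outcomes the patch separates after the read: a full packet, a clean EOF, and an error. RecvAction and classify_read_result() are hypothetical names made up for this sketch, and the err_msg string stands in for QEMU's Error *local_err. The return convention assumed here (1 = all bytes read, 0 = EOF, -1 = error) is the one the removed "0: EOF -1: Error" comment in the patch refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome type, used only in this sketch. */
typedef enum {
    RECV_CONTINUE,   /* a full packet was read, keep looping */
    RECV_CLEAN_EOF,  /* sender ended the session properly */
    RECV_ERROR       /* e.g. premature termination mid-stream */
} RecvAction;

/*
 * Classify the return value of a read_all_eof-style call.
 * err_msg is a stand-in for Error *local_err: NULL means no error set.
 */
static RecvAction classify_read_result(int ret, const char *err_msg)
{
    if (ret == 0) {
        /* EOF must never come with an error attached. */
        assert(err_msg == NULL);
        return RECV_CLEAN_EOF;
    }

    if (ret == -1) {
        /* The caller is expected to propagate err_msg. */
        return RECV_ERROR;
    }

    return RECV_CONTINUE;
}
```

Both the EOF and the error case still leave the loop, as before; the difference is that EOF is now asserted to be error-free, so a premature termination can only surface through the -1 path.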
Re: [RFC PATCH v2 3/8] migration/multifd: Terminate the TLS connection
On Fri, Feb 07, 2025 at 11:27:53AM -0300, Fabiano Rosas wrote:
> The multifd recv side has been getting a TLS error of
> GNUTLS_E_PREMATURE_TERMINATION at the end of migration when the send
> side closes the sockets without ending the TLS session. This has been
> masked by the code not checking the migration error after loadvm.
>
> Start ending the TLS session at multifd_send_shutdown() so the recv
> side always sees a clean termination (EOF) and we can start to
> differentiate that from an actual premature termination that might
> possibly happen in the middle of the migration.
>
> There's nothing to be done if a previous migration error has already
> broken the connection, so add a comment explaining it and ignore any
> errors coming from gnutls_bye().
>
> This doesn't break compat with older recv-side QEMUs because EOF has
> always caused the recv thread to exit cleanly.
>
> Signed-off-by: Fabiano Rosas

Reviewed-by: Peter Xu

One trivial comment..

> ---
>  migration/multifd.c | 34 +-
>  migration/tls.c     |  5 +
>  migration/tls.h     |  2 +-
>  3 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index ab73d6d984..b57cad3bb1 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -490,6 +490,32 @@ void multifd_send_shutdown(void)
>          return;
>      }
>
> +    for (i = 0; i < migrate_multifd_channels(); i++) {
> +        MultiFDSendParams *p = &multifd_send_state->params[i];
> +
> +        /* thread_created implies the TLS handshake has succeeded */
> +        if (p->tls_thread_created && p->thread_created) {
> +            Error *local_err = NULL;
> +            /*
> +             * The destination expects the TLS session to always be
> +             * properly terminated. This helps to detect a premature
> +             * termination in the middle of the stream. Note that
> +             * older QEMUs always break the connection on the source
> +             * and the destination always sees
> +             * GNUTLS_E_PREMATURE_TERMINATION.
> +             */
> +            migration_tls_channel_end(p->c, &local_err);
> +
> +            if (local_err) {
> +                /*
> +                 * The above can fail with broken pipe due to a
> +                 * previous migration error, ignore the error.
> +                 */
> +                assert(migration_has_failed(migrate_get_current()));

Considering this is still src, do we want to be softer on this by
error_report?

Logically !migration_has_failed() means it succeeded, so we can throw src
qemu away now, that shouldn't be a huge deal. More of thinking out loud kind
of comment.. Your call.

> +            }
> +        }
> +    }
> +
>      multifd_send_terminate_threads();
>
>      for (i = 0; i < migrate_multifd_channels(); i++) {
> @@ -1141,7 +1167,13 @@ static void *multifd_recv_thread(void *opaque)
>
>              ret = qio_channel_read_all_eof(p->c, (void *)p->packet,
>                                             p->packet_len, &local_err);
> -            if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */
> +            if (!ret) {
> +                /* EOF */
> +                assert(!local_err);
> +                break;
> +            }
> +
> +            if (ret == -1) {
>                  break;
>              }
>
> diff --git a/migration/tls.c b/migration/tls.c
> index fa03d9136c..5cbf952383 100644
> --- a/migration/tls.c
> +++ b/migration/tls.c
> @@ -156,6 +156,11 @@ void migration_tls_channel_connect(MigrationState *s,
>                                NULL);
>  }
>
> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp)
> +{
> +    qio_channel_tls_bye(QIO_CHANNEL_TLS(ioc), errp);
> +}
> +
>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc)
>  {
>      if (!migrate_tls()) {
> diff --git a/migration/tls.h b/migration/tls.h
> index 5797d153cb..58b25e1228 100644
> --- a/migration/tls.h
> +++ b/migration/tls.h
> @@ -36,7 +36,7 @@ void migration_tls_channel_connect(MigrationState *s,
>                                     QIOChannel *ioc,
>                                     const char *hostname,
>                                     Error **errp);
> -
> +void migration_tls_channel_end(QIOChannel *ioc, Error **errp);
>  /* Whether the QIO channel requires further TLS handshake? */
>  bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc);
>
> --
> 2.35.3
>

--
Peter Xu