On Tue, Jan 10, 2023 at 05:02:58PM -0500, Stefan Berger wrote:
> 
> 
> On 1/10/23 14:47, Stefan Berger wrote:
> > 
> > 
> > On 1/10/23 14:27, Daniel P. Berrangé wrote:
> > > On Tue, Jan 10, 2023 at 01:50:26PM -0500, Stefan Berger wrote:
> > > > 
> > > > 
> > > > On 1/6/23 10:16, Stefan Berger wrote:
> > > > > This here seems to be the root cause. An unknown control channel
> > > > > command was received from the TPM emulator backend by the control
> > > > > channel thread and we end up in g_assert_not_reached().
> > > > > 
> > > > > https://github.com/qemu/qemu/blob/master/tests/qtest/tpm-emu.c#L189
> > > > > 
> > > > >         ret = qio_channel_read(ioc, (char *)&cmd, sizeof(cmd), NULL);
> > > > >         if (ret <= 0) {
> > > > >             break;
> > > > >         }
> > > > > 
> > > > >         cmd = be32_to_cpu(cmd);
> > > > >         switch (cmd) {
> > > > >         [...]
> > > > >         default:
> > > > >             g_debug("unimplemented %u", cmd);
> > > > >             g_assert_not_reached();   <------------------
> > > > >         }
> > > > > 
> > > > > I will run this test case in an endless loop on an x86_64 host and
> > > > > see what we get there ...
> > > > 
> > > > I could not recreate the issue running the test on a ppc64 and x86_64
> > > > host. There were like >100k test runs on ppc64 and >40k on x86_64. Also
> > > > simulating the reception of an unsupported command did not lead to a
> > > > hang like shown here.
> > > 
> > > Assuming your ppc64 host is running a little endian OS, and
> > > we're only seeing the test failure on s390x, then it points towards
> > > the problem being an endianness issue in the TPM code. Something
> > > missing a byteswap somewhere along the way ?
> > 
> > Yes, my ppc64 machine is also little endian. If the issue was not an
> > intermittent but a permanent failure I would look for something like
> > that. I would think it's more some sort of initialization issue, like
> > a value on the stack that is occasionally set to an undesirable
> > value -- maybe even in a dependency.
> 
> I found I still had access to an s390x machine. ~2700 loops on this test case
> so far but nothing... it would be good to be able to recreate the issue and
> apply the fix but we'll have to do it without testing then I guess.
> 
> Does this look about right? From my tests with injecting an error it at least
> seems to do what it is intended to do.
> 
> diff --git a/tests/qtest/tpm-emu.c b/tests/qtest/tpm-emu.c
> index 2994d1cf42..dbc308a572 100644
> --- a/tests/qtest/tpm-emu.c
> +++ b/tests/qtest/tpm-emu.c
> @@ -36,11 +36,19 @@ void tpm_emu_test_wait_cond(TPMTestState *s)
>      g_mutex_unlock(&s->data_mutex);
>  }
> 
> +static void tpm_emu_close_data_ioc(void *ioc)
> +{
> +    g_debug("CLOSE DATA IOC");
> +    qio_channel_close(ioc, NULL);
> +}
> +
>  static void *tpm_emu_tpm_thread(void *data)
>  {
>      TPMTestState *s = data;
>      QIOChannel *ioc = s->tpm_ioc;
> 
> +    qtest_add_abrt_handler(tpm_emu_close_data_ioc, ioc);
> +
>      s->tpm_msg = g_new(struct tpm_hdr, 1);
>      while (true) {
>          int minhlen = sizeof(s->tpm_msg->tag) + sizeof(s->tpm_msg->len);
> @@ -77,12 +85,19 @@ static void *tpm_emu_tpm_thread(void *data)
>                            &error_abort);
>      }
> 
> +    qtest_remove_abrt_handler(ioc);
>      g_free(s->tpm_msg);
>      s->tpm_msg = NULL;
>      object_unref(OBJECT(s->tpm_ioc));
>      return NULL;
>  }
> 
> +static void tpm_emu_close_ctrl_ioc(void *ioc)
> +{
> +    g_debug("CLOSE CTRL IOC");
> +    qio_channel_close(ioc, NULL);
> +}
> +
>  void *tpm_emu_ctrl_thread(void *data)
>  {
>      TPMTestState *s = data;
> @@ -119,6 +134,8 @@ void *tpm_emu_ctrl_thread(void *data)
>          s->emu_tpm_thread = g_thread_new(NULL, tpm_emu_tpm_thread, s);
>      }
> 
> +    qtest_add_abrt_handler(tpm_emu_close_ctrl_ioc, ioc);

I'd suggest this be done before starting tpm_emu_tpm_thread, immediately
after the "ioc" is created.

> +
>      while (true) {
>          uint32_t cmd;
>          ssize_t ret;
> @@ -129,6 +146,9 @@ void *tpm_emu_ctrl_thread(void *data)
>          }
> 
>          cmd = be32_to_cpu(cmd);
> +        //g_debug("cmd=%u", cmd);
> +        //if (cmd == 14)
> +        //    cmd = 100;
>          switch (cmd) {
>          case CMD_GET_CAPABILITY: {
>              ptm_cap cap = cpu_to_be64(0x3fff);
> @@ -190,6 +210,8 @@ void *tpm_emu_ctrl_thread(void *data)
>          }
>      }
> 
> +    qtest_remove_abrt_handler(ioc);
> +
>      object_unref(OBJECT(ioc));
>      object_unref(OBJECT(lioc));
>      return NULL;
> 
> 
>     Stefan
> 
> > > 
> > > With regards,
> > > Daniel
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
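
A note on the ordering suggested in the reply: registering the close
handler as soon as the channel exists means that an abort arriving during
the rest of the setup (for example while the data thread is being started)
still closes the channel. The sketch below only illustrates that ordering.
Everything in it is a hypothetical stand-in (FakeChannel, add_abort_hook,
remove_abort_hook, run_abort_hooks, closed_on_early_abort); none of it is
QEMU, QIOChannel or libqtest code, and it does not model how the real
qtest_add_abrt_handler() behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not QEMU/libqtest APIs. */
typedef struct { bool open; } FakeChannel;
typedef void (*AbortHook)(void *data);

#define MAX_HOOKS 8
static struct { AbortHook fn; void *data; } hooks[MAX_HOOKS];
static size_t n_hooks;

/* Register a cleanup hook to be run if the test aborts. */
static void add_abort_hook(AbortHook fn, void *data)
{
    assert(n_hooks < MAX_HOOKS);
    hooks[n_hooks].fn = fn;
    hooks[n_hooks].data = data;
    n_hooks++;
}

/* Drop the hook registered for 'data', if any. */
static void remove_abort_hook(void *data)
{
    for (size_t i = 0; i < n_hooks; i++) {
        if (hooks[i].data == data) {
            hooks[i] = hooks[--n_hooks];
            return;
        }
    }
}

/* Simulate an abort: run every hook that is registered right now. */
static void run_abort_hooks(void)
{
    for (size_t i = 0; i < n_hooks; i++) {
        hooks[i].fn(hooks[i].data);
    }
}

static void close_channel(void *data)
{
    ((FakeChannel *)data)->open = false;
}

/*
 * Simulate an abort that hits between channel creation and the end of
 * setup. Returns true if the channel ended up closed by an abort hook.
 */
static bool closed_on_early_abort(bool register_first)
{
    FakeChannel ch = { .open = true };

    if (register_first) {
        add_abort_hook(close_channel, &ch);   /* right after creation */
    }

    run_abort_hooks();                        /* abort during setup */

    if (!register_first) {
        add_abort_hook(close_channel, &ch);   /* too late for that abort */
    }

    remove_abort_hook(&ch);                   /* normal teardown path */
    return !ch.open;
}
```

Under these assumptions, closed_on_early_abort(true) reports the channel
as closed, while closed_on_early_abort(false) leaves it open, which is the
window the earlier registration is meant to cover.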