* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 12 November 2015 at 12:04, Dr. David Alan Gilbert
> <dgilb...@redhat.com> wrote:
> > * Peter Maydell (peter.mayd...@linaro.org) wrote:
> >> On 10 November 2015 at 14:25, Juan Quintela <quint...@redhat.com> wrote:
> >> > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >> >
> >> > When transmitting RAM pages, consume pages that have been queued by
> >> > MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning.
> >> >
> >> > Note:
> >> >   a) After a queued page the linear walk carries on from after the
> >> >   unqueued page; there is a reasonable chance that the destination
> >> >   was about to ask for other closeby pages anyway.
> >> >
> >> >   b) We have to be careful of any assumptions that the page walking
> >> >   code makes, in particular it does some short cuts on its first linear
> >> >   walk that break as soon as we do a queued page.
> >> >
> >> >   c) We have to be careful to not break up host-page size chunks, since
> >> >   this makes it harder to place the pages on the destination.
> >> >
> >> > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> >> > Reviewed-by: Juan Quintela <quint...@redhat.com>
> >> > Signed-off-by: Juan Quintela <quint...@redhat.com>
> >>
> >> I've just discovered that this is causing 'make check' failures on
> >> my OSX host (unfortunately something in my setup is causing
> >> 'make check' failures to not always cause a build failure, so I
> >> didn't notice earlier):
> >
> > It's only failing on OSX? Every time or only sometimes?
>
> Only OSX, and always. I think OSX is pickier about mutexes really
> needing to be initialized before use.

OK, at least an 'always' should be easier to debug.

> > If you can find a way to get a backtrace off that qemu_mutex_lock case
> > that would be great; I'd assume the later errors are the fall out from that.
>
> I'll have a look after lunch, but it's usually painful to get a
> backtrace out of this kind of qtest, because it's clearly starting
> a whole pile of QEMUs and there's no way I know of to say "only
> run a few of these tests, not the whole huge pile".

You could add an abort/assert into util/qemu-thread-posix.c qemu_mutex_lock
in the error path.

Could you also add:

diff --git a/migration/migration.c b/migration/migration.c
index 9bd2ce7..85e5766 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -93,6 +93,7 @@ MigrationState *migrate_get_current(void)
     };
 
     if (!once) {
+        fprintf(stderr,"migrate_get_current do init of current_migration %d\n", getpid());
         qemu_mutex_init(&current_migration.src_page_req_mutex);
         once = true;
     }
diff --git a/migration/ram.c b/migration/ram.c
index 4266687..72b46f2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1036,6 +1036,7 @@ static RAMBlock *unqueue_page(MigrationState *ms, ram_addr_t *offset,
 {
     RAMBlock *block = NULL;
 
+    fprintf(stderr,"unqueue_page %d\n", getpid());
     qemu_mutex_lock(&ms->src_page_req_mutex);
     if (!QSIMPLEQ_EMPTY(&ms->src_page_requests)) {
         struct MigrationSrcPageRequest *entry =

and make sure that the init happens before the first unqueue (you'll get loads
of calls to unqueue).

Dave

>
> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK