On 04/26/2017 01:01 PM, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>> On 04/24/2017 09:12 PM, Dr. David Alan Gilbert wrote:
>>> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>>>> On 04/24/2017 04:35 PM, Dr. David Alan Gilbert wrote:
>>>>> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>>>>>> On 04/24/2017 12:53 PM, Dr. David Alan Gilbert wrote:
>>>>>>> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>>>>>>>> David, Juan,
>>>>>>>>
>>>>>>>> I can trigger a hang of postcopy migration (when I do it early) so
>>>>>>>> that both sides are in the paused state. It looks like one thread is
>>>>>>>> still loading vmstates for virtio, and this load accesses guest
>>>>>>>> memory and triggers a userfault.
>>>>>>>
>>>>>>> It's perfectly legal for the destination to cause a userfault as it
>>>>>>> loads the virtio queue - the virtio queue should be being loaded by
>>>>>>> the main thread from the buffer while the 'listening' thread is
>>>>>>> waiting for the incoming page data.
>>>>>>>
>>>>>>> Can you turn on the following tracing please:
>>>>>>> destination:
>>>>>>>   postcopy_ram_fault_thread_request
>>>>>>>   postcopy_place_page
>>>>>>>   postcopy_place_page_zero
>>>>>>>
>>>>>>> source:
>>>>>>>   migrate_handle_rp_req_pages
>>>>>>>   ram_save_queue_pages
>>>>>>>
>>>>>>> You should see:
>>>>>>>   virtio does the access
>>>>>>>   userfault generates a fault
>>>>>>>   postcopy_ram_fault_thread_request sends the request to the source
>>>>>>>
>>>>>>>   the source sees migrate_handle_rp_req_pages and queues it with
>>>>>>>   ram_save_queue_pages
>>>>>>>
>>>>>>>   the destination sees the page arrive and postcopy_place_page or
>>>>>>>   postcopy_place_page_zero
>>>>>>>
>>>>>>> Some of that might disappear if the page was already on its way.
>>>>>>
>>>>>> The last events on the source are
>>>>>> [..]
>>>>>> 58412@1493037953.747988:postcopy_place_page host=0x3ff92246000
>>>>>> 58412@1493037953.747992:postcopy_place_page host=0x3ff92247000
>>>>>
>>>>> How do you see those on the source???
>>>>
>>>> It was the previous migration (I did it in a loop back and forth).
>>>> The problem happens on the migration back.
>>>
>>> OK, good.
>>>
>>>>>> 58412@1493037956.804210:migrate_handle_rp_req_pages in s390.ram at 41d9000 len 1000
>>>>>> 58412@1493037956.804216:ram_save_queue_pages s390.ram: start: 41d9000 len: 100
>>>>>
>>>>> Is that a typo? I'm expecting those two 'len' fields to be the same.
>>>>
>>>> Yes, it's a cut'n'paste "miss the last byte".
>>>
>>> Good.
>>>
>>> OK, before digging further: is this a new bug, or does it happen on
>>> older QEMU? Have you got a rune to reproduce it on x86?
>>>
>>>>>> On the target I see lots of
>>>>>>
>>>>>> 39741@1493037958.833710:postcopy_place_page_zero host=0x3ff9befa000
>>>>>> 39741@1493037958.833716:postcopy_place_page host=0x3ff9befb000
>>>>>> 39741@1493037958.833759:postcopy_place_page host=0x3ff9befc000
>>>>>> 39741@1493037958.833818:postcopy_place_page host=0x3ff9befd000
>>>>>> 39741@1493037958.833819:postcopy_place_page_zero host=0x3ff9befe000
>>>>>> 39741@1493037958.833822:postcopy_place_page host=0x3ff9beff000
>>>>>>
>>>>>> So we have about 2 seconds of traffic going on after that request;
>>>>>> I assume it's precopy related.
>>>>>>
>>>>>> Looking back in the target history there was
>>>>>> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for HVA=3ff618d9000 rb=s390.ram offset=41d9000
>>>>>>
>>>>>> In fact it seems to be the first and only request:
>>>>>>
>>>>>> # cat /tmp/test0.trace | grep -v postcopy_place_page
>>>>>> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for HVA=3ff618d9000 rb=s390.ram offset=41d9000
>>>>>
>>>>> OK, does the HVA there correspond to the address that your virtio
>>>>> device is blocking on? (or at least the start of the page)
>>>>
>>>> Yes, it is the same page.
>>>>
>>>>> Do you see a postcopy_place_page with a host= the same HVA?
>>>>
>>>> No.
>>>
>>> Hmm, that's bad.
>>> The flow is:
>>>   Precopy
>>>     (a) Send pages
>>>   Switch to postcopy
>>>     (b) Send discards for pages that have been changed after (a)
>>>   Postcopy
>>>     (c) Keep sending pages until we run out
>>>     (d) But send pages we're asked for first
>>>
>>> (d) might be ignored if the source thinks the page was already sent or
>>> is not dirty.
>>>
>>> So we need to figure out:
>>>   1) If the source sent the page during (a)
>>>   2) If the source discarded it during (b)
>>>   3) If it sent it again any time in (c)/(d)
>>>   4) If it ignored the request from (d)
>>>
>>> So please turn on the traces:
>>>   get_queued_page_not_dirty             - should help with (4)
>>>   get_queued_page                       - should help with (4)
>>>   ram_discard_range                     - should help with (2)
>>>   loadvm_postcopy_ram_handle_discard    - should help with (2)
>>>   qemu_savevm_send_postcopy_ram_discard - should help with (2)
>>>
>>> and add near the top of ram_save_page in ram.c:
>>>   fprintf(stderr, "%s: %s:%zx\n", __func__, block->idstr, (size_t)offset);
>>>
>>> which should help with (1) and (3).
>>
>> source:
>> 28863@1493200554.418978:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419047:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419056:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419064:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419071:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419077:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419083:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419088:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419094:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419100:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419106:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419112:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419117:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419123:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419130:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419138:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419145:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419151:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419156:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419163:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419169:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419174:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419179:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419184:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419189:qemu_savevm_send_postcopy_ram_discard s390.ram: 12d
>> 28863@1493200554.419208:qemu_savevm_send_postcopy_ram_discard s390.ram: 9d
>> 28863@1493200554.818195:migrate_handle_rp_req_pages in s390.ram at 3c6b000 len 1000
>> 28863@1493200554.818208:ram_save_queue_pages s390.ram: start: 3c6b000 len: 1000
>> 28863@1493200554.818469:get_queued_page_not_dirty s390.ram/3c6b000 ram_addr=3c6b000 (sent=0)
>
> OK, the source is ignoring the request because it thinks the page has
> already been sent and hasn't been redirtied.
>
>> I also get
>> ram_save_page: s390.ram:3c6b000
>
> And you said in your other message that this happens before the
> discards, which means it's precopy, and that's why you don't see the
> 'place_page'.
>
>> target (without postcopy_place)
>> 10929@1493200553.657966:ram_discard_range s390.ram: start: 0 5dc00000
>
> That's start of day, I think, so we can ignore that (I think!)
>
>> 10929@1493200554.428226:loadvm_postcopy_ram_handle_discard
>
> ....
>
>> 10929@1493200554.438231:ram_discard_range s390.ram: start: 3bf6000 1000
>> 10929@1493200554.438233:ram_discard_range s390.ram: start: 3bfd000 1000
>> 10929@1493200554.438235:ram_discard_range s390.ram: start: 3c06000 4000
>> 10929@1493200554.438240:loadvm_postcopy_ram_handle_discard
>> 10929@1493200554.438241:ram_discard_range s390.ram: start: 3c79000 3000
>
> Well, it hasn't discarded the 3c6b000 page, so the page we received
> during precopy should still be there, and it shouldn't fault.
>
> In migration/ram.c, in ram_load, before the line:
>     switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
>
> can you add:
>     fprintf(stderr, "%s: addr=%zx flags=%zx host=%p\n", __func__, addr, flags, host);
>
> That should show the incoming blocks during precopy, and we should be
> able to see the page coming in.
>
> Hmm, I have a theory: if the flags field has bit 1 set, i.e.
> RAM_SAVE_FLAG_COMPRESS, then try changing ram_handle_compressed to
> always do the memset.
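The suggested experiment boils down to removing the zero-page shortcut. The sketch below is a minimal, self-contained model of that skip logic, not QEMU's actual code: the `is_zero_range()` helper and the `page_writes` counter are stand-ins, the counter representing the "does the kernel see a write to this page?" question that matters once the page has been discarded for postcopy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts how often we actually write to the page (illustrative stand-in
 * for whether the kernel would observe a touch on that page). */
static unsigned long page_writes;

static void page_memset(void *host, int ch, size_t size)
{
    page_writes++;
    memset(host, ch, size);
}

/* Stand-in for QEMU's zero-range check. */
static int is_zero_range(const uint8_t *p, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Stock shape of ram_handle_compressed(): skip the write when the fill
 * byte is zero and the page already reads as zero. */
static void handle_compressed_stock(uint8_t *host, uint8_t ch, size_t size)
{
    if (ch != 0 || !is_zero_range(host, size)) {
        page_memset(host, ch, size);
    }
}

/* The experiment: always do the memset, so the page is always touched. */
static void handle_compressed_always(uint8_t *host, uint8_t ch, size_t size)
{
    page_memset(host, ch, size);
}
```

For a zero fill byte landing on an already-zero page, the stock variant never writes at all; the "always" variant guarantees a write, which is the behavioural difference the theory hinges on.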
Lots of output with the following flags (some have bit 1 set, others not):

flags=10
flags=22
flags=28
flags=4
flags=8

The interesting area is:

ram_load: addr=3c6a000 flags=28 host=0x3ff21f6a000
ram_load: addr=3c6b000 flags=28 host=0x3ff21f6b000
ram_load: addr=3c6c000 flags=28 host=0x3ff21f6c000
ram_load: addr=3c6d000 flags=22 host=0x3ff21f6d000
ram_load: addr=3c6e000 flags=28 host=0x3ff21f6e000
ram_load: addr=3c6f000 flags=22 host=0x3ff21f6f000

and this time it hung on address 3c6d000 (which has different flags than
the others).

I can send you the full file if you like, but your theory seems to fully
match the log.
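For reference, those hex flags values decode against the RAM_SAVE_FLAG_* bits from migration/ram.c (2017-era names). The `#define` values below are the real QEMU constants; the small checker function is just for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* RAM_SAVE_FLAG_* bits as defined in migration/ram.c (2017-era names). */
#define RAM_SAVE_FLAG_COMPRESS 0x02 /* page is a single fill byte (zero page) */
#define RAM_SAVE_FLAG_MEM_SIZE 0x04
#define RAM_SAVE_FLAG_PAGE     0x08 /* a full page of data follows */
#define RAM_SAVE_FLAG_EOS      0x10
#define RAM_SAVE_FLAG_CONTINUE 0x20 /* same RAMBlock as the previous record */

/* Illustrative helper (not from QEMU): does this record carry a
 * compressed/zero page, i.e. is bit 1 set? */
static int is_compressed_record(uint64_t flags)
{
    return (flags & RAM_SAVE_FLAG_COMPRESS) != 0;
}
```

So flags=28 is CONTINUE|PAGE (a normal data page), while the hung page's flags=22 is CONTINUE|COMPRESS, exactly the zero-page path the theory points at.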