*** This bug is a duplicate of bug 1572291 ***
    https://bugs.launchpad.net/bugs/1572291

------- Comment From [email protected] 2016-09-02 09:22 EDT-------
>From the dmesg it looks like this time ext4 page allocation stumbles upon the 
>doubly freed page first, but it is immediately after the page got corrupted by 
>the double free (indicated by the WARNING), so this just means that ext4 
>happened to be the first to get its fingers on the corrupted page during a 
>page alloc. It could hit anyone, and we also see later another occurrence 
>where copy_pte_range() stumbles over another corrupted page (no WARNING before 
>that because it is a WARN_ONCE).

We still need to find the root cause of the double free and the
resulting page corruption (count -1), and for that the WARNING trace is
our only reliable hint of a double free. So my analysis from comment #5
is still valid: even though genwqe itself did not stumble over the
corrupted page this time, it was still involved in the double free.
Anyone can see the corrupted page afterwards; genwqe was just a more
likely candidate because it was an active consumer at the time.

BTW, a call of dma_free() on previously unmapped addresses would of
course result in the same issue as a "double free", but a double free
is much more likely, e.g. caused by broken error handling with an
"off by one" or other issues. Speaking of error handling, the
"genwqe 0001:00:00.0: [genwqe_map_pages] err: no dma addr
daddr=ffffffffffffffff!" messages may be a good starting point for
verifying the genwqe error handling and the page freeing strategy. Those
messages by themselves are no problem and are even expected given the
nature of the test (online/offline and failing rpcit), but there is some
error handling involved, which may have issues that could lead to a
double free.
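
To illustrate the kind of "off by one" in an error path meant above, here
is a minimal userspace sketch. It is not the genwqe code: page_count[],
fake_map(), fake_unmap() and map_pages() are all hypothetical names, and
the counter only mimics a page count. It shows how an unwind loop that
runs one index too far releases an entry that was never mapped and drives
its count to -1.

```c
#include <stddef.h>

#define NPAGES 4

/* Hypothetical per-page count: 1 = mapped, 0 = free, -1 = released once too often. */
static int page_count[NPAGES];

static void fake_map(size_t i)   { page_count[i]++; }
static void fake_unmap(size_t i) { page_count[i]--; }

static void reset_pages(void)
{
    for (size_t i = 0; i < NPAGES; i++)
        page_count[i] = 0;
}

/*
 * Map pages [0, n) and simulate a mapping failure at index fail_at.
 * With buggy == 0 the unwind releases only the pages mapped so far.
 * With buggy != 0 the unwind loop uses "<=" and also releases index
 * fail_at, which was never mapped (the assumed "off by one").
 */
static int map_pages(size_t n, size_t fail_at, int buggy)
{
    size_t i, j;

    for (i = 0; i < n && i < NPAGES; i++) {
        if (i == fail_at)
            goto unwind;
        fake_map(i);
    }
    return 0;

unwind:
    for (j = 0; buggy ? j <= i : j < i; j++)
        fake_unmap(j);
    return -1;
}

/* Smallest count over all pages; a negative value means corruption. */
static int min_page_count(void)
{
    int min = page_count[0];

    for (size_t i = 1; i < NPAGES; i++)
        if (page_count[i] < min)
            min = page_count[i];
    return min;
}
```

Calling map_pages(4, 2, 0) leaves every count at 0 after the failure,
while map_pages(4, 2, 1) leaves page 2 at -1, which is the same
signature as the corrupted page seen here.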

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1559194

Title:
  Bad page state in process genwqe_gunzip pfn:3c275 in the genwqe device
  driver

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-release-notes/+bug/1559194/+subscriptions
