On 17.07.19 13:22, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 01:10:21PM +0200, David Hildenbrand wrote:
>> On 17.07.19 13:06, Michael S. Tsirkin wrote:
>>> On Wed, Jul 17, 2019 at 12:17:57PM +0200, David Hildenbrand wrote:
>>>> On 17.07.19 12:04, David Hildenbrand wrote:
>>>>> On 17.07.19 11:57, Michael S. Tsirkin wrote:
>>>>>> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>>>>>>> We are using the wrong functions to set/clear bits, effectively touching
>>>>>>> multiple bits, writing out of range of the bitmap, resulting in memory
>>>>>>> corruptions. We have to use set_bit()/clear_bit() instead.
>>>>>>>
>>>>>>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>>>>>>> inflating the balloon. QEMU crashes. This never could have worked
>>>>>>> properly - especially, also pages would have been discarded when the
>>>>>>> first sub-page would be inflated (the whole bitmap would be set).
>>>>>>>
>>>>>>> While testing I realized, that on hugetlbfs it is pretty much impossible
>>>>>>> to discard a page - the guest just frees the 4k sub-pages in random order
>>>>>>> most of the time. I was only able to discard a hugepage a handful of
>>>>>>> times - so I hope that now works correctly.
>>>>>>>
>>>>>>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
>>>>>>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
>>>>>>> Cc: qemu-sta...@nongnu.org #v4.0.0
>>>>>>> Cc: Stefan Hajnoczi <stefa...@redhat.com>
>>>>>>> Cc: David Gibson <da...@gibson.dropbear.id.au>
>>>>>>> Cc: Michael S. Tsirkin <m...@redhat.com>
>>>>>>> Cc: Igor Mammedov <imamm...@redhat.com>
>>>>>>> Signed-off-by: David Hildenbrand <da...@redhat.com>
>>>>>>> ---
>>>>>>>  hw/virtio/virtio-balloon.c | 10 ++++------
>>>>>>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>>>>> index e85d1c0d5c..669067d661 100644
>>>>>>> --- a/hw/virtio/virtio-balloon.c
>>>>>>> +++ b/hw/virtio/virtio-balloon.c
>>>>>>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>>>>>>>          balloon->pbp->base = host_page_base;
>>>>>>>      }
>>>>>>>
>>>>>>> -    bitmap_set(balloon->pbp->bitmap,
>>>>>>> -               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> -               subpages);
>>>>>>> +    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> +            balloon->pbp->bitmap);
>>>>>>>
>>>>>>>      if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>>>>>>>          /* We've accumulated a full host page, we can actually discard
>>>>>>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>>>>>>           * for a guest to do this in practice, but handle it anyway,
>>>>>>>           * since getting it wrong could mean discarding memory the
>>>>>>>           * guest is still using. */
>>>>>>> -        bitmap_clear(balloon->pbp->bitmap,
>>>>>>> -                     (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> -                     subpages);
>>>>>>> +        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>>>>>>> +                  balloon->pbp->bitmap);
>>>>>>>
>>>>>>>          if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>>>>>>>              g_free(balloon->pbp);
>>>>>>
>>>>>> I also started to wonder about this:
>>>>>>
>>>>>>     if (!balloon->pbp) {
>>>>>>         /* Starting on a new host page */
>>>>>>         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>>>>>>         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>>>>>>         balloon->pbp->rb = rb;
>>>>>>         balloon->pbp->base = host_page_base;
>>>>>>     }
>>>>>>
>>>>>> Is keeping a pointer to a ram block like this safe? what if the ramblock
>>>>>> gets removed?
>>>>>>
>>>>>
>>>>> David added
>>>>>
>>>>>     if (balloon->pbp
>>>>>         && (rb != balloon->pbp->rb ) ...
>>>>>
>>>>> So in case the rb changes (IOW replaced - delete old one, new one
>>>>> added), we reset the data.
>>>>>
>>>>> After a ram block was deleted, there will be no more deflation requests
>>>>> coming in for it. This should be fine I guess.
>>>
>>> I think it might happen that an old dangling pointer happens
>>> to match a newly allocated one.
>>> I think we really should just cache all data we want to take into account
>>> and compare that.
>>
>> That's true. I think just remembering and comparing the GPA base address
>> would be sufficient.
>
> Well we need to know the bitmap size allocated, too.
> And I guess when we are ready to free we should
> re-check it just in case.

Right, either that or the page size, which is orthogonal.

>
>> However, I don't consider this here to trigger easily. We would need
>> some crazy memory unplug+replug going on while using the balloon. So I
>> assume we can just rework this part after 4.1
>
> Dangling pointers are just a recipe for CVEs. I'd rather rework it now.
>

If they are not dereferenced, I don't consider it an ultimate problem.
But yeah, I'll look into that tomorrow.

Can you pick up these patches in the meantime? Thanks!

-- 

Thanks,

David / dhildenb
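
For readers following the patch, here is a minimal standalone sketch of why the range call was wrong. It does not use QEMU's bitmap code: demo_range_set() and demo_set_bit() are hypothetical stand-ins that only mirror the argument order of bitmap_set(map, start, nr) and set_bit(nr, addr), and the 512 sub-pages per host page (2 MiB / 4 KiB) is an assumed example value. The demo bitmap is deliberately twice as large as needed, so the stray bits can be counted rather than corrupting memory.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_SUBPAGES 512                  /* assumed: 2 MiB host page / 4 KiB balloon page */
#define DEMO_MAP_BITS (2 * DEMO_SUBPAGES)  /* oversized so the demo never writes out of bounds */
#define DEMO_MAP_LONGS (DEMO_MAP_BITS / DEMO_BITS_PER_LONG)

/* Stand-in for a range operation: sets nr bits starting at bit 'start'. */
void demo_range_set(unsigned long *map, size_t start, size_t nr)
{
    assert(start + nr <= DEMO_MAP_BITS);
    for (size_t i = start; i < start + nr; i++) {
        map[i / DEMO_BITS_PER_LONG] |= 1UL << (i % DEMO_BITS_PER_LONG);
    }
}

/* Stand-in for a single-bit operation: sets exactly bit 'nr'. */
void demo_set_bit(size_t nr, unsigned long *map)
{
    assert(nr < DEMO_MAP_BITS);
    map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

/* Counts the set bits in the half-open bit range [from, to). */
size_t demo_count(const unsigned long *map, size_t from, size_t to)
{
    size_t n = 0;
    for (size_t i = from; i < to; i++) {
        n += (map[i / DEMO_BITS_PER_LONG] >> (i % DEMO_BITS_PER_LONG)) & 1UL;
    }
    return n;
}

/*
 * Marks sub-page 'idx' as inflated in a fresh bitmap, either with the range
 * call (use_range != 0, the pre-patch pattern) or with the single-bit call.
 * Returns the number of bits set inside [0, DEMO_SUBPAGES) and stores the
 * number of bits set beyond the real bitmap in *stray.
 */
size_t demo_mark(size_t idx, int use_range, size_t *stray)
{
    unsigned long map[DEMO_MAP_LONGS];

    assert(idx < DEMO_SUBPAGES);
    memset(map, 0, sizeof(map));
    if (use_range) {
        demo_range_set(map, idx, DEMO_SUBPAGES);
    } else {
        demo_set_bit(idx, map);
    }
    *stray = demo_count(map, DEMO_SUBPAGES, DEMO_MAP_BITS);
    return demo_count(map, 0, DEMO_SUBPAGES);
}
```

With the range variant, demo_mark(0, 1, &stray) fills all 512 in-range bits after a single sub-page, which corresponds to the "whole bitmap would be set" case from the commit message, and any idx > 0 leaves idx stray bits past the end. In the real code the bitmap is only BITS_TO_LONGS(subpages) longs, so those stray bits would land outside the allocation.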