On 26.05.25 12:19, Chenyi Qiang wrote:


On 5/26/2025 5:17 PM, David Hildenbrand wrote:
On 20.05.25 12:28, Chenyi Qiang wrote:
The current error handling is simple with the following assumption:
- QEMU will quit instead of resuming the guest if kvm_convert_memory()
    fails, thus no need to do rollback.
- The convert range is required to be in the desired state. It is not
    allowed to handle the mixture case.
- The conversion from shared to private is a non-failure operation.

This is sufficient for now as complext error handling is not required.
For future extension, add some potential error handling.
- For private to shared conversion, do the rollback operation if
    ram_block_attribute_notify_to_populated() fails.
- For shared to private conversion, still assert it as a non-failure
    operation for now. It could be an easy fail path with in-place
    conversion, which will likely have to retry the conversion until it
    works in the future.
- For mixture case, process individual blocks for ease of rollback.

Signed-off-by: Chenyi Qiang <chenyi.qi...@intel.com>
---
   system/ram-block-attribute.c | 116 +++++++++++++++++++++++++++--------
   1 file changed, 90 insertions(+), 26 deletions(-)

diff --git a/system/ram-block-attribute.c b/system/ram-block-attribute.c
index 387501b569..0af3396aa4 100644
--- a/system/ram-block-attribute.c
+++ b/system/ram-block-attribute.c
@@ -289,7 +289,12 @@ static int
ram_block_attribute_notify_to_discard(RamBlockAttribute *attr,
           }
           ret = rdl->notify_discard(rdl, &tmp);
           if (ret) {
-            break;
+            /*
+             * The current to_private listeners (VFIO dma_unmap and
+             * KVM set_attribute_private) are non-failing operations.
+             * TODO: add rollback operations if it is allowed to fail.
+             */
+            g_assert(ret);
           }
       }

If it's not allowed to fail for now, then patch #8 does not make sense
and should be dropped :)

It was intended for future extension as in-place conversion to_private
allows it to fail. So I add the patch #8.

But as you mentioned, since the conversion path is changing, and maybe
it is easier to handle from KVM code directly. Let me drop patch #8 and
wait for the in-place conversion to mature.

Makes sense. I'm afraid it might all be a bit complicated to handle: vfio can fail private -> shared conversion and KVM the shared -> private conversion.

So recovering ... will not be straight forward once multiple pages are converted.



The implementations (vfio) should likely exit() instead on unexpected
errors when discarding.

After drop patch #8, maybe keep vfio discard handling as it was. Adding
an additional exit() is also OK to me since it's non-fail case.




Why not squash all the below into the corresponding patch? Looks mostly
like handling partial conversions correctly (as discussed previously)?

I extract these two handling 1) mixture conversion; 2) operation
rollback into this individual patch because they are not the practical
cases and are untested.

For 1), I still don't see any real case which will convert a range with
mixture attributes.

Okay. I thought we were not sure if the guest could trigger that?

I think it would be better to just include the "mixture" handling in the original patch.


For 2), current failure of memory conversion (as seen in kvm_cpu_exec()
->kvm_convert_memory()) will cause the QEMU to quit instead of resuming
guest. Doing the rollback seems useless at present.

Makes sense.

--
Cheers,

David / dhildenb


Reply via email to