Anthony Liguori wrote:
I'd strongly recommend working these patches on qemu-devel and lkml. I suspect Avi may disagree with me, but in order for this to be eventually merged in either place, you're going to have additional requirements put on you.

I don't disagree with the fact that there will be additional requirements, but I might disagree with some of those additional requirements themselves.

It actually works out better than I think you expect it to...

Can you explain why? You didn't address my concerns last time around.

Because of the qemu_ram_alloc() patches. We no longer have a contiguous phys_ram_base, so we don't have to deal with mmap(MAP_FIXED). We can also more practically do memory hot-add, which is more or less a requirement of this work.

I think you're arguing my side. If the guest specifies the memory to be shared via an add_buf() sglist allocated from its free memory, you have to use MAP_FIXED (since the gpa->hva mapping is already fixed for guest memory). If it's provided as a BAR or equivalent, we can use a variant of qemu_ram_alloc() which binds to the shared segment instead of allocating.
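
Roughly, as a sketch (the helper name is invented and error handling is omitted), the rebinding case looks like this:

#include <stddef.h>
#include <sys/mman.h>

/* Sketch: rebind an already-mapped guest RAM range (hva) to a shared
 * segment (shm_fd).  MAP_FIXED is unavoidable here because the
 * gpa->hva mapping is already established; the replacement has to
 * land at exactly the same host virtual address. */
static void *bind_guest_range(void *hva, size_t len, int shm_fd)
{
    return mmap(hva, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_FIXED, shm_fd, 0);
}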

It also means we could do shared memory through more traditional means too, like SysV IPC or whatever the native mechanism is on the underlying platform. That means we could even support Win32 (although I wouldn't make that an initial requirement).
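
As an illustration only (names invented, error handling trimmed), the SysV route on the host might look like:

#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Sketch: back the shared region with SysV IPC instead of an
 * mmap()ed file.  With a NULL address, shmat() picks where to
 * attach, so this suits memory QEMU allocates itself rather than
 * guest pages whose hva is already fixed. */
static void *alloc_sysv_region(key_t key, size_t size)
{
    int id = shmget(key, size, IPC_CREAT | 0600);

    if (id < 0)
        return NULL;
    return shmat(id, NULL, 0);  /* returns (void *)-1 on failure */
}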

Not with add_buf() memory...

We can't use mmap() directly. With the new RAM allocation scheme, I think it's pretty reasonable to allow portions of RAM to come from files that get mmap()ed (sort of like -mem-path).
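
Something like this sketch, in the spirit of -mem-path (the helper name is invented):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch: map a file and hand the region to the RAM allocator
 * instead of using anonymous memory. */
static void *map_shared_file(const char *path, size_t size)
{
    void *p;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping survives the close() */
    return p == MAP_FAILED ? NULL : p;
}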

This RAM area could be set up as a BAR.

That's what Cam's patch does, and what you objected to.

I'm flexible. BARs are pretty unattractive because of the size requirements.

What size requirements? The PCI memory hole? Those requirements are easily lifted.

The actual transport implementation is the least important part, though, IMHO. The guest interface and how it's implemented within QEMU are much more important to get right the first time.

I agree, with much more emphasis on the guest/host interface.

Why is that unimplementable?

Bad choice of words; it's implementable, just not very usable. You can't share 1GB in a 256MB guest, it will fragment host vmas, there's no guarantee the guest can actually allocate all that memory, it doesn't work with large pages, what happens on freeing, etc.

You can share 1GB with a PCI BAR today. You're limited to 32-bit addresses, which admittedly we could fix.

Any reason to bother with BARs instead of just picking unused physical addresses? Does Windows do anything special with BAR addresses?

If you use a BAR, you let the host kernel know what you're doing. No doubt you could do the same thing yourself (the PCI support functions call the raw support functions), but if you use a BAR, everything from the BIOS onwards is plumbed down.

Sure, we could do something independent a la vbus, but my preference has always been to behave like real hardware.

Oh, and if it's a BAR you can use device assignment. You can't assign a device that exposes memory the host doesn't know about.


The QEMU bits and the device model bits are actually relatively simple. The part that I think needs deeper thought is the guest-visible interface.

A char device is probably not the best interface. I think you want something like tmpfs/hugetlbfs.

Yes, those are so wonderful to work with.

qemu -ivshmem file=/dev/shm/ring.shared,name=shared-ring,size=1G,notify=/path/to/socket

/path/to/socket is used to pass an eventfd
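
Passing the eventfd across that socket would be the usual SCM_RIGHTS dance; roughly (error handling trimmed):

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Sketch: hand an eventfd to the peer over the notify socket. */
static int send_notify_fd(int sock, int efd)
{
    char byte = 0;
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(cbuf, 0, sizeof(cbuf));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &efd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}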

Within the guest, you'd have:

/dev/ivshmemfs/shared-ring

An app would mmap() that file, and then could do something like an ioctl() to get an eventfd.
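
Roughly (the ioctl is invented for illustration; nothing like it exists yet):

#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Hypothetical ioctl number, for illustration only. */
#define IVSHMEMFS_GET_EVENTFD _IO('i', 0)

/* Sketch of the proposed guest-side usage: map the shared region and
 * fetch an eventfd to use for notification. */
static int open_shared_ring(void **ring, size_t size)
{
    int fd = open("/dev/ivshmemfs/shared-ring", O_RDWR);

    if (fd < 0)
        return -1;
    *ring = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return ioctl(fd, IVSHMEMFS_GET_EVENTFD);
}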

Alternatively, you could have something like:

/dev/ivshmemfs/mem/shared-ring
/dev/ivshmemfs/notify/shared-ring

Where notify/shared-ring behaves like an eventfd().
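
That is, the same 8-byte counter protocol as eventfd(2); as a sketch:

#include <stdint.h>
#include <unistd.h>

/* Sketch of eventfd(2)-style semantics on the notify file: a write
 * adds to a counter and wakes the peer; a read blocks until the
 * counter is nonzero, returns it, and resets it. */
static void notify_peer(int nfd)
{
    uint64_t one = 1;

    write(nfd, &one, sizeof(one));
}

static uint64_t wait_for_notify(int nfd)
{
    uint64_t count = 0;

    read(nfd, &count, sizeof(count));  /* blocks until counter != 0 */
    return count;
}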

Being the traditionalist that I am, I'd much prefer it to be a char device and use udev rules to get a meaningful name if needed. That's how every other real device works.
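
A naming rule would be something like this (the subsystem and attribute names are invented, since no such driver exists):

# Give the ivshmem char device a stable, meaningful name based on a
# 'name' attribute the (hypothetical) driver exports in sysfs.
SUBSYSTEM=="ivshmem", SYMLINK+="ivshmem/%s{name}"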

--
error compiling committee.c: too many arguments to function
