Re: [PATCH] vhost-vdpa: fix memory leak in error path
On 2020/9/9 下午11:41, Li Qiang wrote: Free the 'page_list' when the 'npages' is zero. Signed-off-by: Li Qiang --- drivers/vhost/vdpa.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 3fab94f88894..6a9fcaf1831d 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -609,8 +609,10 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, gup_flags |= FOLL_WRITE; npages = PAGE_ALIGN(msg->size + (iova & ~PAGE_MASK)) >> PAGE_SHIFT; - if (!npages) - return -EINVAL; + if (!npages) { + ret = -EINVAL; + goto free_page; + } mmap_read_lock(dev->mm); @@ -666,6 +668,8 @@ static int vhost_vdpa_process_iotlb_update(struct vhost_vdpa *v, atomic64_sub(npages, >mm->pinned_vm); } mmap_read_unlock(dev->mm); + +free_page: free_page((unsigned long)page_list); return ret; } Cc: sta...@vger.kernel.org Acked-by: Jason Wang ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vhost: new vhost_vdpa SET/GET_BACKEND_FEATURES handlers
Hi Zhu, url: https://github.com/0day-ci/linux/commits/Zhu-Lingshan/vhost-new-vhost_vdpa-SET-GET_BACKEND_FEATURES-handlers/20200909-115726 base: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git linux-next config: parisc-randconfig-m031-20200909 (attached as .config) compiler: hppa-linux-gcc (GCC) 9.3.0 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot Reported-by: Dan Carpenter New smatch warnings: drivers/vhost/vdpa.c:356 vhost_vdpa_get_backend_features() warn: maybe return -EFAULT instead of the bytes remaining? # https://github.com/0day-ci/linux/commit/f2da8fc35b4ef003de7d559a8c7730fedf9926f8 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Zhu-Lingshan/vhost-new-vhost_vdpa-SET-GET_BACKEND_FEATURES-handlers/20200909-115726 git checkout f2da8fc35b4ef003de7d559a8c7730fedf9926f8 vim +356 drivers/vhost/vdpa.c f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 348 static long vhost_vdpa_get_backend_features(void __user *argp) f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 349 { f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 350 u64 features = VHOST_VDPA_BACKEND_FEATURES; f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 351 u64 __user *featurep = argp; f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 352 long r; f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 353 f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 354 r = copy_to_user(featurep, , sizeof(features)); f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 355 f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 @356 return r; copy_to_user() returns the number of bytes remaining. I haven't looked at how this is called but it should probably return negative error codes on error: if (copy_to_user(featurep, , sizeof(features))) return -EFAULT; return 0; f2da8fc35b4ef0 Zhu Lingshan 2020-09-09 357 } --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 71/72] x86/efi: Add GHCB mappings when SEV-ES is active
On 9/9/20 7:44 AM, Laszlo Ersek wrote: On 09/09/20 10:27, Ard Biesheuvel wrote: (adding Laszlo and Brijesh) On Tue, 8 Sep 2020 at 20:46, Borislav Petkov wrote: + Ard so that he can ack the efi bits. On Mon, Sep 07, 2020 at 03:16:12PM +0200, Joerg Roedel wrote: From: Tom Lendacky Calling down to EFI runtime services can result in the firmware performing VMGEXIT calls. The firmware is likely to use the GHCB of the OS (e.g., for setting EFI variables), I've had to stare at this for a while. Because, normally a VMGEXIT is supposed to occur like this: - guest does something privileged - resultant non-automatic exit (NAE) injects a #VC exception - exception handler figures out what that privileged thing was - exception handler submits request to hypervisor via GHCB contents plus VMGEXIT instruction Point being, the agent that "owns" the exception handler is supposed to pre-allocate or otherwise provide the GHCB too, for information passing. So... what is the particular NAE that occurs during the execution of UEFI runtime services (at OS runtime)? And assuming it occurs, I'm unsure if the exception handler (IDT) at that point is owned (temporarily?) by the firmware. - If the #VC handler comes from the firmware, then I don't know why it would use the OS's GHCB. - If the #VC handler comes from the OS, then I don't understand why the commit message says "firmware performing VMGEXIT", given that in this case it would be the OS's #VC handler executing VMGEXIT. So, I think the above commit message implies a VMGEXIT *without* a NAE / #VC context. (Because, I fail to interpret the commit message in a NAE / #VC context in any way; see above.) Correct. OK, so let's see where the firmware performs a VMGEXIT *outside* of an exception handler, *while* at OS runtime. There seems to be one, in file "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlashDxe.c": Again, correct. Basically this is what is invoked when setting UEFI variables. VOID QemuFlashPtrWrite ( INvolatile UINT8*Ptr, INUINT8 Value ) { if (MemEncryptSevEsIsEnabled ()) { MSR_SEV_ES_GHCB_REGISTER Msr; GHCB *Ghcb; Msr.GhcbPhysicalAddress = AsmReadMsr64 (MSR_SEV_ES_GHCB); Ghcb = Msr.Ghcb; // // Writing to flash is emulated by the hypervisor through the use of write // protection. This won't work for an SEV-ES guest because the write won't // be recognized as a true MMIO write, which would result in the required // #VC exception. Instead, use the the VMGEXIT MMIO write support directly // to perform the update. // VmgInit (Ghcb); Ghcb->SharedBuffer[0] = Value; Ghcb->SaveArea.SwScratch = (UINT64) (UINTN) Ghcb->SharedBuffer; VmgExit (Ghcb, SVM_EXIT_MMIO_WRITE, (UINT64) (UINTN) Ptr, 1); VmgDone (Ghcb); } else { *Ptr = Value; } } This function *does* run at OS runtime (as a part of non-volatile UEFI variable writes). And note that, wherever MSR_SEV_ES_GHCB points to at the moment, is used as GHCB. If the guest kernel allocates its own GHCB and writes the allocation address to MSR_SEV_ES_GHCB, then indeed the firmware will use the GHCB of the OS. I reviewed edk2 commit 437eb3f7a8db ("OvmfPkg/QemuFlashFvbServicesRuntimeDxe: Bypass flash detection with SEV-ES", 2020-08-17), but I admit I never thought of the guest OS changing MSR_SEV_ES_GHCB. I'm sorry about that. As long as this driver is running before OS runtime (i.e., during the DXE and BDS phases), MSR_SEV_ES_GHCB is supposed to carry the value we set in "OvmfPkg/PlatformPei/AmdSev.c": STATIC VOID AmdSevEsInitialize ( VOID ) { VOID *GhcbBase; PHYSICAL_ADDRESS GhcbBasePa; UINTN GhcbPageCount, PageCount; RETURN_STATUS PcdStatus, DecryptStatus; IA32_DESCRIPTOR Gdtr; VOID *Gdt; if (!MemEncryptSevEsIsEnabled ()) { return; } PcdStatus = PcdSetBoolS (PcdSevEsIsEnabled, TRUE); ASSERT_RETURN_ERROR (PcdStatus); // // Allocate GHCB and per-CPU variable pages. // Since the pages must survive across the UEFI to OS transition // make them reserved. // GhcbPageCount = mMaxCpuCount * 2; GhcbBase = AllocateReservedPages (GhcbPageCount); ASSERT (GhcbBase != NULL); GhcbBasePa = (PHYSICAL_ADDRESS)(UINTN) GhcbBase; // // Each vCPU gets two consecutive pages, the first is the GHCB and the // second is the per-CPU variable page. Loop through the allocation and // only clear the encryption mask for the GHCB pages. // for (PageCount = 0; PageCount < GhcbPageCount; PageCount += 2) { DecryptStatus = MemEncryptSevClearPageEncMask ( 0, GhcbBasePa + EFI_PAGES_TO_SIZE (PageCount), 1, TRUE ); ASSERT_RETURN_ERROR (DecryptStatus); } ZeroMem (GhcbBase, EFI_PAGES_TO_SIZE (GhcbPageCount)); PcdStatus = PcdSet64S (PcdGhcbBase, GhcbBasePa); ASSERT_RETURN_ERROR
Re: [PATCH v7 71/72] x86/efi: Add GHCB mappings when SEV-ES is active
On 09/09/20 14:44, Laszlo Ersek wrote: > To summarize: for QemuFlashFvbServicesRuntimeDxe to allocate UEFI > Runtime Services Data type memory, for its own runtime GHCB, two > permissions are necessary (together), at OS runtime: > > - QemuFlashFvbServicesRuntimeDxe must be allowed to swap MSR_SEV_ES_GHCB > temporarily (before executing VMGEXIT), > > - QemuFlashFvbServicesRuntimeDxe must be allowed to change the OS-owned > PTE temporarily (for remapping the GHCB as plaintext, before writing > to it). Condition#2 gets worse: If the firmware-allocated runtime GHCB happens to be virt-mapped by the OS using a 2MB page (covering other UEFI runtime data areas, perhaps?), then simply flipping the encryption bit in QemuFlashFvbServicesRuntimeDxe would mark more runtime memory than just the GHCB as "plaintext". (2MB-4KB specifically.) This could result in: - firmware read accesses outside of the GHCB returning garbage - firmware write accesses ouside of the GHCB leaking information to the hypervisor, and reading back later as garbage In order to prevent those symptoms, the firmware would have to split the 2MB page to 4KB pages, and decrypt just the one (GHCB) page. But page splitting requires additional memory (for the finer granularity page table), and fw memory cannot be allocated at OS runtime. So that extra memory too would have to be pre-allocated by the firmware. Nasty. I'd recommend sticking with this kernel patch. Thanks, Laszlo ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v7 71/72] x86/efi: Add GHCB mappings when SEV-ES is active
On 09/09/20 10:27, Ard Biesheuvel wrote: > (adding Laszlo and Brijesh) > > On Tue, 8 Sep 2020 at 20:46, Borislav Petkov wrote: >> >> + Ard so that he can ack the efi bits. >> >> On Mon, Sep 07, 2020 at 03:16:12PM +0200, Joerg Roedel wrote: >>> From: Tom Lendacky >>> >>> Calling down to EFI runtime services can result in the firmware >>> performing VMGEXIT calls. The firmware is likely to use the GHCB of >>> the OS (e.g., for setting EFI variables), I've had to stare at this for a while. Because, normally a VMGEXIT is supposed to occur like this: - guest does something privileged - resultant non-automatic exit (NAE) injects a #VC exception - exception handler figures out what that privileged thing was - exception handler submits request to hypervisor via GHCB contents plus VMGEXIT instruction Point being, the agent that "owns" the exception handler is supposed to pre-allocate or otherwise provide the GHCB too, for information passing. So... what is the particular NAE that occurs during the execution of UEFI runtime services (at OS runtime)? And assuming it occurs, I'm unsure if the exception handler (IDT) at that point is owned (temporarily?) by the firmware. - If the #VC handler comes from the firmware, then I don't know why it would use the OS's GHCB. - If the #VC handler comes from the OS, then I don't understand why the commit message says "firmware performing VMGEXIT", given that in this case it would be the OS's #VC handler executing VMGEXIT. So, I think the above commit message implies a VMGEXIT *without* a NAE / #VC context. (Because, I fail to interpret the commit message in a NAE / #VC context in any way; see above.) OK, so let's see where the firmware performs a VMGEXIT *outside* of an exception handler, *while* at OS runtime. There seems to be one, in file "OvmfPkg/QemuFlashFvbServicesRuntimeDxe/QemuFlashDxe.c": > VOID > QemuFlashPtrWrite ( > INvolatile UINT8*Ptr, > INUINT8 Value > ) > { > if (MemEncryptSevEsIsEnabled ()) { > MSR_SEV_ES_GHCB_REGISTER Msr; > GHCB *Ghcb; > > Msr.GhcbPhysicalAddress = AsmReadMsr64 (MSR_SEV_ES_GHCB); > Ghcb = Msr.Ghcb; > > // > // Writing to flash is emulated by the hypervisor through the use of write > // protection. This won't work for an SEV-ES guest because the write won't > // be recognized as a true MMIO write, which would result in the required > // #VC exception. Instead, use the the VMGEXIT MMIO write support directly > // to perform the update. > // > VmgInit (Ghcb); > Ghcb->SharedBuffer[0] = Value; > Ghcb->SaveArea.SwScratch = (UINT64) (UINTN) Ghcb->SharedBuffer; > VmgExit (Ghcb, SVM_EXIT_MMIO_WRITE, (UINT64) (UINTN) Ptr, 1); > VmgDone (Ghcb); > } else { > *Ptr = Value; > } > } This function *does* run at OS runtime (as a part of non-volatile UEFI variable writes). And note that, wherever MSR_SEV_ES_GHCB points to at the moment, is used as GHCB. If the guest kernel allocates its own GHCB and writes the allocation address to MSR_SEV_ES_GHCB, then indeed the firmware will use the GHCB of the OS. I reviewed edk2 commit 437eb3f7a8db ("OvmfPkg/QemuFlashFvbServicesRuntimeDxe: Bypass flash detection with SEV-ES", 2020-08-17), but I admit I never thought of the guest OS changing MSR_SEV_ES_GHCB. I'm sorry about that. As long as this driver is running before OS runtime (i.e., during the DXE and BDS phases), MSR_SEV_ES_GHCB is supposed to carry the value we set in "OvmfPkg/PlatformPei/AmdSev.c": > STATIC > VOID > AmdSevEsInitialize ( > VOID > ) > { > VOID *GhcbBase; > PHYSICAL_ADDRESS GhcbBasePa; > UINTN GhcbPageCount, PageCount; > RETURN_STATUS PcdStatus, DecryptStatus; > IA32_DESCRIPTOR Gdtr; > VOID *Gdt; > > if (!MemEncryptSevEsIsEnabled ()) { > return; > } > > PcdStatus = PcdSetBoolS (PcdSevEsIsEnabled, TRUE); > ASSERT_RETURN_ERROR (PcdStatus); > > // > // Allocate GHCB and per-CPU variable pages. > // Since the pages must survive across the UEFI to OS transition > // make them reserved. > // > GhcbPageCount = mMaxCpuCount * 2; > GhcbBase = AllocateReservedPages (GhcbPageCount); > ASSERT (GhcbBase != NULL); > > GhcbBasePa = (PHYSICAL_ADDRESS)(UINTN) GhcbBase; > > // > // Each vCPU gets two consecutive pages, the first is the GHCB and the > // second is the per-CPU variable page. Loop through the allocation and > // only clear the encryption mask for the GHCB pages. > // > for (PageCount = 0; PageCount < GhcbPageCount; PageCount += 2) { > DecryptStatus = MemEncryptSevClearPageEncMask ( > 0, > GhcbBasePa + EFI_PAGES_TO_SIZE (PageCount), > 1, > TRUE > ); > ASSERT_RETURN_ERROR (DecryptStatus); > } > > ZeroMem (GhcbBase, EFI_PAGES_TO_SIZE (GhcbPageCount)); > > PcdStatus = PcdSet64S (PcdGhcbBase, GhcbBasePa); > ASSERT_RETURN_ERROR (PcdStatus); > PcdStatus =
Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
On 09.09.20 13:37, David Hildenbrand wrote: > On 09.09.20 13:24, Michael Ellerman wrote: >> David Hildenbrand writes: >>> On 09.09.20 09:17, Greg Kroah-Hartman wrote: On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote: > We soon want to pass flags, e.g., to mark added System RAM resources. > mergeable. Prepare for that. What are these random "flags", and how do we know what should be passed to them? Why not make this an enumerated type so that we know it all works properly, like the GPF_* flags are? Passing around a random unsigned long feels very odd/broken... >>> >>> Agreed, an enum (mhp_flags) seems to give a better hint what can >>> actually be passed. Thanks! >> >> You probably know this but ... >> >> Just using a C enum doesn't get you any type safety. >> >> You can get some checking via sparse by using __bitwise, which is what >> gfp_t does. You don't actually have to use an enum for that, it works >> with #defines also. > > Yeah, we seem to be using different approaches. And there is always a > way to mess things up :) > > gfp_t is one (extreme) example, enum memblock_flags is another example. > I tend to prefer an enum in this particular case, because it's simple > and at least tells the user which values are expected. > Gave it another try, looks like mhp_t (like gfp_t) is actually nicer. -- Thanks, David / dhildenb ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
On 09.09.20 13:24, Michael Ellerman wrote: > David Hildenbrand writes: >> On 09.09.20 09:17, Greg Kroah-Hartman wrote: >>> On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote: We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. >>> >>> What are these random "flags", and how do we know what should be passed >>> to them? >>> >>> Why not make this an enumerated type so that we know it all works >>> properly, like the GPF_* flags are? Passing around a random unsigned >>> long feels very odd/broken... >> >> Agreed, an enum (mhp_flags) seems to give a better hint what can >> actually be passed. Thanks! > > You probably know this but ... > > Just using a C enum doesn't get you any type safety. > > You can get some checking via sparse by using __bitwise, which is what > gfp_t does. You don't actually have to use an enum for that, it works > with #defines also. Yeah, we seem to be using different approaches. And there is always a way to mess things up :) gfp_t is one (extreme) example, enum memblock_flags is another example. I tend to prefer an enum in this particular case, because it's simple and at least tells the user which values are expected. Thoughts? > > Or you can wrap the flag in a struct, the way atomic_t does, and then > the compiler will prevent passing plain integers in place of your custom > type. -- Thanks, David / dhildenb ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
David Hildenbrand writes: > On 09.09.20 09:17, Greg Kroah-Hartman wrote: >> On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote: >>> We soon want to pass flags, e.g., to mark added System RAM resources. >>> mergeable. Prepare for that. >> >> What are these random "flags", and how do we know what should be passed >> to them? >> >> Why not make this an enumerated type so that we know it all works >> properly, like the GPF_* flags are? Passing around a random unsigned >> long feels very odd/broken... > > Agreed, an enum (mhp_flags) seems to give a better hint what can > actually be passed. Thanks! You probably know this but ... Just using a C enum doesn't get you any type safety. You can get some checking via sparse by using __bitwise, which is what gfp_t does. You don't actually have to use an enum for that, it works with #defines also. Or you can wrap the flag in a struct, the way atomic_t does, and then the compiler will prevent passing plain integers in place of your custom type. cheers ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] vhost_vdpa: remove unnecessary spin_lock in vhost_vring_call
On 2020/9/9 下午2:52, Zhu Lingshan wrote: This commit removed unnecessary spin_locks in vhost_vring_call and related operations. Because we manipulate irq offloading contents in vhost_vdpa ioctl code path which is already protected by dev mutex and vq mutex. Signed-off-by: Zhu Lingshan Acked-by: Jason Wang --- drivers/vhost/vdpa.c | 8 +--- drivers/vhost/vhost.c | 3 --- drivers/vhost/vhost.h | 1 - 3 files changed, 1 insertion(+), 11 deletions(-) diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index 3fab94f88894..bc679d0b7b87 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -97,26 +97,20 @@ static void vhost_vdpa_setup_vq_irq(struct vhost_vdpa *v, u16 qid) return; irq = ops->get_vq_irq(vdpa, qid); - spin_lock(>call_ctx.ctx_lock); irq_bypass_unregister_producer(>call_ctx.producer); - if (!vq->call_ctx.ctx || irq < 0) { - spin_unlock(>call_ctx.ctx_lock); + if (!vq->call_ctx.ctx || irq < 0) return; - } vq->call_ctx.producer.token = vq->call_ctx.ctx; vq->call_ctx.producer.irq = irq; ret = irq_bypass_register_producer(>call_ctx.producer); - spin_unlock(>call_ctx.ctx_lock); } static void vhost_vdpa_unsetup_vq_irq(struct vhost_vdpa *v, u16 qid) { struct vhost_virtqueue *vq = >vqs[qid]; - spin_lock(>call_ctx.ctx_lock); irq_bypass_unregister_producer(>call_ctx.producer); - spin_unlock(>call_ctx.ctx_lock); } static void vhost_vdpa_reset(struct vhost_vdpa *v) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index b45519ca66a7..99f27ce982da 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -302,7 +302,6 @@ static void vhost_vring_call_reset(struct vhost_vring_call *call_ctx) { call_ctx->ctx = NULL; memset(_ctx->producer, 0x0, sizeof(struct irq_bypass_producer)); - spin_lock_init(_ctx->ctx_lock); } static void vhost_vq_reset(struct vhost_dev *dev, @@ -1637,9 +1636,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg break; } - spin_lock(>call_ctx.ctx_lock); swap(ctx, vq->call_ctx.ctx); - spin_unlock(>call_ctx.ctx_lock); break; case VHOST_SET_VRING_ERR: if (copy_from_user(, argp, sizeof f)) { diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 9032d3c2a9f4..486dcf371e06 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -64,7 +64,6 @@ enum vhost_uaddr_type { struct vhost_vring_call { struct eventfd_ctx *ctx; struct irq_bypass_producer producer; - spinlock_t ctx_lock; }; /* The virtqueue structure describes a queue attached to a device. */ ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] i2c: virtio: add a virtio i2c frontend driver
On 2020/9/3 下午1:34, Jie Deng wrote: --- a/drivers/i2c/busses/Makefile +++ b/drivers/i2c/busses/Makefile @@ -6,6 +6,9 @@ # ACPI drivers obj-$(CONFIG_I2C_SCMI)+= i2c-scmi.o +# VIRTIO I2C host controller driver +obj-$(CONFIG_I2C_VIRTIO) += i2c-virtio.o + # PC SMBus host controller drivers obj-$(CONFIG_I2C_ALI1535) += i2c-ali1535.o obj-$(CONFIG_I2C_ALI1563) += i2c-ali1563.o diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c new file mode 100644 index 000..47f9fd1 --- /dev/null +++ b/drivers/i2c/busses/i2c-virtio.c @@ -0,0 +1,276 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Virtio I2C Bus Driver + * + * Copyright (c) 2020 Intel Corporation. All rights reserved. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#define VIRTIO_I2C_MSG_OK 0 +#define VIRTIO_I2C_MSG_ERR 1 + +/** + * struct virtio_i2c_hdr - the virtio I2C message header structure + * @addr: i2c_msg addr, the slave address + * @flags: i2c_msg flags + * @len: i2c_msg len + */ +struct virtio_i2c_hdr { + __virtio16 addr; + __virtio16 flags; + __virtio16 len; +} __packed; Btw, this part should belong to uAPI, and you need to define the status in uAPI. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] i2c: virtio: add a virtio i2c frontend driver
On 2020/9/8 上午9:40, Jie Deng wrote: On 2020/9/7 13:40, Jason Wang wrote: +struct virtio_i2c_msg { + struct virtio_i2c_hdr hdr; + char *buf; + u8 status; Any reason for separating status out of virtio_i2c_hdr? The status is not from i2c_msg. You meant ic2_hdr? You embed status in virtio_i2c_msg anyway. The "i2c_msg" structure defined in i2c.h. So I put it out of virtio_i2c_hdr. Something like status or response is pretty common in virtio request (e.g net or scsi), if no special reason, it's better to keep it in the hdr. Mainly based on IN or OUT. The addr, flags and len are from "i2c_msg". They are put in one structure as an OUT**scatterlist. The buf can be an OUT**or an IN scatterlist depending on write or read. The status is a result from the backend which is defined as an IN scatterlis. Ok. I get this. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [RFC PATCH 00/22] Enhance VHOST to enable SoC-to-SoC communication
On 2020/9/9 上午12:37, Cornelia Huck wrote: Then you need something that is functional equivalent to virtio PCI which is actually the concept of vDPA (e.g vDPA provides alternatives if the queue_sel is hard in the EP implementation). It seems I really need to read up on vDPA more... do you have a pointer for diving into this alternatives aspect? See vpda_config_ops in include/linux/vdpa.h Especially this part: int (*set_vq_address)(struct vdpa_device *vdev, u16 idx, u64 desc_area, u64 driver_area, u64 device_area); This means for the devices (e.g endpoint device) that is hard to implement virtio-pci layout, it can use any other register layout or vendor specific way to configure the virtqueue. "Virtio Over NTB" should anyways be a new transport. Does that make any sense? yeah, in the approach I used the initial features are hard-coded in vhost-rpmsg (inherent to the rpmsg) but when we have to use adapter layer (vhost only for accessing virtio ring and use virtio drivers on both front end and backend), based on the functionality (e.g, rpmsg), the vhost should be configured with features (to be presented to the virtio) and that's why additional layer or APIs will be required. A question here, if we go with vhost bus approach, does it mean the virtio device can only be implemented in EP's userspace? Can we maybe implement an alternative bus as well that would allow us to support different virtio device implementations (in addition to the vhost bus + userspace combination)? That should be fine, but I'm not quite sure that implementing the device in kerne (kthread) is the good approach. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
On 09.09.20 09:16, Greg Kroah-Hartman wrote: > On Tue, Sep 08, 2020 at 10:10:07PM +0200, David Hildenbrand wrote: >> IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is >> always set to 0 by hardware. This is far from beautiful (and confusing), >> and the bit only applies to SYSRAM. So let's move it out of the >> bus-specific (PnP) defined bits. >> >> We'll add another SYSRAM specific bit soon. If we ever need more bits for >> other purposes, we can steal some from "desc", or reshuffle/regroup what we >> have. >> >> Cc: Andrew Morton >> Cc: Michal Hocko >> Cc: Dan Williams >> Cc: Jason Gunthorpe >> Cc: Kees Cook >> Cc: Ard Biesheuvel >> Cc: Pankaj Gupta >> Cc: Baoquan He >> Cc: Wei Yang >> Cc: Eric Biederman >> Cc: Thomas Gleixner >> Cc: Greg Kroah-Hartman >> Cc: ke...@lists.infradead.org >> Signed-off-by: David Hildenbrand >> --- >> include/linux/ioport.h | 4 +++- >> kernel/kexec_file.c| 2 +- >> mm/memory_hotplug.c| 4 ++-- >> 3 files changed, 6 insertions(+), 4 deletions(-) >> >> diff --git a/include/linux/ioport.h b/include/linux/ioport.h >> index 52a91f5fa1a36..d7620d7c941a0 100644 >> --- a/include/linux/ioport.h >> +++ b/include/linux/ioport.h >> @@ -58,6 +58,9 @@ struct resource { >> #define IORESOURCE_EXT_TYPE_BITS 0x0100 /* Resource extended types */ >> #define IORESOURCE_SYSRAM 0x0100 /* System RAM (modifier) */ >> >> +/* IORESOURCE_SYSRAM specific bits. */ >> +#define IORESOURCE_SYSRAM_DRIVER_MANAGED0x0200 /* Always detected >> via a driver. */ >> + > > Can't you use BIT() here? I could, but this will make it look different to all other IORESOURCE_* definitions? If so, we should change all existing definitions - however the ones spanning multiple bits might turn out rather ugly, like #define IORESOURCE_TYPE_BITS0x1f00 Thoughts? Thanks > > thanks, > > greg k-h > -- Thanks, David / dhildenb ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
On 09.09.20 09:17, Greg Kroah-Hartman wrote: > On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote: >> We soon want to pass flags, e.g., to mark added System RAM resources. >> mergeable. Prepare for that. > > What are these random "flags", and how do we know what should be passed > to them? > > Why not make this an enumerated type so that we know it all works > properly, like the GPF_* flags are? Passing around a random unsigned > long feels very odd/broken... Agreed, an enum (mhp_flags) seems to give a better hint what can actually be passed. Thanks! -- Thanks, David / dhildenb ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 3/7] mm/memory_hotplug: prepare passing flags to add_memory() and friends
On Tue, Sep 08, 2020 at 10:10:08PM +0200, David Hildenbrand wrote: > We soon want to pass flags, e.g., to mark added System RAM resources. > mergeable. Prepare for that. What are these random "flags", and how do we know what should be passed to them? Why not make this an enumerated type so that we know it all works properly, like the GPF_* flags are? Passing around a random unsigned long feels very odd/broken... thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED
On Tue, Sep 08, 2020 at 10:10:07PM +0200, David Hildenbrand wrote: > IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is > always set to 0 by hardware. This is far from beautiful (and confusing), > and the bit only applies to SYSRAM. So let's move it out of the > bus-specific (PnP) defined bits. > > We'll add another SYSRAM specific bit soon. If we ever need more bits for > other purposes, we can steal some from "desc", or reshuffle/regroup what we > have. > > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Dan Williams > Cc: Jason Gunthorpe > Cc: Kees Cook > Cc: Ard Biesheuvel > Cc: Pankaj Gupta > Cc: Baoquan He > Cc: Wei Yang > Cc: Eric Biederman > Cc: Thomas Gleixner > Cc: Greg Kroah-Hartman > Cc: ke...@lists.infradead.org > Signed-off-by: David Hildenbrand > --- > include/linux/ioport.h | 4 +++- > kernel/kexec_file.c| 2 +- > mm/memory_hotplug.c| 4 ++-- > 3 files changed, 6 insertions(+), 4 deletions(-) > > diff --git a/include/linux/ioport.h b/include/linux/ioport.h > index 52a91f5fa1a36..d7620d7c941a0 100644 > --- a/include/linux/ioport.h > +++ b/include/linux/ioport.h > @@ -58,6 +58,9 @@ struct resource { > #define IORESOURCE_EXT_TYPE_BITS 0x0100 /* Resource extended types */ > #define IORESOURCE_SYSRAM0x0100 /* System RAM (modifier) */ > > +/* IORESOURCE_SYSRAM specific bits. */ > +#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x0200 /* Always detected > via a driver. */ > + Can't you use BIT() here? thanks, greg k-h ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization