Re: [PATCH v2] virtio_balloon: export hugetlb page allocation counts
> On Mar 19, 2018, at 7:31 PM, Jason Wangwrote: > > > > On 2018年03月20日 06:14, Jonathan Helman wrote: >> Export the number of successful and failed hugetlb page >> allocations via the virtio balloon driver. These 2 counts >> come directly from the vm_events HTLB_BUDDY_PGALLOC and >> HTLB_BUDDY_PGALLOC_FAIL. >> >> Signed-off-by: Jonathan Helman > > Reviewed-by: Jason Wang Thanks. > >> --- >> drivers/virtio/virtio_balloon.c | 6 ++ >> include/uapi/linux/virtio_balloon.h | 4 +++- >> 2 files changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/virtio/virtio_balloon.c >> b/drivers/virtio/virtio_balloon.c >> index dfe5684..6b237e3 100644 >> --- a/drivers/virtio/virtio_balloon.c >> +++ b/drivers/virtio/virtio_balloon.c >> @@ -272,6 +272,12 @@ static unsigned int update_balloon_stats(struct >> virtio_balloon *vb) >> pages_to_bytes(events[PSWPOUT])); >> update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]); >> update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]); >> +#ifdef CONFIG_HUGETLB_PAGE >> +update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC, >> +events[HTLB_BUDDY_PGALLOC]); >> +update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGFAIL, >> +events[HTLB_BUDDY_PGALLOC_FAIL]); >> +#endif >> #endif >> update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE, >> pages_to_bytes(i.freeram)); >> diff --git a/include/uapi/linux/virtio_balloon.h >> b/include/uapi/linux/virtio_balloon.h >> index 4e8b830..40297a3 100644 >> --- a/include/uapi/linux/virtio_balloon.h >> +++ b/include/uapi/linux/virtio_balloon.h >> @@ -53,7 +53,9 @@ struct virtio_balloon_config { >> #define VIRTIO_BALLOON_S_MEMTOT 5 /* Total amount of memory */ >> #define VIRTIO_BALLOON_S_AVAIL6 /* Available memory as in /proc */ >> #define VIRTIO_BALLOON_S_CACHES 7 /* Disk caches */ >> -#define VIRTIO_BALLOON_S_NR 8 >> +#define VIRTIO_BALLOON_S_HTLB_PGALLOC 8 /* Hugetlb page allocations */ >> +#define VIRTIO_BALLOON_S_HTLB_PGFAIL 9 /* Hugetlb page allocation >> failures */ >> +#define VIRTIO_BALLOON_S_NR 10 >>/* >> * Memory statistics structure. > > Not for this patch, but it looks to me that exporting such nr through uapi is > fragile. Sorry, can you explain what you mean here? Jon > > Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2] virtio_balloon: export hugetlb page allocation counts
On 2018年03月20日 06:14, Jonathan Helman wrote: Export the number of successful and failed hugetlb page allocations via the virtio balloon driver. These 2 counts come directly from the vm_events HTLB_BUDDY_PGALLOC and HTLB_BUDDY_PGALLOC_FAIL. Signed-off-by: Jonathan HelmanReviewed-by: Jason Wang --- drivers/virtio/virtio_balloon.c | 6 ++ include/uapi/linux/virtio_balloon.h | 4 +++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index dfe5684..6b237e3 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -272,6 +272,12 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb) pages_to_bytes(events[PSWPOUT])); update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]); update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]); +#ifdef CONFIG_HUGETLB_PAGE + update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC, + events[HTLB_BUDDY_PGALLOC]); + update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGFAIL, + events[HTLB_BUDDY_PGALLOC_FAIL]); +#endif #endif update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE, pages_to_bytes(i.freeram)); diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h index 4e8b830..40297a3 100644 --- a/include/uapi/linux/virtio_balloon.h +++ b/include/uapi/linux/virtio_balloon.h @@ -53,7 +53,9 @@ struct virtio_balloon_config { #define VIRTIO_BALLOON_S_MEMTOT 5 /* Total amount of memory */ #define VIRTIO_BALLOON_S_AVAIL6 /* Available memory as in /proc */ #define VIRTIO_BALLOON_S_CACHES 7 /* Disk caches */ -#define VIRTIO_BALLOON_S_NR 8 +#define VIRTIO_BALLOON_S_HTLB_PGALLOC 8 /* Hugetlb page allocations */ +#define VIRTIO_BALLOON_S_HTLB_PGFAIL 9 /* Hugetlb page allocation failures */ +#define VIRTIO_BALLOON_S_NR 10 /* * Memory statistics structure. Not for this patch, but it looks to me that exporting such nr through uapi is fragile. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] ptr_ring: fix build
On 2018年03月20日 08:41, Michael S. Tsirkin wrote: Fixes after recent use of kvmalloc Signed-off-by: Michael S. Tsirkin--- tools/virtio/ringtest/ptr_ring.c | 5 + 1 file changed, 5 insertions(+) diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c index 477899c..2d566fb 100644 --- a/tools/virtio/ringtest/ptr_ring.c +++ b/tools/virtio/ringtest/ptr_ring.c @@ -17,6 +17,8 @@ #define likely(x)(__builtin_expect(!!(x), 1)) #define ALIGN(x, a) (((x) + (a) - 1) / (a) * (a)) #define SIZE_MAX(~(size_t)0) +#define KMALLOC_MAX_SIZE SIZE_MAX +#define BUG_ON(x) assert(x) typedef pthread_spinlock_t spinlock_t; @@ -57,6 +59,9 @@ static void kfree(void *p) free(p); } +#define kvmalloc_array kmalloc_array +#define kvfree kfree + static void spin_lock_init(spinlock_t *lock) { int r = pthread_spin_init(lock, 0); Acked-by: Jason Wang ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH] ptr_ring: fix build
Fixes after recent use of kvmalloc Signed-off-by: Michael S. Tsirkin--- tools/virtio/ringtest/ptr_ring.c | 5 + 1 file changed, 5 insertions(+) diff --git a/tools/virtio/ringtest/ptr_ring.c b/tools/virtio/ringtest/ptr_ring.c index 477899c..2d566fb 100644 --- a/tools/virtio/ringtest/ptr_ring.c +++ b/tools/virtio/ringtest/ptr_ring.c @@ -17,6 +17,8 @@ #define likely(x)(__builtin_expect(!!(x), 1)) #define ALIGN(x, a) (((x) + (a) - 1) / (a) * (a)) #define SIZE_MAX(~(size_t)0) +#define KMALLOC_MAX_SIZE SIZE_MAX +#define BUG_ON(x) assert(x) typedef pthread_spinlock_t spinlock_t; @@ -57,6 +59,9 @@ static void kfree(void *p) free(p); } +#define kvmalloc_array kmalloc_array +#define kvfree kfree + static void spin_lock_init(spinlock_t *lock) { int r = pthread_spin_init(lock, 0); -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH v2] virtio_balloon: export hugetlb page allocation counts
Export the number of successful and failed hugetlb page allocations via the virtio balloon driver. These 2 counts come directly from the vm_events HTLB_BUDDY_PGALLOC and HTLB_BUDDY_PGALLOC_FAIL. Signed-off-by: Jonathan Helman--- drivers/virtio/virtio_balloon.c | 6 ++ include/uapi/linux/virtio_balloon.h | 4 +++- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c index dfe5684..6b237e3 100644 --- a/drivers/virtio/virtio_balloon.c +++ b/drivers/virtio/virtio_balloon.c @@ -272,6 +272,12 @@ static unsigned int update_balloon_stats(struct virtio_balloon *vb) pages_to_bytes(events[PSWPOUT])); update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]); update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]); +#ifdef CONFIG_HUGETLB_PAGE + update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC, + events[HTLB_BUDDY_PGALLOC]); + update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGFAIL, + events[HTLB_BUDDY_PGALLOC_FAIL]); +#endif #endif update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE, pages_to_bytes(i.freeram)); diff --git a/include/uapi/linux/virtio_balloon.h b/include/uapi/linux/virtio_balloon.h index 4e8b830..40297a3 100644 --- a/include/uapi/linux/virtio_balloon.h +++ b/include/uapi/linux/virtio_balloon.h @@ -53,7 +53,9 @@ struct virtio_balloon_config { #define VIRTIO_BALLOON_S_MEMTOT 5 /* Total amount of memory */ #define VIRTIO_BALLOON_S_AVAIL6 /* Available memory as in /proc */ #define VIRTIO_BALLOON_S_CACHES 7 /* Disk caches */ -#define VIRTIO_BALLOON_S_NR 8 +#define VIRTIO_BALLOON_S_HTLB_PGALLOC 8 /* Hugetlb page allocations */ +#define VIRTIO_BALLOON_S_HTLB_PGFAIL 9 /* Hugetlb page allocation failures */ +#define VIRTIO_BALLOON_S_NR 10 /* * Memory statistics structure. -- 1.8.3.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v5 0/7] jailhouse: Enhance secondary Jailhouse guest support /wrt PCI
Bjorn, On Mon, 19 Mar 2018, Bjorn Helgaas wrote: > On Wed, Mar 07, 2018 at 08:39:11AM +0100, Jan Kiszka wrote: > > Basic x86 support [1] for running Linux as secondary Jailhouse [2] guest > > is currently pending in the tip tree. This builds on top and enhances > > the PCI support for x86 and also ARM guests (ARM[64] does not require > > platform patches and works already). > > ... > > Hi Jan, > > What tree do you plan to merge this through? I was sort of assuming > x86, since the main point of jailhouse is not PCI, but I do notice > most of the changes here are PCI-related, so thought I should make > sure you're not waiting on me. I picked it up already. It's sitting in tip x86/platform Thanks, tglx ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: get_user_pages returning 0 (was Re: kernel BUG at drivers/vhost/vhost.c:LINE!)
On Mon, Mar 19, 2018 at 4:29 PM, David Sterbawrote: > On Mon, Mar 19, 2018 at 05:09:28PM +0200, Michael S. Tsirkin wrote: >> Hello! >> The following code triggered by syzbot >> >> r = get_user_pages_fast(log, 1, 1, ); >> if (r < 0) >> return r; >> BUG_ON(r != 1); >> >> Just looking at get_user_pages_fast's documentation this seems >> impossible - it is supposed to only ever return # of pages >> pinned or errno. >> >> However, poking at code, I see at least one path that might cause this: >> >> ret = faultin_page(tsk, vma, start, _flags, >> nonblocking); >> switch (ret) { >> case 0: >> goto retry; >> case -EFAULT: >> case -ENOMEM: >> case -EHWPOISON: >> return i ? i : ret; >> case -EBUSY: >> return i; >> >> which originally comes from: >> >> commit 53a7706d5ed8f1a53ba062b318773160cc476dde >> Author: Michel Lespinasse >> Date: Thu Jan 13 15:46:14 2011 -0800 >> >> mlock: do not hold mmap_sem for extended periods of time >> >> __get_user_pages gets a new 'nonblocking' parameter to signal that the >> caller is prepared to re-acquire mmap_sem and retry the operation if >> needed. This is used to split off long operations if they are going to >> block on a disk transfer, or when we detect contention on the mmap_sem. >> >> [a...@linux-foundation.org: remove ref to rwsem_is_contended()] >> Signed-off-by: Michel Lespinasse >> Cc: Hugh Dickins >> Cc: Rik van Riel >> Cc: Peter Zijlstra >> Cc: Nick Piggin >> Cc: KOSAKI Motohiro >> Cc: Ingo Molnar >> Cc: "H. Peter Anvin" >> Cc: Thomas Gleixner >> Cc: David Howells >> Signed-off-by: Andrew Morton >> Signed-off-by: Linus Torvalds >> >> I started looking into this, if anyone has any feedback meanwhile, >> that would be appreciated. >> >> In particular I don't really see why would this trigger >> on commit 8f5fd927c3a7576d57248a2d7a0861c3f2795973: >> >> Merge: 8757ae2 093e037 >> Author: Linus Torvalds >> Date: Fri Mar 16 13:37:42 2018 -0700 >> >> Merge tag 'for-4.16-rc5-tag' of >> git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux >> >> is btrfs used on these systems? > > There were 3 patches pulled by that tag, none of them is even remotely > related to the reported bug, AFAICS. If there's some impact, it must be > indirect, obvious bugs like NULL pointer would exhibit in a different > way and leave at least some trace in the stacks. That is just a commit on which the bug was hit. It's provided so that developers can make sense out of line numbers and check if the tree includes/not includes a particular commit, etc. It's not that that commit introduced the bug. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
get_user_pages returning 0 (was Re: kernel BUG at drivers/vhost/vhost.c:LINE!)
Hello! The following code triggered by syzbot r = get_user_pages_fast(log, 1, 1, ); if (r < 0) return r; BUG_ON(r != 1); Just looking at get_user_pages_fast's documentation this seems impossible - it is supposed to only ever return # of pages pinned or errno. However, poking at code, I see at least one path that might cause this: ret = faultin_page(tsk, vma, start, _flags, nonblocking); switch (ret) { case 0: goto retry; case -EFAULT: case -ENOMEM: case -EHWPOISON: return i ? i : ret; case -EBUSY: return i; which originally comes from: commit 53a7706d5ed8f1a53ba062b318773160cc476dde Author: Michel LespinasseDate: Thu Jan 13 15:46:14 2011 -0800 mlock: do not hold mmap_sem for extended periods of time __get_user_pages gets a new 'nonblocking' parameter to signal that the caller is prepared to re-acquire mmap_sem and retry the operation if needed. This is used to split off long operations if they are going to block on a disk transfer, or when we detect contention on the mmap_sem. [a...@linux-foundation.org: remove ref to rwsem_is_contended()] Signed-off-by: Michel Lespinasse Cc: Hugh Dickins Cc: Rik van Riel Cc: Peter Zijlstra Cc: Nick Piggin Cc: KOSAKI Motohiro Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Thomas Gleixner Cc: David Howells Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds I started looking into this, if anyone has any feedback meanwhile, that would be appreciated. In particular I don't really see why would this trigger on commit 8f5fd927c3a7576d57248a2d7a0861c3f2795973: Merge: 8757ae2 093e037 Author: Linus Torvalds Date: Fri Mar 16 13:37:42 2018 -0700 Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux is btrfs used on these systems? syzbot output below: Hello, syzbot hit the following crash on upstream commit 8f5fd927c3a7576d57248a2d7a0861c3f2795973 (Fri Mar 16 20:37:42 2018 +) Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux So far this crash happened 2 times on upstream. C reproducer is attached. syzkaller reproducer is attached. Raw console output is attached. compiler: gcc (GCC) 7.1.1 20170620 .config is attached. IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+6304bf97ef436580f...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. audit: type=1400 audit(1521377060.016:6): avc: denied { map } for pid=4210 comm="bash" path="/bin/bash" dev="sda1" ino=1457 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tcontext=system_u:object_r:file_t:s0 tclass=file permissive=1 audit: type=1400 audit(1521377077.866:7): avc: denied { map } for pid=4228 comm="syzkaller050160" path="/root/syzkaller050160487" dev="sda1" ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1 [ cut here ] kernel BUG at drivers/vhost/vhost.c:1655! invalid opcode: [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 4228 Comm: syzkaller050160 Not tainted 4.16.0-rc5+ #357 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:set_bit_to_user drivers/vhost/vhost.c:1655 [inline] RIP: 0010:log_write+0x3ca/0x490 drivers/vhost/vhost.c:1679 RSP: 0018:8801b0fa77b0 EFLAGS: 00010293 RAX: 8801af534240 RBX: dc00 RCX: 8443f50a RDX: RSI: 0001 RDI: 8801af535618 RBP: 8801b0fa78f0 R08: 0040 R09: 0001 R10: 8801b0fa76d0 R11: 0002 R12: 0001 R13: ed00361f4f09 R14: 8801b0fa78c8 R15: 8801b0fa7848 FS: 007df880() GS:8801db30() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 20d7c000 CR3: 0001d3e5b005 CR4: 001606e0 DR0: DR1: DR2: DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: vhost_update_used_flags+0x379/0x480
RE: [virtio-dev] [RFC] virtio-iommu version 0.6
> From: Jean-Philippe Brucker > Sent: Wednesday, February 7, 2018 2:11 AM > > Please find version 0.6 of the virtio-iommu specification at the following > locations. > > Document: http://jpbrucker.net/virtio-iommu/spec/virtio-iommu.pdf > http://jpbrucker.net/virtio-iommu/spec/virtio-iommu.html > > Sources: git://linux-arm.org/virtio-iommu.git viommu/v0.6 > > I didn't receive any comment for v0.5 [1], so this update is fairly light: > * Changed range definition in map and unmap requests > * Use virtio-iommu device ID > * Update IORT node ID and simplify layout > > The following document shows the difference between v0.5 and v0.6: > http://jpbrucker.net/virtio-iommu/spec/diffs/virtio-iommu-pdf-diff-v0.5- > v0.6.pdf > > Next version will hopefully add vSVA (PASID+PRI) and/or architectural > optimizations, but I can't provide a timeline yet. I'll probably send a > small draft first. > > I will send the Linux driver soon. In the meantime you can fetch it on my > development branches, based on next-20180206. > > git://linux-arm.org/linux-jpb.git virtio-iommu/devel > git://linux-arm.org/kvmtool-jpb.git virtio-iommu/devel > > Any comments welcome! > > Thanks, > Jean > > [1] https://www.spinics.net/lists/kvm/msg157402.html some comments here: -- BYPASS feature bit is not covered in "2.3.1/2.3.2/2.3.3"". Is it intended? --2.6.2 Device Requirements: Device operations-- "If the VIRTIO_IOMMU_F_INPUT_RANGE feature is offered, the device MUST truncate the range described by virt_start and virt_end in requests to ft in the range described by input_range." "If the VIRTIO_IOMMU_F_DOMAIN_BITS is offered, the device MUST ignore bits above domain_bits in field domain of requests." shouldn't device return error upon above situation instead of continuing operation with modified value? --2.6.4 DETACH request-- " Detach an endpoint from its domain. When this request completes, the endpoint cannot access any mapping from that domain anymore " Does it make sense to mention BYPASS situation upon this request? --2.6.8.2 Property RESV_MEM -- "VIRTIO_IOMMU_RESV_MEM_T_RESERVED (0) Accesses to virtual addresses in this region are not translated by the device. They may either be aborted by the device (or the underlying IOMMU), bypass it, or never even reach it" in 3.3 hardware device assignment you described an example that a reserved range is stolen by host to map the MSI doorbell... such map behavior is not described above. Then comes a question for VIRTIO_IOMMU_RESV_MEM_T_MSI. I know there were quite some discussion around this flag before, but my mental picture now still is a bit difficult to understand its usage based on examples in implementation notes: - for x86, you describe it as indicating MSI bypass for oxfeex. However guest doesn't need to know this fact. Only requirement is to treat it as reserved range (as on bare metal) then T_RESERVED is sufficient for this purpose - for ARM, either let guest or let host to choose a virtual address for mapping to MSI doorbell. the former doesn't require a reserved range. for the latter also T_RESERVED is enough as the example in hardware device assignment section. Then what's the real point of differentiating MSI bypass from normal reserved range? Is there other example where guest may use such info to do special thing? -- 3.2 Message Signaled Interrupts -- " Section 3.2.2 describes the added complexity (from the host point of view) of translating the IRQ chip doorbell " however later 3.2.2 only says one sentence about host complexity - " However, this mode of operations may add significant complexity in the host implementation". If you plan to elaborate that part in the future, please add a note. :-) Thanks Kevin ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization