Re: Linux 4.9.256

2021-02-09 Thread Avi Kivity
On 2/8/21 8:57 PM, Sasha Levin wrote: On Mon, Feb 08, 2021 at 05:50:21PM +0200, Avi Kivity wrote: On 05/02/2021 16.26, Greg Kroah-Hartman wrote: I'm announcing the release of the 4.9.256 kernel. This, and the 4.4.256 release are a little bit "different" than normal. This conta

Re: Linux 4.9.256

2021-02-08 Thread Avi Kivity
On 05/02/2021 16.26, Greg Kroah-Hartman wrote: I'm announcing the release of the 4.9.256 kernel. This, and the 4.4.256 release are a little bit "different" than normal. This contains only 1 patch, just the version bump from .255 to .256 which ends up causing the userspace-visible
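
The snippet above is cut off before it names what becomes userspace-visible. Assuming it refers to LINUX_VERSION_CODE, which packs a.b.c into one integer and leaves only 8 bits for the last field, the following is a minimal sketch of why a .256 sublevel is awkward. The helper name kernel_version is illustrative; it mirrors the classic KERNEL_VERSION(a, b, c) macro rather than any later clamped variant.

```python
def kernel_version(a, b, c):
    """Pack a.b.c the way the classic KERNEL_VERSION(a, b, c) macro does."""
    return (a << 16) + (b << 8) + c


# A sublevel of 255 still fits in the low 8 bits.
last_fitting = kernel_version(4, 9, 255)

# A sublevel of 256 carries into the patchlevel field, so 4.9.256
# encodes to the same integer as 4.10.0.
overflowed = kernel_version(4, 9, 256)

print(hex(last_fitting))                        # 0x409ff
print(hex(overflowed))                          # 0x40a00
print(overflowed == kernel_version(4, 10, 0))   # True
```

Userspace code that compares the packed value against KERNEL_VERSION(4, 10, 0) would therefore treat 4.9.256 as 4.10.0 under this encoding.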

Re: Spurious EIO on AIO+DIO+RWF_NOWAIT

2018-12-12 Thread Avi Kivity
On 12/10/18 2:48 PM, Goldwyn Rodrigues wrote: On 13:19 09/12, Avi Kivity wrote: I have an application that receives spurious EIO when running with RWF_NOWAIT enabled. Removing RWF_NOWAIT causes those EIOs to disappear. The application uses AIO+DIO, and errors were seen on both xfs and ext4. I

Re: Spurious EIO on AIO+DIO+RWF_NOWAIT

2018-12-10 Thread Avi Kivity
On 10/12/2018 14.48, Goldwyn Rodrigues wrote: On 13:19 09/12, Avi Kivity wrote: I have an application that receives spurious EIO when running with RWF_NOWAIT enabled. Removing RWF_NOWAIT causes those EIOs to disappear. The application uses AIO+DIO, and errors were seen on both xfs and ext4

Spurious EIO on AIO+DIO+RWF_NOWAIT

2018-12-09 Thread Avi Kivity
I have an application that receives spurious EIO when running with RWF_NOWAIT enabled. Removing RWF_NOWAIT causes those EIOs to disappear. The application uses AIO+DIO, and errors were seen on both xfs and ext4. I suspect the following code: /*  * Process one completed BIO.  No locks are

[PATCH v1] Revert "eventfd: only return events requested in poll_mask()"

2018-06-17 Thread Avi Kivity
This reverts commit 4d572d9f46507be8cfe326aa5bc3698babcbdfa7. It is superseded by the more general 2739b807b0885a09996659be82f813af219c7360 ("aio: only return events requested in poll_mask() for IOCB_CMD_POLL"). Unfortunately, hch nacked it on the bug report rather than on the patch itself, so it

[PATCH v1] eventfd: only return events requested in poll_mask()

2018-06-08 Thread Avi Kivity
tfd is almost always ready for a write. Signed-off-by: Avi Kivity --- fs/eventfd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/eventfd.c b/fs/eventfd.c index 61c9514da5e9..ceb1031f1cac 100644 --- a/fs/eventfd.c +++ b/fs/eventfd.c @@ -154,15 +154,15 @@ static __pol

[PATCH v1] aio: mark __aio_sigset::sigmask const

2018-06-08 Thread Avi Kivity
io_pgetevents() will not change the signal mask. Mark it const to make it clear and to reduce the need for casts in user code. Signed-off-by: Avi Kivity --- include/uapi/linux/aio_abi.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/aio_abi.h b/include

Re: aio poll, io_pgetevents and a new in-kernel poll API V3

2018-01-18 Thread Avi Kivity
On 01/18/2018 07:51 PM, Avi Kivity wrote: On 01/18/2018 05:46 PM, Jeff Moyer wrote: FYI, this kernel has issues.  It will boot up, but I don't have networking, and even rebooting doesn't succeed.  I'm looking into it. FWIW, I'm running an older version of this patchset on my desktop

Re: aio poll, io_pgetevents and a new in-kernel poll API V3

2018-01-18 Thread Avi Kivity
On 01/18/2018 05:46 PM, Jeff Moyer wrote: FYI, this kernel has issues. It will boot up, but I don't have networking, and even rebooting doesn't succeed. I'm looking into it. FWIW, I'm running an older version of this patchset on my desktop with no problems so far. -Jeff Christoph

Re: Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs

2018-01-07 Thread Avi Kivity
On 01/07/2018 04:36 PM, Alan Cox wrote: I'm interested in participating to working on such a solution, given that haproxy is severely impacted by "pti=on" and that for now we'll have to run with "pti=off" on the whole system until a more suitable solution is found. I'm still trying to work

Re: Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs

2018-01-07 Thread Avi Kivity
On 01/07/2018 02:29 PM, Theodore Ts'o wrote: On Sun, Jan 07, 2018 at 11:16:28AM +0200, Avi Kivity wrote: I think capabilities will work just as well with cgroups. The container manager will set CAP_PAYLOAD to payload containers; and if those run an init system or a container manager

Re: Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs

2018-01-07 Thread Avi Kivity
On 01/06/2018 10:02 PM, Alan Cox wrote: I propose to create a new capability, CAP_PAYLOAD, that allows the system administrator to designate an application as the main workload in that system. Other processes (like sshd or monitoring daemons) exist to support it, and so it makes sense to protect

Re: Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs

2018-01-07 Thread Avi Kivity
On 01/06/2018 10:24 PM, Willy Tarreau wrote: Hi Avi, On Sat, Jan 06, 2018 at 09:33:28PM +0200, Avi Kivity wrote: Meltdown and Spectre mitigations focus on protecting the kernel from a hostile userspace. However, it's not a given that the kernel is the most important target in the system

Proposal: CAP_PAYLOAD to reduce Meltdown and Spectre mitigation costs

2018-01-06 Thread Avi Kivity
Meltdown and Spectre mitigations focus on protecting the kernel from a hostile userspace. However, it's not a given that the kernel is the most important target in the system. It is common in server workloads that a single userspace application contains the valuable data on a system, and if it

Re: Detecting RWF_NOWAIT support

2017-12-17 Thread Avi Kivity
On 12/18/2017 05:28 AM, Goldwyn Rodrigues wrote: On 12/16/2017 08:49 AM, Avi Kivity wrote: On 12/14/2017 09:15 PM, Goldwyn Rodrigues wrote: On 12/14/2017 11:38 AM, Avi Kivity wrote: I'm looking to add support for RWF_NOWAIT within a linux-aio iocb. Naturally, I need to detect at runtime

Re: Detecting RWF_NOWAIT support

2017-12-16 Thread Avi Kivity
On 12/16/2017 08:12 PM, vcap...@pengaru.com wrote: On Sat, Dec 16, 2017 at 10:03:38AM -0800, vcap...@pengaru.com wrote: On Sat, Dec 16, 2017 at 04:49:08PM +0200, Avi Kivity wrote: On 12/14/2017 09:15 PM, Goldwyn Rodrigues wrote: On 12/14/2017 11:38 AM, Avi Kivity wrote: I'm looking to add

Re: Detecting RWF_NOWAIT support

2017-12-16 Thread Avi Kivity
On 12/14/2017 09:15 PM, Goldwyn Rodrigues wrote: On 12/14/2017 11:38 AM, Avi Kivity wrote: I'm looking to add support for RWF_NOWAIT within a linux-aio iocb. Naturally, I need to detect at runtime whether the kernel supports RWF_NOWAIT or not. The only method I could find was to issue an I

Detecting RWF_NOWAIT support

2017-12-14 Thread Avi Kivity
I'm looking to add support for RWF_NOWAIT within a linux-aio iocb. Naturally, I need to detect at runtime whether the kernel supports RWF_NOWAIT or not. The only method I could find was to issue an I/O with RWF_NOWAIT set, and look for errors. This is somewhat less than perfect:  - from the
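
The probe-by-trying approach described in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's code: it issues a preadv2()-style read through Python's os.preadv instead of a linux-aio iocb, the function name probe_rwf_nowait is hypothetical, and the set of errno values treated as "unsupported" is a guess at what kernels and filesystems return.

```python
import errno
import os
import tempfile


def probe_rwf_nowait(directory=None):
    """Return True if a read with RWF_NOWAIT is accepted on a scratch file."""
    flag = getattr(os, "RWF_NOWAIT", None)
    if flag is None or not hasattr(os, "preadv"):
        return False  # this Python build does not expose the flag at all
    with tempfile.TemporaryFile(dir=directory) as f:
        f.write(b"x" * 4096)
        f.flush()
        buf = bytearray(4096)
        try:
            os.preadv(f.fileno(), [buf], 0, flag)
            return True
        except OSError as e:
            if e.errno == errno.EAGAIN:
                return True   # flag understood; the data was just not ready
            if e.errno in (errno.EOPNOTSUPP, errno.EINVAL, errno.ENOSYS):
                return False  # kernel or filesystem rejected the flag
            raise             # anything else is inconclusive


print(probe_rwf_nowait())
```

The result depends on the filesystem backing the scratch file as well as on the kernel, so a probe in one directory says little about another, which may be part of why the message calls the method less than perfect.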

Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

2017-11-14 Thread Avi Kivity
On 11/14/2017 06:49 PM, Mathieu Desnoyers wrote: - On Nov 14, 2017, at 11:08 AM, Peter Zijlstra pet...@infradead.org wrote: On Tue, Nov 14, 2017 at 05:05:41PM +0100, Peter Zijlstra wrote: On Tue, Nov 14, 2017 at 03:17:12PM +, Mathieu Desnoyers wrote: I've tried to create a small

Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

2017-11-14 Thread Avi Kivity
On 11/14/2017 05:17 PM, Mathieu Desnoyers wrote: - On Nov 14, 2017, at 9:53 AM, Avi Kivity a...@scylladb.com wrote: On 11/13/2017 06:56 PM, Mathieu Desnoyers wrote: - On Nov 10, 2017, at 4:57 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: - On Nov 10, 2017, at 4

Re: [RFC PATCH 0/2] x86: Fix missing core serialization on migration

2017-11-14 Thread Avi Kivity
On 11/13/2017 06:56 PM, Mathieu Desnoyers wrote: - On Nov 10, 2017, at 4:57 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: - On Nov 10, 2017, at 4:36 PM, Linus Torvalds torva...@linux-foundation.org wrote: On Fri, Nov 10, 2017 at 1:12 PM, Mathieu Desnoyers wrote:

Re: [PATCH tip/core/rcu 1/3] membarrier: Provide register expedited private command

2017-10-05 Thread Avi Kivity
On 10/05/2017 07:23 AM, Nicholas Piggin wrote: On Wed, 4 Oct 2017 14:37:53 -0700 "Paul E. McKenney" wrote: From: Mathieu Desnoyers Provide a new command allowing processes to register their intent to use the private expedited command. This allows PowerPC to skip the full memory barrier

Re: [RFC PATCH v2] membarrier: expedited private command

2017-08-01 Thread Avi Kivity
On 08/01/2017 01:22 PM, Peter Zijlstra wrote: If mm cpumask is used, I think it's okay. You can cause quite similar kind of iteration over CPUs and lots of IPIs, tlb flushes, etc using munmap/mprotect/etc, or context switch IPIs, etc. Are we reaching the stage where we're controlling those

Re: Udpated sys_membarrier() speedup patch, FYI

2017-07-31 Thread Avi Kivity
On 07/31/2017 11:37 AM, Peter Zijlstra wrote: On Mon, Jul 31, 2017 at 09:03:09AM +0300, Avi Kivity wrote: I remembered that current->mm does not change when switching to a kernel task, but my Kernlish is very rusty, or maybe it has changed. kernel threads do indeed preserve the mm of the

Re: Udpated sys_membarrier() speedup patch, FYI

2017-07-31 Thread Avi Kivity
On 07/28/2017 12:02 AM, Mathieu Desnoyers wrote: - On Jul 27, 2017, at 4:58 PM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: - On Jul 27, 2017, at 4:37 PM, Paul E. McKenney paul...@linux.vnet.ibm.com wrote: On Thu, Jul 27, 2017 at 11:04:13PM +0300, Avi Kivity wrote

Re: Udpated sys_membarrier() speedup patch, FYI

2017-07-27 Thread Avi Kivity
On 07/27/2017 10:43 PM, Paul E. McKenney wrote: On Thu, Jul 27, 2017 at 10:20:14PM +0300, Avi Kivity wrote: On 07/27/2017 09:12 PM, Paul E. McKenney wrote: Hello! Please see below for a prototype sys_membarrier() speedup patch. Please note that there is some controversy on this subject, so

Re: Udpated sys_membarrier() speedup patch, FYI

2017-07-27 Thread Avi Kivity
nted out by Boqun Feng. ] Tested-by: Avi Kivity <a...@scylladb.com> Cc: Maged Michael <maged.mich...@gmail.com> Cc: Andrew Hunter <a...@google.com> Cc: Geoffrey Romer <gro...@google.com> diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membar

Re: Udpated sys_membarrier() speedup patch, FYI

2017-07-27 Thread Avi Kivity
ier() call happens within the same jiffy, all but the first will use synchronize_sched() instead of synchronize_sched_expedited(). Signed-off-by: Paul E. McKenney [ paulmck: Fix code style issue pointed out by Boqun Feng. ] Tested-by: Avi Kivity Cc: Maged Michae

Re: MAP_POPULATE vs. MADV_HUGEPAGES

2017-03-16 Thread Avi Kivity
On 03/16/2017 04:48 PM, Michal Hocko wrote: On Thu 16-03-17 15:26:54, Avi Kivity wrote: On 03/16/2017 02:34 PM, Michal Hocko wrote: On Wed 15-03-17 18:50:32, Avi Kivity wrote: A user is trying to allocate 1TB of anonymous memory in parallel on 48 cores (4 NUMA nodes). The kernel ends up

Re: MAP_POPULATE vs. MADV_HUGEPAGES

2017-03-16 Thread Avi Kivity
On 03/16/2017 02:34 PM, Michal Hocko wrote: On Wed 15-03-17 18:50:32, Avi Kivity wrote: A user is trying to allocate 1TB of anonymous memory in parallel on 48 cores (4 NUMA nodes). The kernel ends up spinning in isolate_freepages_block(). Which kernel version is that? A good question

MAP_POPULATE vs. MADV_HUGEPAGES

2017-03-15 Thread Avi Kivity
A user is trying to allocate 1TB of anonymous memory in parallel on 48 cores (4 NUMA nodes). The kernel ends up spinning in isolate_freepages_block(). I thought to help it along by using MAP_POPULATE, but then my MADV_HUGEPAGE won't be seen until after mmap() completes, with pages already

Re: [PATCH] vfio: Include No-IOMMU mode

2015-11-16 Thread Avi Kivity
On 11/16/2015 07:06 PM, Alex Williamson wrote: On Wed, 2015-10-28 at 15:21 -0600, Alex Williamson wrote: There is really no way to safely give a user full access to a DMA capable device without an IOMMU to protect the host system. There is also no way to provide DMA translation, for use cases

Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-12 Thread Avi Kivity
On 10/12/2015 07:23 PM, Alex Williamson wrote: Also, although you think the long option will set the bar high enough it probably will not satisfy anyone. It is annoying enough that I would just carry a patch to remove the silly requirement. And the people who believe all user mode DMA

Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-11 Thread Avi Kivity
On 10/11/2015 11:57 AM, Michael S. Tsirkin wrote: On Sun, Oct 11, 2015 at 11:12:14AM +0300, Avi Kivity wrote: Mixing no-iommu and secure VFIO is also unsupported, as are any VFIO IOMMU backends other than the vfio-noiommu backend. Furthermore, unsafe group files are relocated to /dev/vfio

Re: [RFC PATCH 2/2] vfio: Include no-iommu mode

2015-10-11 Thread Avi Kivity
On 10/09/2015 09:41 PM, Alex Williamson wrote: There is really no way to safely give a user full access to a PCI without an IOMMU to protect the host from errant DMA. There is also no way to provide DMA translation, for use cases such as devices assignment to virtual machines. However, there

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-08 Thread Avi Kivity
On 10/08/2015 01:26 PM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 12:19:20PM +0300, Avi Kivity wrote: On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote: On 08/10/15 00:05, Michael S. Tsirkin wrote: On Wed, Oct 07, 2015 at 07

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-08 Thread Avi Kivity
On 10/08/2015 12:16 PM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 11:46:30AM +0300, Avi Kivity wrote: On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote: It is good practice to defend against root oopsing the kernel

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-08 Thread Avi Kivity
On 10/08/2015 11:32 AM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote: On 08/10/15 00:05, Michael S. Tsirkin wrote: On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote: That's what I thought as well, but apparently adding msix support

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-08 Thread Avi Kivity
On 10/08/2015 10:32 AM, Michael S. Tsirkin wrote: On Thu, Oct 08, 2015 at 08:33:45AM +0300, Avi Kivity wrote: It is good practice to defend against root oopsing the kernel, but in some cases it cannot be achieved. Absolutely. That's one of the issues with these patches. They don't even try

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-07 Thread Avi Kivity
On 08/10/15 00:05, Michael S. Tsirkin wrote: On Wed, Oct 07, 2015 at 07:39:16PM +0300, Avi Kivity wrote: That's what I thought as well, but apparently adding msix support to the already insecure uio drivers is even worse. I'm glad you finally agree what these drivers are doing is insecure

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-07 Thread Avi Kivity
On 10/07/2015 07:31 PM, Alex Williamson wrote: I guess the no-iommu map would error if the IOVA isn't simply the bus address of the page mapped. Of course this is entirely unsafe and this no-iommu driver should taint the kernel, but it at least standardizes on one userspace API and you're

Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver

2015-10-07 Thread Avi Kivity
On 10/07/2015 01:25 PM, Michael S. Tsirkin wrote: On Tue, Oct 06, 2015 at 07:09:11PM +0300, Avi Kivity wrote: On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote: While it is possible that userspace malfunctions and accidentally programs MSI incorrectly, the risk is dwarfed by the ability

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-07 Thread Avi Kivity
On 10/06/2015 09:51 PM, Alex Williamson wrote: On Tue, 2015-10-06 at 18:23 +0300, Avi Kivity wrote: On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote: On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote: The only "like VFIO" behavior we implement here is binding

Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver

2015-10-06 Thread Avi Kivity
On 10/06/2015 06:15 PM, Michael S. Tsirkin wrote: While it is possible that userspace malfunctions and accidentally programs MSI incorrectly, the risk is dwarfed by the ability of userspace to program DMA incorrectly. That seems to imply that for the upstream kernel this is not a valid

Re: [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X

2015-10-06 Thread Avi Kivity
On 10/06/2015 05:07 PM, Michael S. Tsirkin wrote: On Tue, Oct 06, 2015 at 03:15:57PM +0300, Avi Kivity wrote: btw, (2) doesn't really add any insecurity. The user could already poke at the msix tables (as well as perform DMA); they just couldn't get a useful interrupt out of them. Poking

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-06 Thread Avi Kivity
On 10/06/2015 05:46 PM, Michael S. Tsirkin wrote: On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote: Eventfd is a natural enough representation of an interrupt; both kvm and vfio use it, and are also able to share the eventfd, allowing a vfio interrupt to generate a kvm interrupt

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-06 Thread Avi Kivity
On 10/06/2015 05:56 PM, Michael S. Tsirkin wrote: On Tue, Oct 06, 2015 at 05:43:50PM +0300, Vlad Zolotarov wrote: The only "like VFIO" behavior we implement here is binding the MSI-X interrupt notification to eventfd descriptor. There will be more if you add some basic memory protections.

Re: [PATCH v3 0/3] uio: add MSI/MSI-X support to uio_pci_generic driver

2015-10-06 Thread Avi Kivity
On 10/06/2015 05:30 PM, Michael S. Tsirkin wrote: On Tue, Oct 06, 2015 at 11:37:59AM +0300, Vlad Zolotarov wrote: Bus mastering is easily enabled from the user space (taken from DPDK code): static int pci_uio_set_bus_master(int dev_fd) { uint16_t reg; int ret; ret =

Re: [dpdk-dev] [PATCH 2/2] uio: new driver to support PCI MSI-X

2015-10-06 Thread Avi Kivity
On 10/06/2015 10:33 AM, Stephen Hemminger wrote: Other than implementation objections, so far the two main arguments against this reduce to: 1. If you allow UIO ioctl then it opens an API hook for all the crap out of tree UIO drivers to do what they want. 2. If you allow UIO MSI-X

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-05 Thread Avi Kivity
On 10/05/2015 02:41 PM, Vlad Zolotarov wrote: On 10/05/15 13:57, Greg KH wrote: On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote: On 10/05/15 10:56, Greg KH wrote: On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote: +struct msix_info { +int num_irqs; +

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-05 Thread Avi Kivity
On 10/05/2015 01:57 PM, Greg KH wrote: On Mon, Oct 05, 2015 at 01:48:39PM +0300, Vlad Zolotarov wrote: On 10/05/15 10:56, Greg KH wrote: On Mon, Oct 05, 2015 at 10:41:39AM +0300, Vlad Zolotarov wrote: +struct msix_info { + int num_irqs; + struct msix_entry *table; + struct

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-05 Thread Avi Kivity
On 10/05/2015 12:49 PM, Greg KH wrote: On Mon, Oct 05, 2015 at 11:28:03AM +0300, Avi Kivity wrote: Of course it has to be documented, but this just follows vfio. Eventfd is a natural enough representation of an interrupt; both kvm and vfio use it, and are also able to share the eventfd

Re: [PATCH v3 2/3] uio_pci_generic: add MSI/MSI-X support

2015-10-05 Thread Avi Kivity
On 10/05/2015 06:11 AM, Greg KH wrote: On Sun, Oct 04, 2015 at 11:43:17PM +0300, Vlad Zolotarov wrote: Add support for MSI and MSI-X interrupt modes: - Interrupt mode selection order is: INT#X (for backward compatibility) -> MSI-X -> MSI. - Add ioctl() commands: -
