Re: [Qemu-ppc] KVM memory slots limit on powerpc

2015-09-04 Thread Benjamin Herrenschmidt
On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote: > > Maybe some rcu protected scheme that doubles the amount of memslots > > for > > each overrun? Yes, that would be good and even reduce the footprint > > for > > systems with only a small number of memslots. > > Seems like Alex Williamson
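The doubling idea quoted above can be sketched in plain C. This is only a user-space illustration under stated assumptions: `struct slot`, `struct slot_table`, `slot_table_add()` and the initial capacity of 4 are hypothetical names and values, not KVM's memslot code, and the RCU protection mentioned in the thread is left out (a comment marks where it would matter).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a memory slot; not the KVM structure. */
struct slot {
    unsigned long base_gfn;
    unsigned long npages;
};

/* Growable table: starts empty, doubles its capacity on each overrun. */
struct slot_table {
    struct slot *slots;
    size_t used;
    size_t capacity;
};

#define SLOT_TABLE_INITIAL_CAPACITY 4 /* assumed value for illustration */

/* Append a slot; returns 0 on success, -1 on overflow or allocation failure. */
int slot_table_add(struct slot_table *t, struct slot s)
{
    assert(t != NULL);

    if (t->used == t->capacity) {
        size_t new_cap = t->capacity ? t->capacity * 2
                                     : SLOT_TABLE_INITIAL_CAPACITY;
        struct slot *new_slots;

        if (new_cap < t->capacity || new_cap > SIZE_MAX / sizeof(*new_slots))
            return -1;

        new_slots = malloc(new_cap * sizeof(*new_slots));
        if (new_slots == NULL)
            return -1;
        if (t->used)
            memcpy(new_slots, t->slots, t->used * sizeof(*new_slots));

        /*
         * An RCU-protected version would publish new_slots to readers here
         * and free the old array only after a grace period. This sketch has
         * no concurrent readers, so it frees immediately.
         */
        free(t->slots);
        t->slots = new_slots;
        t->capacity = new_cap;
    }

    t->slots[t->used++] = s;
    return 0;
}
```

A table that only ever holds a few slots never grows past the initial allocation, which is the footprint benefit mentioned in the quoted message.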

Re: [PATCH] KVM: ppc: Fix size of the PSPB register

2015-09-01 Thread Benjamin Herrenschmidt
On Wed, 2015-09-02 at 08:24 +1000, Paul Mackerras wrote: > On Tue, Sep 01, 2015 at 11:41:18PM +0200, Thomas Huth wrote: > > The size of the Problem State Priority Boost Register is only > > 32 bits, so let's change the type of the corresponding variable > > accordingly to avoid future trouble. >

Re: [PATCH] KVM: ppc: Fix size of the PSPB register

2015-09-01 Thread Benjamin Herrenschmidt
On Wed, 2015-09-02 at 08:45 +1000, Paul Mackerras wrote: > On Wed, Sep 02, 2015 at 08:25:05AM +1000, Benjamin Herrenschmidt > wrote: > > On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote: > > > The size of the Problem State Priority Boost Register is only > > > 32

Re: [PATCH] KVM: ppc: Fix size of the PSPB register

2015-09-01 Thread Benjamin Herrenschmidt
On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote: > The size of the Problem State Priority Boost Register is only > 32 bits, so let's change the type of the corresponding variable > accordingly to avoid future trouble. It's not future trouble, it's broken today for LE and this should fix it

Re: BUG: sleeping function called from ras_epow_interrupt context

2015-07-14 Thread Benjamin Herrenschmidt
On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote: Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use mdelay() instead of msleep() in rtas_busy_delay()? Something more fancy? A proper fix would be more fancy, the get_sensor should happen in a kernel thread instead. Cheers,

Re: [PATCH 06/25] powerpc: Use bool function return values of true/false not 1/0

2015-03-30 Thread Benjamin Herrenschmidt
On Mon, 2015-03-30 at 16:46 -0700, Joe Perches wrote: Use the normal return values for bool functions Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org Should we merge it or will you ? Cheers, Ben. Signed-off-by: Joe Perches j...@perches.com --- arch/powerpc/include/asm/dcr

Re: [PATCH 2/2] KVM: PPC: Remove page table walk helpers

2015-03-29 Thread Benjamin Herrenschmidt
On Mon, 2015-03-30 at 10:39 +0530, Aneesh Kumar K.V wrote: This patch removes helpers which we had used only once in the code. Limiting page table walk variants helps ensure that we won't end up with code walking page tables with wrong assumptions. Signed-off-by: Aneesh Kumar K.V

Re: [PATCH v3] powerpc/kvm: support to handle sw breakpoint

2014-08-11 Thread Benjamin Herrenschmidt
On Mon, 2014-08-11 at 09:26 +0200, Alexander Graf wrote: diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index da86d9b..d95014e 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c This should be book3s_emulate.c. Any reason we can't make that

Re: [PULL 16/63] PPC: Add asm helpers for BE 32bit load/store

2014-08-01 Thread Benjamin Herrenschmidt
-by: Alexander Graf ag...@suse.de Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org --- arch/powerpc/include/asm/asm-compat.h | 4 1 file changed, 4 insertions(+) diff --git a/arch/powerpc/include/asm/asm-compat.h b/arch/powerpc/include/asm/asm-compat.h index 4b237aa..21be8ae 100644

Re: [PATCH] powerpc: kvm: make the setup of hpte under the protection of KVMPPC_RMAP_LOCK_BIT

2014-07-29 Thread Benjamin Herrenschmidt
is that it uses kvm_unmap_rmapp() which will also lock the HPTE (try_lock_hpte) and so shouldn't have a race vs the above code. Or do you see a race I don't ? Cheers, Ben. Thx. Fan On Mon, Jul 28, 2014 at 2:42 PM, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: On Mon, 2014-07-28 at 14:09

Re: [PATCH] powerpc: kvm: make the setup of hpte under the protection of KVMPPC_RMAP_LOCK_BIT

2014-07-28 Thread Benjamin Herrenschmidt
On Mon, 2014-07-28 at 14:09 +0800, Liu Ping Fan wrote: In the current code, the setup of an hpte risks racing with mmu_notifier_invalidate, i.e. we may set up an hpte with an invalid pfn. Resolve this issue by synchronizing the two actions with KVMPPC_RMAP_LOCK_BIT. Please describe the race you think

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-07-21 Thread Benjamin Herrenschmidt
On Mon, 2014-07-21 at 16:06 +0800, Mike Qiu wrote: I don't like this. I much prefer to have dedicated error injection files in their respective locations, something for PCI under the corresponding PCI bridge etc... So PowerNV error injection will be designed to rely on debugfs being

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-07-21 Thread Benjamin Herrenschmidt
On Tue, 2014-07-22 at 11:10 +0800, Mike Qiu wrote: On 07/22/2014 06:49 AM, Benjamin Herrenschmidt wrote: On Mon, 2014-07-21 at 16:06 +0800, Mike Qiu wrote: I don't like this. I much prefer to have dedicated error injection files in their respective locations, something for PCI under

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-06-24 Thread Benjamin Herrenschmidt
Is it reasonable to do error injection with CONFIG_IOMMU_API ? That means if we use the default config (CONFIG_IOMMU_API = n), we cannot do error injection on PCI devices? Well we can't pass them through either so ... In any case, this is not a priority. First we need to implement a solid error

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-06-24 Thread Benjamin Herrenschmidt
On Tue, 2014-06-24 at 14:57 +0800, Mike Qiu wrote: Does that mean *host* side error injection should be based on CONFIG_IOMMU_API ? If it is just host side (no guest, no pass through), can't we do error injection? Maybe I misunderstand :) Ah no, make different patches, we don't want to use IOMMU

Re: [PATCH v1 2/3] powerpc/powernv: Support PCI error injection

2014-06-24 Thread Benjamin Herrenschmidt
On Wed, 2014-06-25 at 11:05 +0800, Mike Qiu wrote: Here maybe /sys/kernel/debug/powerpc/errinjct is better, because it will supply PCI_domain_nr in parameters, so no need supply errinjct for each PCI domain. Another reason is error inject not only for PCI(in future), so better not in

Re: [PATCH v1 1/3] powerpc/powernv: Sync header with firmware

2014-06-23 Thread Benjamin Herrenschmidt
On Mon, 2014-06-23 at 12:14 +1000, Gavin Shan wrote: The patch synchronizes firmware header file (opal.h) for PCI error injection The FW API you expose is not PCI specific. I haven't seen the corresponding FW patches yet but I'm not a fan of that single call that collates unrelated things. I

Re: [PATCH] powerpc/kvm: support to handle sw breakpoint

2014-06-17 Thread Benjamin Herrenschmidt
On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote: On 17.06.14 11:22, Benjamin Herrenschmidt wrote: On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote: Also, why don't we use twi always or something else that actually is defined as illegal instruction? I would like to see

Re: [PATCH v8 2/3] powerpc/eeh: EEH support for VFIO PCI device

2014-06-05 Thread Benjamin Herrenschmidt
On Thu, 2014-06-05 at 16:36 +1000, Gavin Shan wrote: +#define EEH_OPT_GET_PE_ADDR 0 /* Get PE addr */ +#define EEH_OPT_GET_PE_MODE 1 /* Get PE mode */ I assume that's just some leftover from the previous patches :-) Don't respin just yet, let's see what other comments come

Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows

2014-06-05 Thread Benjamin Herrenschmidt
On Thu, 2014-06-05 at 17:25 +1000, Alexey Kardashevskiy wrote: +This creates a virtual TCE (translation control entry) table, which +is an IOMMU for PAPR-style virtual I/O. It is used to translate +logical addresses used in virtual I/O into guest physical addresses, +and provides a

Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows

2014-06-05 Thread Benjamin Herrenschmidt
On Thu, 2014-06-05 at 19:26 +1000, Alexey Kardashevskiy wrote: No trees yet. For 64GB window we need (64<<30)/(16<<20)*8 = 32K TCE table. Do we really need trees? The above is assuming hugetlbfs backed guests. These are the least of my worry indeed. But we need to deal with 4k and 64k guests.
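The table-size arithmetic in this exchange (window size divided by IOMMU page size, times 8 bytes per entry) can be worked through for the other page sizes the reply mentions. The helper below is a minimal sketch: `tce_table_bytes()` and `TCE_ENTRY_BYTES` are hypothetical names, and the 8-byte entry size is taken from the "*8" in the quoted calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per TCE entry, as implied by the "*8" in the quoted calculation. */
#define TCE_ENTRY_BYTES 8

/* Size in bytes of a flat (non-tree) TCE table covering window_bytes. */
uint64_t tce_table_bytes(uint64_t window_bytes, uint64_t iommu_page_bytes)
{
    assert(iommu_page_bytes != 0);
    return window_bytes / iommu_page_bytes * TCE_ENTRY_BYTES;
}
```

For a 64 GB window this gives 32 KB with 16 MB pages, 8 MB with 64 KB pages and 128 MB with 4 KB pages, which shows why a flat table looks much less comfortable for 4k and 64k guests than for hugetlbfs-backed ones.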

Re: [PATCH 3/3] PPC: KVM: Add support for 64bit TCE windows

2014-06-05 Thread Benjamin Herrenschmidt
On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote: What if we ask user space to give us a pointer to user space allocated memory along with the TCE registration? We would still ask user space to only use the returned fd for TCE modifications, but would have some nicely swappable

Re: [PATCH 4/4] powerpc/eeh: Avoid event on passed PE

2014-06-03 Thread Benjamin Herrenschmidt
On Tue, 2014-06-03 at 09:45 +0200, Alexander Graf wrote: For EEH it could as well be a dumb eventfd - really just a side channel that can tell user space that something happened asynchronously :). Which the host kernel may have no way to detect without actively poking at the device (fences in

Re: powerpc/pseries: Use new defines when calling h_set_mode

2014-05-29 Thread Benjamin Herrenschmidt
On Thu, 2014-05-29 at 23:27 +0200, Alexander Graf wrote: On 29.05.14 09:45, Michael Neuling wrote: +/* Values for 2nd argument to H_SET_MODE */ +#define H_SET_MODE_RESOURCE_SET_CIABR 1 +#define H_SET_MODE_RESOURCE_SET_DAWR 2 +#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3

Re: [PATCH v7 3/3] drivers/vfio: EEH support for VFIO PCI device

2014-05-28 Thread Benjamin Herrenschmidt
On Wed, 2014-05-28 at 22:49 +1000, Gavin Shan wrote: I will remove those address related macros in next revision because it's user-level business, not related to host kernel any more. If the user is QEMU + guest, we need the address to identify the PE though PHB BUID could be used as same

Re: [PATCH v7 3/3] drivers/vfio: EEH support for VFIO PCI device

2014-05-28 Thread Benjamin Herrenschmidt
On Thu, 2014-05-29 at 10:05 +1000, Gavin Shan wrote: The log stuff is TBD and I'll figure it out later. About to what are the errors, there are a lot. Most of them are related to hardware level, for example unstable PCI link. Usually, those error bits defined in AER fatal error state

Re: [PATCH v7 3/3] drivers/vfio: EEH support for VFIO PCI device

2014-05-27 Thread Benjamin Herrenschmidt
On Tue, 2014-05-27 at 12:15 -0600, Alex Williamson wrote: +/* + * Reset is the major step to recover problematic PE. The following + * command helps on that. + */ +struct vfio_eeh_pe_reset { + __u32 argsz; + __u32 flags; + __u32 option; +#define

Re: [PATCH v7 3/3] drivers/vfio: EEH support for VFIO PCI device

2014-05-27 Thread Benjamin Herrenschmidt
On Tue, 2014-05-27 at 14:37 -0600, Alex Williamson wrote: The usual way is the driver asks for one or the other, this plumbs back into the guest EEH code which itself plumbs into the PCIe error recovery framework in Linux. So magic? Yes. The driver is expected to more or less know what

Re: [PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device

2014-05-22 Thread Benjamin Herrenschmidt
On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote: There's no notification, the user needs to observe the return value and poll? Should we be enabling an eventfd to notify the user of the state change? Yes. The user needs to monitor the return value. we should have one notification,

Re: [PATCH 3/4] drivers/vfio: New IOCTL command VFIO_EEH_INFO

2014-05-21 Thread Benjamin Herrenschmidt
On Wed, 2014-05-21 at 08:23 +0200, Alexander Graf wrote: Note to Alex: This definitely kills the notifier idea for now though, at least as a first class citizen of the design. We can add it as an optional optimization on top later. I don't think it does. The notifier would just get

Re: [PATCH v5 3/4] drivers/vfio: EEH support for VFIO PCI device

2014-05-21 Thread Benjamin Herrenschmidt
On Wed, 2014-05-21 at 15:07 +0200, Alexander Graf wrote: +#ifdef CONFIG_VFIO_PCI_EEH +int eeh_vfio_open(struct pci_dev *pdev) Why vfio? Also that config option will not be set if vfio is compiled as a module. +{ + struct eeh_dev *edev; + + /* No PCI device ? */ + if

Re: [PATCH 4/4] powerpc/eeh: Avoid event on passed PE

2014-05-20 Thread Benjamin Herrenschmidt
On Tue, 2014-05-20 at 21:56 +1000, Gavin Shan wrote: .../... I think what you want is an irqfd that the in-kernel eeh code notifies when it sees a failure. When such an fd exists, the kernel skips its own error handling. Yeah, it's a good idea and something for me to improve in phase

Re: [PATCH 4/4] powerpc/eeh: Avoid event on passed PE

2014-05-20 Thread Benjamin Herrenschmidt
On Tue, 2014-05-20 at 15:49 +0200, Alexander Graf wrote: Instead of if (passed_flag) return; you would do if (trigger_irqfd) { trigger_irqfd(); return; } which would be a much nicer, generic interface. But that's not how PAPR works. Cheers, Ben. -- To

Re: [PATCH 4/4] powerpc/eeh: Avoid event on passed PE

2014-05-20 Thread Benjamin Herrenschmidt
On Tue, 2014-05-20 at 15:49 +0200, Alexander Graf wrote: So how about we just implement this whole thing properly as irqfd? Whether QEMU can actually do anything with the interrupt is a different question - we can leave it be for now. But we could model all the code with the assumption that

Re: [PATCH 3/4] drivers/vfio: New IOCTL command VFIO_EEH_INFO

2014-05-20 Thread Benjamin Herrenschmidt
On Tue, 2014-05-20 at 14:25 +0200, Alexander Graf wrote: - Move eeh-vfio.c to drivers/vfio/pci/ - From eeh-vfio.c, dereference arch/powerpc/kernel/eeh.c::eeh_ops, which is arch/powerpc/platforms/powernv/eeh-powernv.c::powernv_eeh_ops. Call Hrm, I think it'd be nicer to just export

Re: [PATCH 3/4] drivers/vfio: New IOCTL command VFIO_EEH_INFO

2014-05-20 Thread Benjamin Herrenschmidt
On Tue, 2014-05-20 at 22:39 +1000, Gavin Shan wrote: Yeah. How about this? :-) - Move eeh-vfio.c to drivers/vfio/pci/ - From eeh-vfio.c, dereference arch/powerpc/kernel/eeh.c::eeh_ops, which is arch/powerpc/platforms/powernv/eeh-powernv.c::powernv_eeh_ops. Call Hrm, I think it'd be

Re: [PATCH 2/8] powerpc/eeh: Info to trace passed devices

2014-05-19 Thread Benjamin Herrenschmidt
On Mon, 2014-05-19 at 14:46 +0200, Alexander Graf wrote: I don't see the point of VFIO knowing about guest addresses. They are not unique across a system and the whole idea that a VFIO device has to be owned by a guest is also pretty dubious. I suppose what you really care about here is

Re: [PATCH 6/8] powerpc: Extend syscall ppc_rtas()

2014-05-19 Thread Benjamin Herrenschmidt
On Mon, 2014-05-19 at 14:55 +0200, Alexander Graf wrote: On 14.05.14 06:12, Gavin Shan wrote: Originally, syscall ppc_rtas() can be used to invoke RTAS call from user space. Utility errinjct is using it to inject various errors to the system for testing purpose. The patch intends to extend

Re: [PATCH 8/8] powerpc/powernv: Error injection infrastructure

2014-05-19 Thread Benjamin Herrenschmidt
On Mon, 2014-05-19 at 15:04 +0200, Alexander Graf wrote: On 14.05.14 06:12, Gavin Shan wrote: The patch intends to implement the error injection infrastructure for PowerNV platform. The predetermined handlers will be called according to the type of injected error (e.g.

Re: [PATCH 3/8] drivers/vfio: New IOCTL command VFIO_EEH_INFO

2014-05-19 Thread Benjamin Herrenschmidt
for those PCI devices, + * which have been passed through from host to guest via VFIO. So this + * file is naturally part of VFIO implementation on PowerNV platform. + * + * Copyright Benjamin Herrenschmidt Gavin Shan, IBM Corporation 2014. + * + * This program is free software; you

Re: [PATCH RFC 00/22] EEH Support for VFIO PCI devices on PowerKVM guest

2014-05-06 Thread Benjamin Herrenschmidt
On Tue, 2014-05-06 at 08:56 +0200, Alexander Graf wrote: For the error injection, I guess I have to put the logic token management into QEMU and error injection request will be handled by QEMU and then routed to host kernel via additional syscall as we did for pSeries. Yes, start off

Re: [PATCH] KVM: PPC: BOOK3S: HV: Don't try to allocate from kernel page allocator for hash page table.

2014-05-06 Thread Benjamin Herrenschmidt
On Tue, 2014-05-06 at 09:05 +0200, Alexander Graf wrote: On 06.05.14 02:06, Benjamin Herrenschmidt wrote: On Mon, 2014-05-05 at 17:16 +0200, Alexander Graf wrote: Isn't this a greater problem? We should start swapping before we hit the point where non movable kernel allocation fails

Re: [RFC PATCH] KVM: PPC: BOOK3S: HV: THP support for guest

2014-05-06 Thread Benjamin Herrenschmidt
On Tue, 2014-05-06 at 11:12 +0200, Alexander Graf wrote: So if I understand this patch correctly, it simply introduces logic to handle page sizes other than 4k, 64k, 16M by analyzing the actual page size field in the HPTE. Mind to explain why exactly that enables us to use THP? What

Re: [RFC PATCH] KVM: PPC: BOOK3S: HV: THP support for guest

2014-05-06 Thread Benjamin Herrenschmidt
On Tue, 2014-05-06 at 21:38 +0530, Aneesh Kumar K.V wrote: I updated the commit message as below. Let me know if this is ok. KVM: PPC: BOOK3S: HV: THP support for guest This has nothing to do with THP. THP support in guest depends on KVM advertising MPSS feature. We already have

Re: [PATCH V4] POWERPC: BOOK3S: KVM: Use the saved dar value and generic make_dsisr

2014-05-05 Thread Benjamin Herrenschmidt
On Mon, 2014-05-05 at 19:56 +0530, Aneesh Kumar K.V wrote: Paul mentioned that BOOK3S always had DAR value set on alignment interrupt. And the patch is to enable/collect correct DAR value when running with Little Endian PR guest. Now to limit the impact and to enable Little Endian PR guest,

Re: [PATCH V4] POWERPC: BOOK3S: KVM: Use the saved dar value and generic make_dsisr

2014-05-05 Thread Benjamin Herrenschmidt
On Mon, 2014-05-05 at 16:43 +0200, Alexander Graf wrote: Paul mentioned that BOOK3S always had DAR value set on alignment interrupt. And the patch is to enable/collect correct DAR value when running with Little Endian PR guest. Now to limit the impact and to enable Little Endian PR guest,

Re: [PATCH] PPC: KVM: Introduce hypervisor call H_GET_TCE

2014-02-21 Thread Benjamin Herrenschmidt
On Fri, 2014-02-21 at 16:31 +0100, Laurent Dufour wrote: This fix introduces the H_GET_TCE hypervisor call which is basically the reverse of H_PUT_TCE, as defined in the Power Architecture Platform Requirements (PAPR). The hcall H_GET_TCE is required by the kdump kernel which is calling it

Re: [RFC PATCH 02/10] KVM: PPC: BOOK3S: PR: Emulate virtual timebase register

2014-01-29 Thread Benjamin Herrenschmidt
On Wed, 2014-01-29 at 17:39 +0100, Alexander Graf wrote: static inline mfvtb(unsigned long) { #ifdef CONFIG_PPC_BOOK3S_64 return mfspr(SPRN_VTB); #else BUG(); #endif } is a lot easier to read and get right. But reg.h is Ben's call. Agreed. Also could you please give me a

Re: [PATCH] KVM: PPC: Use schedule instead of cond_resched

2013-12-10 Thread Benjamin Herrenschmidt
On Tue, 2013-12-10 at 15:40 +0100, Alexander Graf wrote: On 10.12.2013, at 15:21, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com We already checked need_resched. So we can call schedule directly Signed-off-by: Aneesh

Re: Error in frreing hugepages with preemption enabled

2013-12-03 Thread Benjamin Herrenschmidt
On Tue, 2013-12-03 at 23:21 +0100, Andrea Arcangeli wrote: #ifdef CONFIG_PPC_FSL_BOOK3E hugepd_free(tlb, hugepte); ^^ This is the culprit (Alex, you didn't specify this was embedded or did I miss it ?) #else pgtable_free_tlb(tlb, hugepte,

Re: [PATCH 3/3] powerpc/kvm: remove redundant assignment

2013-11-07 Thread Benjamin Herrenschmidt
On Thu, 2013-11-07 at 09:14 +0100, Alexander Graf wrote: And ? An explanation isn't going to be clearer than the code in that case ... It's pretty non-obvious when you do a git show on that patch in 1 year from now, as the redundancy is out of scope of what the diff shows. And ? How would

Re: [PATCH] powerpc: kvm: optimize sc 0 as fast return

2013-11-07 Thread Benjamin Herrenschmidt
On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote: On 08.11.2013, at 03:44, Liu Ping Fan kernelf...@gmail.com wrote: syscall is a very common behavior inside guest, and this patch optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL, so hypervisor can return to guest

Re: [PATCH 3/3] powerpc/kvm: remove redundant assignment

2013-11-06 Thread Benjamin Herrenschmidt
On Thu, 2013-11-07 at 08:52 +0100, Alexander Graf wrote: Am 06.11.2013 um 20:58 schrieb Benjamin Herrenschmidt b...@kernel.crashing.org: On Wed, 2013-11-06 at 12:24 +0100, Alexander Graf wrote: On 05.11.2013, at 08:42, Liu Ping Fan kernelf...@gmail.com wrote: Signed-off-by: Liu Ping

Re: [PULL 34/51] powerpc: move debug registers in a structure

2013-11-03 Thread Benjamin Herrenschmidt
On Sun, 2013-11-03 at 16:30 +0200, Gleb Natapov wrote: On Thu, Oct 31, 2013 at 10:18:19PM +0100, Alexander Graf wrote: From: Bharat Bhushan r65...@freescale.com This way we can use same data type struct with KVM and also help in using other debug related function. Signed-off-by:

Re: [PULL 34/51] powerpc: move debug registers in a structure

2013-11-03 Thread Benjamin Herrenschmidt
On Mon, 2013-11-04 at 07:43 +0100, Alexander Graf wrote: Yeah, it's what Ben and me agreed on after I waited forever to get a topic branch created. Oh well, I guess this time we just have to manually resolve the conflicts and do a better job at communicating next time. That specific one was

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-03 Thread Benjamin Herrenschmidt
On Thu, 2013-10-03 at 08:43 +0300, Gleb Natapov wrote: Why can it be a bad idea? A user can drain the hwrng continuously, making other users of it much slower or, even worse, making them fall back to another, much less reliable source of entropy. Not in a very significant way, we generate entropy at

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 10:46 +0200, Paolo Bonzini wrote: Thanks. Any chance you can give some numbers of a kernel hypercall and a userspace hypercall on Power, so we have actual data? For example a hypercall that returns H_PARAMETER as soon as possible. I don't have (yet) numbers at hand

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 11:11 +0200, Alexander Graf wrote: Right, and the difference for the patch in question is really whether we handle it in kernel virtual mode or in QEMU, so the bulk of the overhead (kicking threads out of guest context, switching MMU context, etc) happens either way.

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 13:02 +0300, Gleb Natapov wrote: Yes, I alluded to it in my email to Paul and Paolo asked also. How is this interface disabled? Also, the hwrng is MMIO in the host, so why does the guest need to use a hypercall instead of emulating the device (in kernel or somewhere else?). Migration will

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 13:02 +0300, Gleb Natapov wrote: Yes, I alluded to it in my email to Paul and Paolo asked also. How is this interface disabled? Also, the hwrng is MMIO in the host, so why does the guest need to use a hypercall instead of emulating the device (in kernel or somewhere else?). Another thing is

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 17:10 +0300, Gleb Natapov wrote: The hwrng is accessible by host userspace via /dev/mem. Regular user has no access to /dev/mem, but he can start kvm guest and gain access to the device. Seriously. You guys are really trying hard to make our life hell or what ? That

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-02 Thread Benjamin Herrenschmidt
On Wed, 2013-10-02 at 17:37 +0300, Gleb Natapov wrote: On Wed, Oct 02, 2013 at 04:33:18PM +0200, Paolo Bonzini wrote: Il 02/10/2013 16:08, Alexander Graf ha scritto: The hwrng is accessible by host userspace via /dev/mem. A guest should live on the same permission level as a user

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-01 Thread Benjamin Herrenschmidt
On Tue, 2013-10-01 at 11:39 +0300, Gleb Natapov wrote: On Tue, Oct 01, 2013 at 06:34:26PM +1000, Michael Ellerman wrote: On Thu, Sep 26, 2013 at 11:06:59AM +0200, Paolo Bonzini wrote: Il 26/09/2013 08:31, Michael Ellerman ha scritto: Some powernv systems include a hwrng. Guests can

Re: [PATCH 3/3] KVM: PPC: Book3S: Add support for hwrng found on some powernv systems

2013-10-01 Thread Benjamin Herrenschmidt
On Tue, 2013-10-01 at 13:19 +0200, Paolo Bonzini wrote: Il 01/10/2013 11:38, Benjamin Herrenschmidt ha scritto: So for the sake of that dogma you are going to make us do something that is about 100 times slower ? (and possibly involves more lines of code) If it's 100 times slower

Re: [PATCH 1/3] powerpc: Implement arch_get_random_long/int() for powernv

2013-09-26 Thread Benjamin Herrenschmidt
On Thu, 2013-09-26 at 16:31 +1000, Michael Ellerman wrote: + pr_info_once("registering arch random hook\n"); Either pr_debug or make it nicer looking :-) Cheers, Ben.

Re: [PATCH 2/3] hwrng: Add a driver for the hwrng found in power7+ systems

2013-09-26 Thread Benjamin Herrenschmidt
On Thu, 2013-09-26 at 16:31 +1000, Michael Ellerman wrote: + pr_info("registered powernv hwrng.\n"); First letter of a line should get a capital :-) Also since it's per-device, at least indicate the OF path or the chip number or something ... Cheers, Ben.

Re: [PATCH 05/11] KVM: PPC: Book3S HV: Add support for guest Program Priority Register

2013-09-16 Thread Benjamin Herrenschmidt
early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Signed-off-by: Paul Mackerras pau...@samba.org Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org Alex, can you take this via your tree ? Cheers, Ben

Re: [PATCH 19/23] KVM: PPC: Book3S: Select PR vs HV separately for each guest

2013-09-12 Thread Benjamin Herrenschmidt
On Fri, 2013-09-13 at 10:17 +1000, Paul Mackerras wrote: Aneesh and I are currently investigating an alternative approach, which is much more like the x86 way of doing things. We are looking at splitting the code into three modules: a kvm_pr.ko module with the PR-specific bits, a kvm_hv.ko

Re: [PATCH v9 12/13] KVM: PPC: Add support for IOMMU in-kernel handling

2013-09-04 Thread Benjamin Herrenschmidt
On Tue, 2013-09-03 at 13:53 +0300, Gleb Natapov wrote: Or supporting all IOMMU links (and leaving emulated stuff as is) in one device is the last thing I have to do and then you'll ack the patch? I am concerned more about API here. Internal implementation details I leave to powerpc experts

Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support

2013-08-27 Thread Benjamin Herrenschmidt
On Tue, 2013-08-27 at 09:40 +0300, Gleb Natapov wrote: Thanks. Since it's not in a topic branch that I can pull, I'm going to just cherry-pick them. However, they are in your queue branch, not next branch. Should I still assume this is a stable branch and that the numbers aren't going to

Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support

2013-08-27 Thread Benjamin Herrenschmidt
On Tue, 2013-08-27 at 09:41 +0300, Gleb Natapov wrote: Oh and Alexey mentions that there are two capabilities and you only applied one :-) Another one is: [PATCH v8] KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO ? Yes, thanks ! Cheers, Ben.

Re: [PATCH 02/10] KVM: PPC: reserve a capability number for multitce support

2013-08-26 Thread Benjamin Herrenschmidt
On Mon, 2013-08-26 at 15:37 +0300, Gleb Natapov wrote: Gleb, any chance you can put this (and the next one) into a tree to lock in the numbers ? Applied it. Sorry for the slow response, I was on vacation and am still going through the email backlog. Thanks. Since it's not in a topic branch that I

Re: [PATCH] powerpc/kvm: Handle the boundary condition correctly

2013-08-22 Thread Benjamin Herrenschmidt
On Fri, 2013-08-23 at 09:01 +0530, Aneesh Kumar K.V wrote: Alexander Graf ag...@suse.de writes: On 22.08.2013, at 12:37, Aneesh Kumar K.V wrote: From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Isn't this you? Yes. The patches are generated using git format-patch and sent by

Re: [v2][PATCH 1/1] KVM: PPC: Book3E HV: call SOFT_DISABLE_INTS to sync the software state

2013-08-14 Thread Benjamin Herrenschmidt
On Wed, 2013-08-14 at 14:34 -0500, Scott Wood wrote: On Wed, 2013-08-14 at 13:56 +0200, Alexander Graf wrote: On 07.08.2013, at 04:05, Tiejun Chen wrote: We enter with interrupts disabled in hardware, but we need to call SOFT_DISABLE_INTS anyway to ensure that the software state is

Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-02 Thread Benjamin Herrenschmidt
On Fri, 2013-08-02 at 17:58 -0500, Scott Wood wrote: What about 64-bit PTEs on 32-bit kernels? In any case, this code does not belong in KVM. It should be in the main PPC mm code, even if KVM is the only user. Also don't we do similar things in BookS KVM ? At the very least that stuff

Re: [PATCH 5/6 v2] kvm: powerpc: booke: Add linux pte lookup like booke3s

2013-08-02 Thread Benjamin Herrenschmidt
On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote: One of the problems I saw was that if I put this code in asm/pgtable-32.h and asm/pgtable-64.h then pte_present() and other friend functions (on which this code depends) are defined in pgtable.h. And pgtable.h includes

Re: [PATCH 6/6 v2] kvm: powerpc: use caching attributes as per linux pte

2013-08-02 Thread Benjamin Herrenschmidt
On Sat, 2013-08-03 at 03:11 +, Bhushan Bharat-R65777 wrote: Could you explain why we need to set dirty/referenced on the PTE, when we didn't need to do that before? All we're getting from the PTE is wimg. We have MMU notifiers to take care of the page being unmapped, and we've

Re: [PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-26 Thread Benjamin Herrenschmidt
On Fri, 2013-07-26 at 15:03 +, Bhushan Bharat-R65777 wrote: Won't searching the Linux PTE be overkill? That's the best approach. Also we are searching it already to resolve the page fault. That does mean we search twice but on the other hand that also means it's hot in the cache.

Re: [PATCH 04/10] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-24 Thread Benjamin Herrenschmidt
On Wed, 2013-07-24 at 15:43 -0700, Andrew Morton wrote: For what? The three lines of comment in page-flags.h? ack :) Manipulating page->_count directly is considered poor form. Don't blame us if we break your code ;) Actually, the manipulation in realmode_get_page() duplicates the

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-14 Thread Benjamin Herrenschmidt
On Mon, 2013-07-15 at 10:20 +0800, tiejun.chen wrote: What about SOFT_IRQ_DISABLE? This is close to name hard_irq_disable() :) And then remove all DISABLE_INTS as well? Or RECONCILE_IRQ_STATE... Ben.

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-12 Thread Benjamin Herrenschmidt
On Fri, 2013-07-12 at 12:50 -0500, Scott Wood wrote: [1] SOFT_DISABLE_INTS seems an odd name for something that updates the software state to be consistent with interrupts being *hard* disabled. I can sort of see the logic in it, but it's confusing when first encountered. From the

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 11:49 +0200, Alexander Graf wrote: Ben, is soft_enabled == 0; hard_enabled == 1 a valid combination that may ever occur? Yes of course, that's what we call soft disabled :-) It's even the whole point of doing lazy disable... Ben.
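For readers new to the lazy-disable scheme discussed in this thread, here is a minimal userspace sketch of the idea, not the kernel's code. The struct and the model_* function names are hypothetical, and the real kernel tracks this state in the PACA and the MSR rather than in a plain struct. The sketch assumes the usual behaviour: disabling only clears a software flag, the hardware is hard-disabled only if an interrupt actually arrives in the meantime, and the latched interrupt is replayed on enable.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of lazy interrupt disabling. All names are hypothetical. */
struct cpu_model {
    bool soft_enabled;  /* software view, toggled by disable/enable */
    bool hard_enabled;  /* models the hardware interrupt-enable bit */
    int  pending;       /* interrupts latched while soft-disabled */
    int  handled;       /* interrupts delivered to the handler */
};

static void model_init(struct cpu_model *c)
{
    c->soft_enabled = true;
    c->hard_enabled = true;
    c->pending = 0;
    c->handled = 0;
}

/* Cheap path: only the software flag changes, hardware is untouched. */
static void model_irq_disable(struct cpu_model *c)
{
    c->soft_enabled = false;
}

/* An interrupt is raised by the (modelled) hardware. */
static void model_irq_arrives(struct cpu_model *c)
{
    if (c->soft_enabled && c->hard_enabled) {
        c->handled++;           /* normal delivery */
        return;
    }
    /* Soft-disabled: latch it and only now pay for a hard disable. */
    c->pending++;
    c->hard_enabled = false;
}

/* Re-enable: replay anything latched, then restore the hardware bit. */
static void model_irq_enable(struct cpu_model *c)
{
    c->soft_enabled = true;
    while (c->pending > 0) {
        c->pending--;
        c->handled++;
    }
    assert(c->pending == 0);
    c->hard_enabled = true;
}
```

After model_irq_disable() alone, the model is in the soft_enabled == 0, hard_enabled == 1 state that the reply calls "soft disabled"; the hardware bit is cleared only once model_irq_arrives() runs in that state.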

Re: [PATCH 8/8] KVM: PPC: Add hugepage support for IOMMU in-kernel handling

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 11:52 +0200, Alexander Graf wrote: Where exactly (it is rather SPAPR_TCE_IOMMU but does not really matter)? Select it on KVM_BOOK3S_64? CONFIG_KVM_BOOK3S_64_HV? CONFIG_KVM_BOOK3S_64_PR? PPC_BOOK3S_64? I'd say the most logical choice would be to check the Makefile

Re: [PATCH 6/8] KVM: PPC: Add support for multiple-TCE hcalls

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 13:15 +0200, Alexander Graf wrote: There are 2 ways of dealing with this: 1) Call the ENABLE_CAP on every vcpu. That way one CPU may handle this hypercall in the kernel while another one may not. The same as we handle PAPR today. 2) Create a new ENABLE_CAP for

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 14:47 +0200, Alexander Graf wrote: Yes of course, that's what we call soft disabled :-) It's even the whole point of doing lazy disable... Meh. Of course it's soft_enabled = 1; hard_enabled = 0. That doesn't happen in normal C code. It happens under very specific

Re: [PATCH 8/8] KVM: PPC: Add hugepage support for IOMMU in-kernel handling

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 14:50 +0200, Alexander Graf wrote: Not really no. But that would do. You could have give a more useful answer in the first place though rather than stringing him along. Sorry, I figured it was obvious. It wasn't no, because of the mess with modules and the nasty

Re: [PATCH 6/8] KVM: PPC: Add support for multiple-TCE hcalls

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 14:51 +0200, Alexander Graf wrote: I don't like bloat usually. But Alexey even had an #ifdef DEBUG in there to selectively disable in-kernel handling of multi-TCE. Not calling ENABLE_CAP would give him exactly that without ugly #ifdefs in the kernel. I don't see much

Re: [PATCH 6/8] KVM: PPC: Add support for multiple-TCE hcalls

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 15:12 +1000, Alexey Kardashevskiy wrote: Any debug code is prohibited? Ok, I'll remove. Debug code that requires code changes is prohibited, yes. Debug code that is runtime switchable (pr_debug, trace points, etc) are allowed. Bollox. $ grep DBG\( arch/powerpc/

Re: [PATCH 6/8] KVM: PPC: Add support for multiple-TCE hcalls

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 12:11 +0200, Alexander Graf wrote: So I must add one more ioctl to enable MULTITCE in kernel handling. Is it what you are saying? I can see KVM_CHECK_EXTENSION but I do not see KVM_ENABLE_EXTENSION or anything like that. KVM_ENABLE_CAP. It's how we enable sPAPR

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 15:07 +0200, Alexander Graf wrote: Ok, let me quickly explain the problem. We are leaving host context, switching slowly into guest context. During that transition we call get_paca() indirectly (apparently by another call to hard_disable() which sounds bogus, but that's

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-11 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 11:18 -0500, Scott Wood wrote: If we set IRQs as soft-disabled prior to calling hard_irq_disable(), then hard_irq_disable() will fail to call trace_hardirqs_off(). Sure because setting them as soft-disabled will have done it. However by doing so, you also create the

Re: [v1][PATCH 1/1] KVM: PPC: disable preemption when using hard_irq_disable()

2013-07-11 Thread Benjamin Herrenschmidt
On Fri, 2013-07-12 at 10:13 +0800, tiejun.chen wrote: #define hard_irq_disable() do { \ u8 _was_enabled = get_paca()->soft_enabled; \ Current problem I met is issued from the above line. __hard_irq_disable(); \ -

Re: [PATCH 8/8] KVM: PPC: Add hugepage support for IOMMU in-kernel handling

2013-07-10 Thread Benjamin Herrenschmidt
On Wed, 2013-07-10 at 12:33 +0200, Alexander Graf wrote: It's not exactly obvious that you're calling it with writing == 1 :). Can you create a new local variable is_write in the calling function, set that to 1 before the call to get_user_pages_fast and pass it in instead of the 1? The

Re: [PATCH 2/3] kvm/ppc: IRQ disabling cleanup

2013-07-10 Thread Benjamin Herrenschmidt
On Thu, 2013-07-11 at 00:57 +0200, Alexander Graf wrote: #ifdef CONFIG_PPC64 + /* + * To avoid races, the caller must have gone directly from having + * interrupts fully-enabled to hard-disabled. + */ + WARN_ON(local_paca->irq_happened != PACA_IRQ_HARD_DIS);

Re: [PATCH 5/8] powerpc: add real mode support for dma operations on powernv

2013-07-09 Thread Benjamin Herrenschmidt
On Tue, 2013-07-09 at 18:02 +0200, Alexander Graf wrote: On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote: The existing TCE machine calls (tce_build and tce_free) only support virtual mode as they call __raw_writeq for TCE invalidation what fails in real mode. This introduces

Re: [PATCH 4/8] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-07 Thread Benjamin Herrenschmidt
On Sun, 2013-07-07 at 01:07 +1000, Alexey Kardashevskiy wrote: The current VFIO-on-POWER implementation supports only user mode driven mapping, i.e. QEMU is sending requests to map/unmap pages. However this approach is really slow, so we want to move that to KVM. Since H_PUT_TCE can be

Re: [PATCH 2/2] KVM: PPC: Book3E: Add LRAT error exception handler

2013-07-04 Thread Benjamin Herrenschmidt
On Thu, 2013-07-04 at 06:47 +0000, Caraman Mihai Claudiu-B02008 wrote: This is a solid reason. Ben it's ok for you to apply the combined patch? If so I will respin it. Sure, but nowadays, all that stuff goes via Scott and Alex. Cheers, Ben.

Re: [PATCH -V3 2/4] powerpc/kvm: Contiguous memory allocator based hash page table allocation

2013-07-02 Thread Benjamin Herrenschmidt
On Tue, 2013-07-02 at 17:12 +0200, Alexander Graf wrote: Is CMA a mandatory option in the kernel? Or can it be optionally disabled? If it can be disabled, we should keep the preallocated fallback case around for systems that have CMA disabled. Why ? More junk code to keep around ... If CMA

Re: [PATCH 3/8] vfio: add external user support

2013-06-27 Thread Benjamin Herrenschmidt
On Thu, 2013-06-27 at 16:59 +1000, Stephen Rothwell wrote: +/* Allows an external user (for example, KVM) to unlock an IOMMU group */ +static void vfio_group_del_external_user(struct file *filep) +{ + struct vfio_group *group = filep->private_data; + + BUG_ON(filep->f_op !=

Re: [PATCH 3/4] KVM: PPC: Add support for IOMMU in-kernel handling

2013-06-23 Thread Benjamin Herrenschmidt
On Mon, 2013-06-24 at 13:54 +1000, David Gibson wrote: DDW means an API by which the guest can request the creation of additional iommus for a given device (typically, in addition to the default smallish 32-bit one using 4k pages, the guest can request a larger window in 64-bit space using
