On Fri, 2015-09-04 at 12:28 +0200, Thomas Huth wrote:
> > Maybe some rcu protected scheme that doubles the amount of memslots for
> > each overrun? Yes, that would be good and even reduce the footprint for
> > systems with only a small number of memslots.
>
> Seems like Alex Williamson
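The doubling scheme proposed in the quoted text can be sketched in user space as below. This is only an illustration under stated assumptions: `struct slot`, `struct slot_table` and the `slot_table_*` helpers are hypothetical names, not KVM's memslot code, and the RCU part is reduced to a comment because the sketch is single-threaded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot record; the real KVM memslot structure is larger. */
struct slot {
	unsigned long base_gfn;
	unsigned long npages;
};

/* Growable table: capacity starts small and doubles on each overrun. */
struct slot_table {
	struct slot *slots;
	size_t used;
	size_t capacity;
};

/* Start with a small allocation so small guests keep a small footprint. */
static int slot_table_init(struct slot_table *t, size_t initial)
{
	assert(t && initial > 0);
	t->slots = calloc(initial, sizeof(*t->slots));
	if (!t->slots)
		return -1;
	t->used = 0;
	t->capacity = initial;
	return 0;
}

/*
 * Append one slot. On overrun, build a copy with twice the capacity and
 * swap it in. An RCU-protected version would publish the new array and
 * free the old one only after a grace period; this sketch frees it at once.
 */
static int slot_table_add(struct slot_table *t, struct slot s)
{
	assert(t && t->slots);
	if (t->used == t->capacity) {
		size_t new_cap = t->capacity * 2;
		struct slot *bigger = calloc(new_cap, sizeof(*bigger));

		if (!bigger)
			return -1;
		memcpy(bigger, t->slots, t->used * sizeof(*bigger));
		free(t->slots);
		t->slots = bigger;
		t->capacity = new_cap;
	}
	t->slots[t->used++] = s;
	return 0;
}

/* Release the table. */
static void slot_table_free(struct slot_table *t)
{
	assert(t);
	free(t->slots);
	t->slots = NULL;
	t->used = 0;
	t->capacity = 0;
}
```

Starting at four entries, the ninth insertion leaves the table with a capacity of sixteen, so a guest with few slots never pays for the maximum.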
On Wed, 2015-09-02 at 08:24 +1000, Paul Mackerras wrote:
> On Tue, Sep 01, 2015 at 11:41:18PM +0200, Thomas Huth wrote:
> > The size of the Problem State Priority Boost Register is only
> > 32 bits, so let's change the type of the corresponding variable
> > accordingly to avoid future trouble.
>
On Wed, 2015-09-02 at 08:45 +1000, Paul Mackerras wrote:
> On Wed, Sep 02, 2015 at 08:25:05AM +1000, Benjamin Herrenschmidt
> wrote:
> > On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote:
> > > The size of the Problem State Priority Boost Register is only
> > > 32
On Tue, 2015-09-01 at 23:41 +0200, Thomas Huth wrote:
> The size of the Problem State Priority Boost Register is only
> 32 bits, so let's change the type of the corresponding variable
> accordingly to avoid future trouble.
It's not future trouble, it's broken today for LE and this should fix
it
On Tue, 2015-07-14 at 20:43 +0200, Thomas Huth wrote:
Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
mdelay() instead of msleep() in rtas_busy_delay()? Something more
fancy?
A proper fix would be more fancy, the get_sensor should happen in a
kernel thread instead.
Cheers,
On Mon, 2015-03-30 at 16:46 -0700, Joe Perches wrote:
Use the normal return values for bool functions
Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Should we merge it or will you ?
Cheers,
Ben.
Signed-off-by: Joe Perches j...@perches.com
---
arch/powerpc/include/asm/dcr
On Mon, 2015-03-30 at 10:39 +0530, Aneesh Kumar K.V wrote:
This patch removes helpers which we had used only once in the code.
Limiting page table walk variants helps in ensuring that we won't
end up with code walking page table with wrong assumptions.
Signed-off-by: Aneesh Kumar K.V
On Mon, 2014-08-11 at 09:26 +0200, Alexander Graf wrote:
diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index da86d9b..d95014e 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
This should be book3s_emulate.c.
Any reason we can't make that
-by: Alexander Graf ag...@suse.de
Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org
---
arch/powerpc/include/asm/asm-compat.h | 4
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/include/asm/asm-compat.h
b/arch/powerpc/include/asm/asm-compat.h
index 4b237aa..21be8ae 100644
is that it uses
kvm_unmap_rmapp() which will also lock the HPTE (try_lock_hpte)
and so shouldn't have a race vs the above code.
Or do you see a race I don't ?
Cheers,
Ben.
Thx.
Fan
On Mon, Jul 28, 2014 at 2:42 PM, Benjamin Herrenschmidt
b...@kernel.crashing.org wrote:
On Mon, 2014-07-28 at 14:09
On Mon, 2014-07-28 at 14:09 +0800, Liu Ping Fan wrote:
In the current code, the setup of an hpte risks racing with
mmu_notifier_invalidate, i.e. we may set up an hpte with an invalid pfn.
Resolve this issue by syncing the two actions with KVMPPC_RMAP_LOCK_BIT.
Please describe the race you think
On Mon, 2014-07-21 at 16:06 +0800, Mike Qiu wrote:
I don't like this. I much prefer have dedicated error injection files
in their respective locations, something for PCI under the corresponding
PCI bridge etc...
So PowerNV error injection will be designed rely on debugfs been
On Tue, 2014-07-22 at 11:10 +0800, Mike Qiu wrote:
On 07/22/2014 06:49 AM, Benjamin Herrenschmidt wrote:
On Mon, 2014-07-21 at 16:06 +0800, Mike Qiu wrote:
I don't like this. I much prefer have dedicated error injection files
in their respective locations, something for PCI under
Is it reasonable to do error injection with CONFIG_IOMMU_API ?
That means if use default config(CONFIG_IOMMU_API = n), we can not do
error injection to pci devices?
Well we can't pass them through either so ...
In any case, this is not a priority. First we need to implement a solid
error
On Tue, 2014-06-24 at 14:57 +0800, Mike Qiu wrote:
Does that mean *host* side error injection should be based on
CONFIG_IOMMU_API ? If it is just host side (no guest, no pass through),
can't we do error injection?
Maybe I misunderstand :)
Ah no, make different patches, we don't want to use IOMMU
On Wed, 2014-06-25 at 11:05 +0800, Mike Qiu wrote:
Here maybe /sys/kernel/debug/powerpc/errinjct is better, because it
will supply PCI_domain_nr in parameters, so no need supply errinjct
for each PCI domain.
Another reason is error inject not only for PCI(in future), so better
not in
On Mon, 2014-06-23 at 12:14 +1000, Gavin Shan wrote:
The patch synchronizes firmware header file (opal.h) for PCI error
injection
The FW API you expose is not PCI specific. I haven't seen the
corresponding FW patches yet but I'm not fan of that single call
that collates unrelated things.
I
On Tue, 2014-06-17 at 11:25 +0200, Alexander Graf wrote:
On 17.06.14 11:22, Benjamin Herrenschmidt wrote:
On Tue, 2014-06-17 at 10:54 +0200, Alexander Graf wrote:
Also, why don't we use twi always or something else that actually is
defined as illegal instruction? I would like to see
On Thu, 2014-06-05 at 16:36 +1000, Gavin Shan wrote:
+#define EEH_OPT_GET_PE_ADDR 0 /* Get PE addr */
+#define EEH_OPT_GET_PE_MODE 1 /* Get PE mode */
I assume that's just some leftover from the previous patches :-)
Don't respin just yet, let's see what other comments come
On Thu, 2014-06-05 at 17:25 +1000, Alexey Kardashevskiy wrote:
+This creates a virtual TCE (translation control entry) table, which
+is an IOMMU for PAPR-style virtual I/O. It is used to translate
+logical addresses used in virtual I/O into guest physical addresses,
+and provides a
On Thu, 2014-06-05 at 19:26 +1000, Alexey Kardashevskiy wrote:
No trees yet. For 64GB window we need (64<<30)/(16<<20)*8 = 32K TCE table.
Do we really need trees?
The above is assuming hugetlbfs backed guests. These are the least of my worry
indeed. But we need to deal with 4k and 64k guests.
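The arithmetic behind the 32K figure, and why 4k and 64k guests are the worry, can be checked with a small helper. This is a sketch only: `tce_table_bytes` is a hypothetical name, the 8 bytes per entry comes from the "*8" in the quoted calculation, and the 16 MB page size for the hugetlbfs case is an assumption read from that same calculation.

```c
#include <assert.h>
#include <stdint.h>

/* One TCE per IOMMU page, 8 bytes each (the "*8" in the quoted figure). */
#define TCE_ENTRY_BYTES 8ULL

/*
 * Size in bytes of a flat TCE table covering window_bytes of DMA space
 * with IOMMU pages of page_bytes each.
 */
static uint64_t tce_table_bytes(uint64_t window_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0 && window_bytes % page_bytes == 0);
	return window_bytes / page_bytes * TCE_ENTRY_BYTES;
}
```

For a 64 GB window this gives 32 KB with 16 MB pages, 8 MB with 64 KB pages and 128 MB with 4 KB pages, which is where a flat table stops being obviously cheap.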
On Thu, 2014-06-05 at 13:56 +0200, Alexander Graf wrote:
What if we ask user space to give us a pointer to user space allocated
memory along with the TCE registration? We would still ask user space to
only use the returned fd for TCE modifications, but would have some
nicely swappable
On Tue, 2014-06-03 at 09:45 +0200, Alexander Graf wrote:
For EEH it could as well be a dumb eventfd - really just a side
channel that can tell user space that something happened
asynchronously :).
Which the host kernel may have no way to detect without actively poking
at the device (fences in
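The "dumb eventfd" side channel mentioned in this message can be sketched with the Linux eventfd(2) call. The `notify_*` helper names are hypothetical, and both ends live in one process here; in the discussed design the producer would be the kernel and the consumer would be user space such as QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the notification channel; returns the fd, or -1 on error. */
static int notify_create(void)
{
	return eventfd(0, EFD_NONBLOCK);
}

/* Producer side: record that one more event happened. */
static int notify_signal(int fd)
{
	uint64_t one = 1;

	assert(fd >= 0);
	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side: return how many events were signalled since the last
 * drain and reset the counter. Returns 0 when nothing is pending (the
 * non-blocking read fails with EAGAIN in that case).
 */
static uint64_t notify_drain(int fd)
{
	uint64_t count = 0;

	assert(fd >= 0);
	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

The channel carries only "something happened" and a count. The consumer still has to query the actual state afterwards, which matches the reply's point that detection itself may require poking the device.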
On Thu, 2014-05-29 at 23:27 +0200, Alexander Graf wrote:
On 29.05.14 09:45, Michael Neuling wrote:
+/* Values for 2nd argument to H_SET_MODE */
+#define H_SET_MODE_RESOURCE_SET_CIABR 1
+#define H_SET_MODE_RESOURCE_SET_DAWR 2
+#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
On Wed, 2014-05-28 at 22:49 +1000, Gavin Shan wrote:
I will remove those address related macros in next revision because it's
user-level business, not related to host kernel any more.
If the user is QEMU + guest, we need the address to identify the PE though PHB
BUID could be used as same
On Thu, 2014-05-29 at 10:05 +1000, Gavin Shan wrote:
The log stuff is TBD and I'll figure it out later.
About to what are the errors, there are a lot. Most of them are related
to hardware level, for example unstable PCI link. Usually, those error
bits defined in AER fatal error state
On Tue, 2014-05-27 at 12:15 -0600, Alex Williamson wrote:
+/*
+ * Reset is the major step to recover problematic PE. The following
+ * command helps on that.
+ */
+struct vfio_eeh_pe_reset {
+ __u32 argsz;
+ __u32 flags;
+ __u32 option;
+#define
On Tue, 2014-05-27 at 14:37 -0600, Alex Williamson wrote:
The usual way is the driver asks for one or the other, this plumbs back
into the guest EEH code which itself plumbs into the PCIe error recovery
framework in Linux.
So magic?
Yes. The driver is expected to more or less knows what
On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
There's no notification, the user needs to observe the return value and
poll? Should we be enabling an eventfd to notify the user of the state
change?
Yes. The user needs to monitor the return value. we should have one
notification,
On Wed, 2014-05-21 at 08:23 +0200, Alexander Graf wrote:
Note to Alex: This definitely kills the notifier idea for now though,
at least as a first class citizen of the design. We can add it as an
optional optimization on top later.
I don't think it does. The notifier would just get
On Wed, 2014-05-21 at 15:07 +0200, Alexander Graf wrote:
+#ifdef CONFIG_VFIO_PCI_EEH
+int eeh_vfio_open(struct pci_dev *pdev)
Why vfio? Also that config option will not be set if vfio is compiled as
a module.
+{
+ struct eeh_dev *edev;
+
+ /* No PCI device ? */
+ if
On Tue, 2014-05-20 at 21:56 +1000, Gavin Shan wrote:
.../...
I think what you want is an irqfd that the in-kernel eeh code
notifies when it sees a failure. When such an fd exists, the kernel
skips its own error handling.
Yeah, it's a good idea and something for me to improve in phase
On Tue, 2014-05-20 at 15:49 +0200, Alexander Graf wrote:
Instead of
if (passed_flag)
return;
you would do
if (trigger_irqfd) {
trigger_irqfd();
return;
}
which would be a much nicer, generic interface.
But that's not how PAPR works.
Cheers,
Ben.
On Tue, 2014-05-20 at 15:49 +0200, Alexander Graf wrote:
So how about we just implement this whole thing properly as irqfd?
Whether QEMU can actually do anything with the interrupt is a different
question - we can leave it be for now. But we could model all the code
with the assumption that
On Tue, 2014-05-20 at 14:25 +0200, Alexander Graf wrote:
- Move eeh-vfio.c to drivers/vfio/pci/
- From eeh-vfio.c, dereference arch/powerpc/kernel/eeh.c::eeh_ops, which
is arch/powerpc/platforms/powernv/eeh-powernv.c::powernv_eeh_ops. Call
Hrm, I think it'd be nicer to just export
On Tue, 2014-05-20 at 22:39 +1000, Gavin Shan wrote:
Yeah. How about this? :-)
- Move eeh-vfio.c to drivers/vfio/pci/
- From eeh-vfio.c, dereference arch/powerpc/kernel/eeh.c::eeh_ops, which
is arch/powerpc/platforms/powernv/eeh-powernv.c::powernv_eeh_ops. Call
Hrm, I think it'd be
On Mon, 2014-05-19 at 14:46 +0200, Alexander Graf wrote:
I don't see the point of VFIO knowing about guest addresses. They are
not unique across a system and the whole idea that a VFIO device has to
be owned by a guest is also pretty dubious.
I suppose what you really care about here is
On Mon, 2014-05-19 at 14:55 +0200, Alexander Graf wrote:
On 14.05.14 06:12, Gavin Shan wrote:
Originally, syscall ppc_rtas() can be used to invoke RTAS call from
user space. Utility errinjct is using it to inject various errors
to the system for testing purpose. The patch intends to extend
On Mon, 2014-05-19 at 15:04 +0200, Alexander Graf wrote:
On 14.05.14 06:12, Gavin Shan wrote:
The patch intends to implement the error injection infrastructure
for PowerNV platform. The predetermined handlers will be called
according to the type of injected error (e.g.
for those PCI devices,
+ * which have been passed through from host to guest via VFIO. So this
+ * file is naturally part of VFIO implementation on PowerNV platform.
+ *
+ * Copyright Benjamin Herrenschmidt Gavin Shan, IBM Corporation 2014.
+ *
+ * This program is free software; you
On Tue, 2014-05-06 at 08:56 +0200, Alexander Graf wrote:
For the error injection, I guess I have to put the logic token management
into QEMU and error injection request will be handled by QEMU and then
routed to host kernel via additional syscall as we did for pSeries.
Yes, start off
On Tue, 2014-05-06 at 09:05 +0200, Alexander Graf wrote:
On 06.05.14 02:06, Benjamin Herrenschmidt wrote:
On Mon, 2014-05-05 at 17:16 +0200, Alexander Graf wrote:
Isn't this a greater problem? We should start swapping before we hit
the point where non movable kernel allocation fails
On Tue, 2014-05-06 at 11:12 +0200, Alexander Graf wrote:
So if I understand this patch correctly, it simply introduces logic to
handle page sizes other than 4k, 64k, 16M by analyzing the actual page
size field in the HPTE. Mind to explain why exactly that enables us to
use THP?
What
On Tue, 2014-05-06 at 21:38 +0530, Aneesh Kumar K.V wrote:
I updated the commit message as below. Let me know if this is ok.
KVM: PPC: BOOK3S: HV: THP support for guest
This has nothing to do with THP.
THP support in guest depend on KVM advertising MPSS feature. We already
have
On Mon, 2014-05-05 at 19:56 +0530, Aneesh Kumar K.V wrote:
Paul mentioned that BOOK3S always had DAR value set on alignment
interrupt. And the patch is to enable/collect correct DAR value when
running with Little Endian PR guest. Now to limit the impact and to
enable Little Endian PR guest,
On Mon, 2014-05-05 at 16:43 +0200, Alexander Graf wrote:
Paul mentioned that BOOK3S always had DAR value set on alignment
interrupt. And the patch is to enable/collect correct DAR value when
running with Little Endian PR guest. Now to limit the impact and to
enable Little Endian PR guest,
On Fri, 2014-02-21 at 16:31 +0100, Laurent Dufour wrote:
This fix introduces the H_GET_TCE hypervisor call which is basically the
reverse of H_PUT_TCE, as defined in the Power Architecture Platform
Requirements (PAPR).
The hcall H_GET_TCE is required by the kdump kernel which is calling it
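The "reverse of H_PUT_TCE" relationship can be illustrated with a toy table. This is a model under assumptions, not the kernel implementation: the table size, the 4 KB page shift and the helper names are made up for the example, the return codes use the values I recall from the kernel's hvcall.h (0 and -4) and should be checked there, and alignment and permission handling are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Return codes; values assumed to match the kernel's hvcall.h. */
#define H_SUCCESS	0
#define H_PARAMETER	(-4)

/* Toy geometry: 16 entries, one per 4 KB I/O page (assumption). */
#define TCE_PAGE_SHIFT	12
#define TCE_ENTRIES	16

struct tce_table {
	uint64_t entries[TCE_ENTRIES];
};

/* Store a TCE for the I/O bus address ioba. */
static long h_put_tce(struct tce_table *t, uint64_t ioba, uint64_t tce)
{
	uint64_t idx = ioba >> TCE_PAGE_SHIFT;

	assert(t);
	if (idx >= TCE_ENTRIES)
		return H_PARAMETER;
	t->entries[idx] = tce;
	return H_SUCCESS;
}

/* Read back the TCE previously stored for ioba. */
static long h_get_tce(const struct tce_table *t, uint64_t ioba, uint64_t *tce)
{
	uint64_t idx = ioba >> TCE_PAGE_SHIFT;

	assert(t && tce);
	if (idx >= TCE_ENTRIES)
		return H_PARAMETER;
	*tce = t->entries[idx];
	return H_SUCCESS;
}
```

A caller that did not set up the table itself can walk it with `h_get_tce` to see which I/O pages are still mapped.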
On Wed, 2014-01-29 at 17:39 +0100, Alexander Graf wrote:
static inline unsigned long mfvtb(void)
{
#ifdef CONFIG_PPC_BOOK3S_64
	return mfspr(SPRN_VTB);
#else
	BUG();
#endif
}
is a lot easier to read and get right. But reg.h is Ben's call.
Agreed.
Also could you please give me a
On Tue, 2013-12-10 at 15:40 +0100, Alexander Graf wrote:
On 10.12.2013, at 15:21, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
wrote:
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
We already checked need_resched. So we can call schedule directly
Signed-off-by: Aneesh
On Tue, 2013-12-03 at 23:21 +0100, Andrea Arcangeli wrote:
#ifdef CONFIG_PPC_FSL_BOOK3E
hugepd_free(tlb, hugepte);
^^
This is the culprit
(Alex, you didn't specify this was embedded or did I miss it ?)
#else
pgtable_free_tlb(tlb, hugepte,
On Thu, 2013-11-07 at 09:14 +0100, Alexander Graf wrote:
And ? An explanation isn't going to be clearer than the code in that
case ...
It's pretty non-obvious when you do a git show on that patch in 1 year
from now, as the redundancy is out of scope of what the diff shows.
And ? How would
On Fri, 2013-11-08 at 04:10 +0100, Alexander Graf wrote:
On 08.11.2013, at 03:44, Liu Ping Fan kernelf...@gmail.com wrote:
syscall is a very common behavior inside guest, and this patch
optimizes the path for the emulation of BOOK3S_INTERRUPT_SYSCALL,
so hypervisor can return to guest
On Thu, 2013-11-07 at 08:52 +0100, Alexander Graf wrote:
Am 06.11.2013 um 20:58 schrieb Benjamin Herrenschmidt
b...@kernel.crashing.org:
On Wed, 2013-11-06 at 12:24 +0100, Alexander Graf wrote:
On 05.11.2013, at 08:42, Liu Ping Fan kernelf...@gmail.com wrote:
Signed-off-by: Liu Ping
On Sun, 2013-11-03 at 16:30 +0200, Gleb Natapov wrote:
On Thu, Oct 31, 2013 at 10:18:19PM +0100, Alexander Graf wrote:
From: Bharat Bhushan r65...@freescale.com
This way we can use same data type struct with KVM and
also help in using other debug related function.
Signed-off-by:
On Mon, 2013-11-04 at 07:43 +0100, Alexander Graf wrote:
Yeah, it's what Ben and me agreed on after I waited forever to get a
topic branch created. Oh well, I guess this time we just have to
manually resolve the conflicts and do a better job at communicating
next time.
That specific one was
On Thu, 2013-10-03 at 08:43 +0300, Gleb Natapov wrote:
Why it can be a bad idea? User can drain hwrng continuously making other
users of it much slower, or even worse, making them fall back to another
much less reliable, source of entropy.
Not in a very significant way, we generate entropy at
On Wed, 2013-10-02 at 10:46 +0200, Paolo Bonzini wrote:
Thanks. Any chance you can give some numbers of a kernel hypercall and
a userspace hypercall on Power, so we have actual data? For example a
hypercall that returns H_PARAMETER as soon as possible.
I don't have (yet) numbers at hand
On Wed, 2013-10-02 at 11:11 +0200, Alexander Graf wrote:
Right, and the difference for the patch in question is really whether
we handle it in kernel virtual mode or in QEMU, so the bulk of the
overhead (kicking threads out of guest context, switching MMU
context, etc) happens either way.
On Wed, 2013-10-02 at 13:02 +0300, Gleb Natapov wrote:
Yes, I alluded to it in my email to Paul and Paolo asked also. How is this
interface disabled? Also, the hwrng is MMIO on the host, so why does the guest
need to use a hypercall instead of emulating the device (in kernel or somewhere
else?).
Migration will
On Wed, 2013-10-02 at 13:02 +0300, Gleb Natapov wrote:
Yes, I alluded to it in my email to Paul and Paolo asked also. How is this
interface disabled? Also, the hwrng is MMIO on the host, so why does the guest
need to use a hypercall instead of emulating the device (in kernel or somewhere
else?). Another thing is
On Wed, 2013-10-02 at 17:10 +0300, Gleb Natapov wrote:
The hwrng is accessible by host userspace via /dev/mem.
Regular user has no access to /dev/mem, but he can start kvm guest and
gain access to the device.
Seriously. You guys are really trying hard to make our life hell or
what ? That
On Wed, 2013-10-02 at 17:37 +0300, Gleb Natapov wrote:
On Wed, Oct 02, 2013 at 04:33:18PM +0200, Paolo Bonzini wrote:
Il 02/10/2013 16:08, Alexander Graf ha scritto:
The hwrng is accessible by host userspace via /dev/mem.
A guest should live on the same permission level as a user
On Tue, 2013-10-01 at 11:39 +0300, Gleb Natapov wrote:
On Tue, Oct 01, 2013 at 06:34:26PM +1000, Michael Ellerman wrote:
On Thu, Sep 26, 2013 at 11:06:59AM +0200, Paolo Bonzini wrote:
Il 26/09/2013 08:31, Michael Ellerman ha scritto:
Some powernv systems include a hwrng. Guests can
On Tue, 2013-10-01 at 13:19 +0200, Paolo Bonzini wrote:
Il 01/10/2013 11:38, Benjamin Herrenschmidt ha scritto:
So for the sake of that dogma you are going to make us do something that
is about 100 times slower ? (and possibly involves more lines of code)
If it's 100 times slower
On Thu, 2013-09-26 at 16:31 +1000, Michael Ellerman wrote:
+ pr_info_once("registering arch random hook\n");
Either pr_debug or make it nicer looking :-)
Cheers,
Ben.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
On Thu, 2013-09-26 at 16:31 +1000, Michael Ellerman wrote:
+ pr_info("registered powernv hwrng.\n");
First letter of a line should get a capital :-) Also since
it's per-device, at least indicate the OF path or the chip number or
something ...
Cheers,
Ben.
early on, and set the thread priority to the medium level, so that
the interrupt handling code runs at a reasonable speed.
Signed-off-by: Paul Mackerras pau...@samba.org
Acked-by: Benjamin Herrenschmidt b...@kernel.crashing.org
Alex, can you take this via your tree ?
Cheers,
Ben
On Fri, 2013-09-13 at 10:17 +1000, Paul Mackerras wrote:
Aneesh and I are currently investigating an alternative approach,
which is much more like the x86 way of doing things. We are looking
at splitting the code into three modules: a kvm_pr.ko module with the
PR-specific bits, a kvm_hv.ko
On Tue, 2013-09-03 at 13:53 +0300, Gleb Natapov wrote:
Or supporting all IOMMU links (and leaving emulated stuff as is) in on
device is the last thing I have to do and then you'll ack the patch?
I am concerned more about API here. Internal implementation details I
leave to powerpc experts
On Tue, 2013-08-27 at 09:40 +0300, Gleb Natapov wrote:
Thanks. Since it's not in a topic branch that I can pull, I'm going to
just cherry-pick them. However, they are in your queue branch, not
next branch. Should I still assume this is a stable branch and that
the numbers aren't going to
On Tue, 2013-08-27 at 09:41 +0300, Gleb Natapov wrote:
Oh and Alexey mentions that there are two capabilities and you only
applied one :-)
Another one is:
[PATCH v8] KVM: PPC: reserve a capability and ioctl numbers for
realmode VFIO
?
Yes, thanks !
Cheers,
Ben.
On Mon, 2013-08-26 at 15:37 +0300, Gleb Natapov wrote:
Gleb, any chance you can put this (and the next one) into a tree to
lock in the numbers ?
Applied it. Sorry for the slow response, I was on vacation and am still going
through the email backlog.
Thanks. Since it's not in a topic branch that I
On Fri, 2013-08-23 at 09:01 +0530, Aneesh Kumar K.V wrote:
Alexander Graf ag...@suse.de writes:
On 22.08.2013, at 12:37, Aneesh Kumar K.V wrote:
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Isn't this you?
Yes. The patches are generated using git format-patch and sent by
On Wed, 2013-08-14 at 14:34 -0500, Scott Wood wrote:
On Wed, 2013-08-14 at 13:56 +0200, Alexander Graf wrote:
On 07.08.2013, at 04:05, Tiejun Chen wrote:
We enter with interrupts disabled in hardware, but we need to
call SOFT_DISABLE_INTS anyway to ensure that the software state
is
On Fri, 2013-08-02 at 17:58 -0500, Scott Wood wrote:
What about 64-bit PTEs on 32-bit kernels?
In any case, this code does not belong in KVM. It should be in the main
PPC mm code, even if KVM is the only user.
Also don't we do similar things in BookS KVM ? At the very least that
stuff
On Sat, 2013-08-03 at 02:58 +, Bhushan Bharat-R65777 wrote:
One of the problems I saw was that if I put this code in
asm/pgtable-32.h and asm/pgtable-64.h then pte_present() and other
friend functions (on which this code depends) are defined in pgtable.h.
And pgtable.h includes
On Sat, 2013-08-03 at 03:11 +, Bhushan Bharat-R65777 wrote:
Could you explain why we need to set dirty/referenced on the PTE, when we didn't
need to do that before? All we're getting from the PTE is wimg.
We have MMU notifiers to take care of the page being unmapped, and we've
On Fri, 2013-07-26 at 15:03 +, Bhushan Bharat-R65777 wrote:
Won't searching the Linux PTE be overkill?
That's the best approach. Also we are searching it already to resolve
the page fault. That does mean we search twice but on the other hand
that also means it's hot in the cache.
On Wed, 2013-07-24 at 15:43 -0700, Andrew Morton wrote:
For what? The three lines of comment in page-flags.h? ack :)
Manipulating page->_count directly is considered poor form. Don't
blame us if we break your code ;)
Actually, the manipulation in realmode_get_page() duplicates the
On Mon, 2013-07-15 at 10:20 +0800, tiejun.chen wrote:
What about SOFT_IRQ_DISABLE? This is close to name hard_irq_disable() :) And
then remove all DISABLE_INTS as well?
Or RECONCILE_IRQ_STATE...
Ben.
On Fri, 2013-07-12 at 12:50 -0500, Scott Wood wrote:
[1] SOFT_DISABLE_INTS seems an odd name for something that updates the
software state to be consistent with interrupts being *hard* disabled.
I can sort of see the logic in it, but it's confusing when first
encountered. From the
On Thu, 2013-07-11 at 11:49 +0200, Alexander Graf wrote:
Ben, is soft_enabled == 0; hard_enabled == 1 a valid combination that
may ever occur?
Yes of course, that's what we call soft disabled :-) It's even the
whole point of doing lazy disable...
Ben.
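The soft-disabled state discussed in this exchange (soft_enabled == 0 while hard_enabled == 1) can be modelled in plain C. This is a user-space toy under assumptions: `struct irq_model` and the `model_*` functions are hypothetical names, the flags only stand in for the PACA soft-enable field and the hardware enable bit, and the real kernel paths are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

struct irq_model {
	bool soft_enabled;	/* what kernel code believes */
	bool hard_enabled;	/* what the CPU would actually allow */
	bool pending;		/* an interrupt arrived while disabled */
	int handled;		/* interrupts whose handler has run */
};

static void model_init(struct irq_model *m)
{
	assert(m);
	m->soft_enabled = true;
	m->hard_enabled = true;
	m->pending = false;
	m->handled = 0;
}

/* Lazy disable: flip only the software flag, leave the hardware alone. */
static void model_irq_disable(struct irq_model *m)
{
	assert(m);
	m->soft_enabled = false;
}

/* The hardware raises an interrupt. */
static void model_interrupt(struct irq_model *m)
{
	assert(m);
	if (!m->hard_enabled || !m->soft_enabled) {
		/* Remember it and hard-disable so it cannot fire again. */
		m->pending = true;
		m->hard_enabled = false;
		return;
	}
	m->handled++;
}

/* Re-enable: restore both flags and replay anything that was deferred. */
static void model_irq_enable(struct irq_model *m)
{
	assert(m);
	m->soft_enabled = true;
	m->hard_enabled = true;
	if (m->pending) {
		m->pending = false;
		m->handled++;
	}
}
```

In this model the cheap path (disable, then enable with no interrupt in between) never touches the hardware flag, which is the point of doing the disable lazily.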
On Thu, 2013-07-11 at 11:52 +0200, Alexander Graf wrote:
Where exactly (it is rather SPAPR_TCE_IOMMU but does not really
matter)?
Select it on KVM_BOOK3S_64? CONFIG_KVM_BOOK3S_64_HV?
CONFIG_KVM_BOOK3S_64_PR? PPC_BOOK3S_64?
I'd say the most logical choice would be to check the Makefile
On Thu, 2013-07-11 at 13:15 +0200, Alexander Graf wrote:
There are 2 ways of dealing with this:
1) Call the ENABLE_CAP on every vcpu. That way one CPU may handle
this hypercall in the kernel while another one may not. The same as we
handle PAPR today.
2) Create a new ENABLE_CAP for
On Thu, 2013-07-11 at 14:47 +0200, Alexander Graf wrote:
Yes of course, that's what we call soft disabled :-) It's even the
whole point of doing lazy disable...
Meh. Of course it's soft_enabled = 1; hard_enabled = 0.
That doesn't happen in normal C code. It happens under very specific
On Thu, 2013-07-11 at 14:50 +0200, Alexander Graf wrote:
Not really no. But that would do. You could have give a more useful
answer in the first place though rather than stringing him along.
Sorry, I figured it was obvious.
It wasn't no, because of the mess with modules and the nasty
On Thu, 2013-07-11 at 14:51 +0200, Alexander Graf wrote:
I don't like bloat usually. But Alexey even had an #ifdef DEBUG in
there to selectively disable in-kernel handling of multi-TCE. Not
calling ENABLE_CAP would give him exactly that without ugly #ifdefs in
the kernel.
I don't see much
On Thu, 2013-07-11 at 15:12 +1000, Alexey Kardashevskiy wrote:
Any debug code is prohibited? Ok, I'll remove.
Debug code that requires code changes is prohibited, yes.
Debug code that is runtime switchable (pr_debug, trace points, etc)
are allowed.
Bollox.
$ grep DBG\( arch/powerpc/
On Thu, 2013-07-11 at 12:11 +0200, Alexander Graf wrote:
So I must add one more ioctl to enable MULTITCE in kernel handling. Is it
what you are saying?
I can see KVM_CHECK_EXTENSION but I do not see KVM_ENABLE_EXTENSION or
anything like that.
KVM_ENABLE_CAP. It's how we enable sPAPR
On Thu, 2013-07-11 at 15:07 +0200, Alexander Graf wrote:
Ok, let me quickly explain the problem.
We are leaving host context, switching slowly into guest context.
During that transition we call get_paca() indirectly (apparently by
another call to hard_disable() which sounds bogus, but that's
On Thu, 2013-07-11 at 11:18 -0500, Scott Wood wrote:
If we set IRQs as soft-disabled prior to calling hard_irq_disable(),
then hard_irq_disable() will fail to call trace_hardirqs_off().
Sure because setting them as soft-disabled will have done it.
However by doing so, you also create the
On Fri, 2013-07-12 at 10:13 +0800, tiejun.chen wrote:
#define hard_irq_disable() do { \
u8 _was_enabled = get_paca()->soft_enabled; \
Current problem I met is issued from the above line.
__hard_irq_disable(); \
-
On Wed, 2013-07-10 at 12:33 +0200, Alexander Graf wrote:
It's not exactly obvious that you're calling it with writing == 1 :).
Can you create a new local variable is_write in the calling
function, set that to 1 before the call to get_user_pages_fast and
pass it in instead of the 1? The
On Thu, 2013-07-11 at 00:57 +0200, Alexander Graf wrote:
#ifdef CONFIG_PPC64
+ /*
+ * To avoid races, the caller must have gone directly from having
+ * interrupts fully-enabled to hard-disabled.
+ */
+ WARN_ON(local_paca->irq_happened != PACA_IRQ_HARD_DIS);
On Tue, 2013-07-09 at 18:02 +0200, Alexander Graf wrote:
On 07/06/2013 05:07 PM, Alexey Kardashevskiy wrote:
The existing TCE machine calls (tce_build and tce_free) only support
virtual mode as they call __raw_writeq for TCE invalidation, which
fails in real mode.
This introduces
On Sun, 2013-07-07 at 01:07 +1000, Alexey Kardashevskiy wrote:
The current VFIO-on-POWER implementation supports only user mode
driven mapping, i.e. QEMU is sending requests to map/unmap pages.
However this approach is really slow, so we want to move that to KVM.
Since H_PUT_TCE can be
On Thu, 2013-07-04 at 06:47 +, Caraman Mihai Claudiu-B02008 wrote:
This is a solid reason. Ben it's ok for you to apply the combined
patch? If so I will respin it.
Sure, but nowadays, all that stuff goes via Scott and Alex.
Cheers,
Ben.
On Tue, 2013-07-02 at 17:12 +0200, Alexander Graf wrote:
Is CMA a mandatory option in the kernel? Or can it be optionally
disabled? If it can be disabled, we should keep the preallocated
fallback case around for systems that have CMA disabled.
Why ? More junk code to keep around ...
If CMA
On Thu, 2013-06-27 at 16:59 +1000, Stephen Rothwell wrote:
+/* Allows an external user (for example, KVM) to unlock an IOMMU group */
+static void vfio_group_del_external_user(struct file *filep)
+{
+ struct vfio_group *group = filep->private_data;
+
+ BUG_ON(filep->f_op !=
On Mon, 2013-06-24 at 13:54 +1000, David Gibson wrote:
DDW means an API by which the guest can request the creation of
additional iommus for a given device (typically, in addition to the
default smallish 32-bit one using 4k pages, the guest can request
a larger window in 64-bit space using