Several types of PE are supported for now: PHB, bus and device
dependent PEs. For a PCI bus dependent PE, tracing the
corresponding PCI bus from the PE (struct eeh_pe) makes the code
more efficient. The patch also enables retrieval of the PCI bus based
on the PCI bus dependent PE.
During EEH recovery, the PCI devices of the problematic PE
should be removed and then added to the system again. During this
so-called hotplug event, the PCI devices of the problematic PE
are probed in an early or late phase. We delay the EEH probe
to the late point on the PowerNV platform since
The patch adds a new EEH operation, post_init. It is used to notify
the platform that the EEH core has completed the EEH probe. After that,
the PowerNV platform starts to use the services supplied by the EEH
functionality.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/include/asm/eeh.h |
The patch adds an I/O chip backend to retrieve the state of the
indicated PE. While the PE state is temporarily unavailable,
the upper layer (powernv platform) should return the default delay
(1 second).
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/eeh-ioda.c
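The layering described in this snippet can be sketched roughly as follows. This is only an illustration, not the code from the patch: the enum, the function names and the way the firmware answer is passed in are all hypothetical, and the only detail taken from the text is the 1 second default delay.

```c
#include <stdbool.h>

/* Hypothetical state codes; the real EEH core defines its own flags. */
enum pe_state_sketch {
	PE_STATE_NORMAL,
	PE_STATE_FROZEN,
	PE_STATE_UNAVAILABLE	/* state temporarily unavailable */
};

#define DEFAULT_DELAY_MS 1000	/* the 1 second default from the text */

/*
 * Hypothetical stand-in for the I/O chip backend. The "firmware answer"
 * is passed in as plain booleans so the sketch can run in user space.
 */
static enum pe_state_sketch chip_get_state(bool fw_ready, bool frozen)
{
	if (!fw_ready)
		return PE_STATE_UNAVAILABLE;
	return frozen ? PE_STATE_FROZEN : PE_STATE_NORMAL;
}

/*
 * Hypothetical platform layer: forward the chip state and, if it is
 * temporarily unavailable, report how long the caller should wait.
 */
static enum pe_state_sketch platform_get_state(bool fw_ready, bool frozen,
					       int *delay_ms)
{
	enum pe_state_sketch state = chip_get_state(fw_ready, frozen);

	if (delay_ms)
		*delay_ms = (state == PE_STATE_UNAVAILABLE) ?
			    DEFAULT_DELAY_MS : 0;
	return state;
}
```

The point of the split is that only the platform layer knows about the default delay; the chip backend merely reports what it can see.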
On the PowerNV platform, we might run into the situation where subsequent
events are duplicates of a former one that is still being processed.
For that case, we need the function implemented by the patch to purge
EEH events accordingly.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
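A minimal sketch of what purging duplicate events could look like, assuming a simple array-backed queue keyed by a PE number. The structures and the function name here are hypothetical and chosen for illustration; the actual patch works on the kernel's own event list.

```c
#include <stddef.h>

#define MAX_EVENTS 16

/* Hypothetical event: only the PE it refers to matters for this sketch. */
struct event_sketch {
	int pe_id;
};

/* Hypothetical queue of pending events, oldest first. */
struct event_queue_sketch {
	struct event_sketch events[MAX_EVENTS];
	size_t count;
};

/*
 * Drop every queued event that refers to pe_id, keeping the order of
 * the remaining events. Returns the number of events purged.
 */
static size_t purge_events(struct event_queue_sketch *q, int pe_id)
{
	size_t kept = 0;
	size_t purged;
	size_t i;

	for (i = 0; i < q->count; i++) {
		if (q->events[i].pe_id != pe_id)
			q->events[kept++] = q->events[i];
	}

	purged = q->count - kept;
	q->count = kept;
	return purged;
}
```

In a real kernel implementation the queue would be protected by a lock while it is walked; that is left out here to keep the sketch short.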
The patch implements the backend for the EEH core to retrieve the next
EEH error to handle. Informational errors are not passed on to
the EEH core. Otherwise, the EEH core should take appropriate action
depending on the return value:
0 - No further errors detected
1 - Frozen PE
2
The patch adds EEH backends for the PowerNV platform. Note that
some of those EEH backends call into the I/O chip dependent backends.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/Makefile |2 +-
arch/powerpc/platforms/powernv/eeh-powernv.c |
The patch adds backends to retrieve error log and configure p2p
bridges for the indicated PE.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/eeh-ioda.c | 57 -
1 files changed, 55 insertions(+), 2 deletions(-)
diff --git
This patch implements a notifier to receive a notification on OPAL
event mask changes. The notifier is only called as a result of an OPAL
interrupt, which will happen upon reception of FSP messages or PCI errors.
Any event mask change detected as a result of opal_poll_events() will not
result in a
The patch registers an OPAL event notifier and processes the PCI errors
from firmware. If we have pending PCI errors, a special EEH event
(without a bound PE) will be sent to the EEH core for processing.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/eeh-ioda.c |
One of the possible cases indicated by a P7IOC interrupt is a fenced
PHB. For that case, we need to fetch the PE corresponding to the PHB,
disable the PHB and all subordinate PCI buses/devices, recover
from the fenced state and eventually enable the whole PHB. We need
one function to fetch the PHB PE
While we're restarting or powering off the system, we no longer need
the OPAL notifier, so just disable it.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/setup.c |4
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git
We possibly have multiple kthreads running for multiple EEH errors
(events) and use one spinlock to serialize the handling of those
EEH events. That's unnecessary, and the patch creates only
one kthread, which is started during EEH core initialization in
eeh_init(). A new
The patch adds the I/O chip backend to do PE reset. For now, we
focus on PCI bus dependent PEs. If the PHB PE has been put into an error
state, the PHB takes a complete reset. Besides, the root bridge
takes a fundamental or hot reset accordingly if the indicated
PE is located at the top of the PCI
An EEH event is created and queued to the event queue for each
incoming EEH error. When there are multiple EEH errors, we need to
serialize the processing to keep the PE state (flags) consistent. The spinlock
confirm_error_lock was introduced for that purpose. We'll inject
EEH event upon error reporting
The patch synchronizes OPAL APIs between kernel and firmware. Also,
we start to replace opal_pci_get_phb_diag_data() with the similar
opal_pci_get_phb_diag_data2(), and the former OPAL API will return
OPAL_UNSUPPORTED from now on.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
For EEH on PowerNV platform, we will do EEH probe based on the
real PCI devices. The PCI devices are available after PCI probe.
So we have to call eeh_init() explicitly on PowerNV platform
after PCI probe. The patch also does EEH probe for PowerNV platform
in eeh_init().
Signed-off-by: Gavin Shan
The patch adds the backend to enable or disable EEH functionality
for the specified PE. The backend is also used to enable the MMIO or
DMA path for the problematic PE. Note that all PEs on the
PowerNV platform support EEH functionality by default, and we
do not allow disabling EEH for the specific
On the PowerNV platform, an EEH event caused by an interrupt won't have
a bound PE. The patch enables the EEH core to handle this special event.
To avoid touching the current logic, eeh_handle_event() is renamed
to eeh_handle_normal_event(), and eeh_handle_special_event() is
introduced. The function
The patch initializes EEH for the PowerNV platform. Because the OPAL
APIs require the HUB ID, we need to trace it through struct pnv_phb.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/pci-ioda.c | 16 +---
The patch creates one debugfs directory (powerpc/PCI) for
each PHB so that we can hook the EEH error injection debugfs entry
there in a following patch.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/pci-ioda.c | 22 ++
The patch creates debugfs entries (powerpc/PCI/err_injct) for
injecting EEH errors for testing purposes.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/platforms/powernv/eeh-ioda.c | 31 +
1 files changed, 31 insertions(+), 0 deletions(-)
The post initialization (struct eeh_ops::post_init) is called after
the EEH probe is done. On the other hand, the EEH core post
initialization is designed to call the platform and then the I/O chip
backend on the PowerNV platform.
The patch adds the backend for I/O chip to notify the platform
that the
The patch enables the EEH check and lets the EEH core process EEH
errors on the PowerNV platform while accessing config space. Originally,
the implementation already had a mechanism to check for EEH errors and
tried to recover from them. However, we never let the EEH core handle
the EEH errors.
We do not expect one specific PE to be frozen more than 5
times in the last hour. Otherwise, the PE will be removed from the
system upon newly arriving EEH errors. The patch introduces a time
stamp to track the first error on a specific PE in the last hour and a
function to update it accordingly. Besides,
While moving the EEH core from the pSeries platform directory to
arch/powerpc/kernel (in the previous patch), there were lots of
coding style complaints from git show. The patch
fixes them.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/kernel/eeh.c|
On Tue, Jun 18, 2013 at 04:33:24PM +0800, Gavin Shan wrote:
Hi Ben,
I am resending the whole series of patches because there are some newly
added patches in it.
Initially, the series of patches was built on 3.10.RC1, and the patchset
doesn't intend to enable EEH functionality for PHB3 for now.
For EEH on the PowerNV platform, the overall architecture is different
from that on the pSeries platform. In order to support multiple I/O chips
in the future, we split EEH into 3 layers for the PowerNV platform: EEH core,
platform layer, I/O layer. It would give EEH implementation on PowerNV
platform much more
It's meaningless to handle a frozen PE if we already have a fenced PHB.
The patch checks the PHB state before checking the PE. If the
PHB has been put into the fenced state, we need to take care of that first.
Signed-off-by: Gavin Shan sha...@linux.vnet.ibm.com
---
arch/powerpc/kernel/eeh.c | 60
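The ordering described in this snippet can be illustrated with a small sketch. The types, field names and return codes below are hypothetical and are not taken from arch/powerpc/kernel/eeh.c; the only point carried over from the text is that the fenced-PHB check comes before the frozen-PE check.

```c
#include <stdbool.h>

/* Hypothetical result codes for the check. */
enum eeh_check_result_sketch {
	CHECK_NO_ERROR,
	CHECK_PHB_FENCED,
	CHECK_PE_FROZEN
};

/* Hypothetical, heavily simplified PHB and PE descriptions. */
struct phb_sketch {
	bool fenced;
};

struct pe_sketch {
	struct phb_sketch *phb;	/* PHB this PE sits under */
	bool frozen;
};

/*
 * Check the PHB first: if the whole PHB is fenced, the state of an
 * individual PE underneath it is not worth looking at yet.
 */
static enum eeh_check_result_sketch check_failure(const struct pe_sketch *pe)
{
	if (pe->phb && pe->phb->fenced)
		return CHECK_PHB_FENCED;
	if (pe->frozen)
		return CHECK_PE_FROZEN;
	return CHECK_NO_ERROR;
}
```

With this ordering a caller that sees a fenced PHB can recover the PHB as a whole before any per-PE handling is attempted.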
On Tue, Jun 18, 2013 at 4:09 AM, Mike Qiu qiud...@linux.vnet.ibm.com wrote:
On 2013/6/10 8:49, Grant Likely wrote:
Originally, irq_domain_associate_many() was designed to unwind the
mapped irqs on a failure of any individual association. However, that
proved to be a problem with certain IRQ
On Tue, Jun 18, 2013 at 2:25 AM, Michael Neuling mi...@neuling.org wrote:
Michael Neuling mi...@neuling.org wrote:
Grant,
In next-20130617 we are getting the below crash on POWER7. Bisecting,
points to this patch (d39046ec72 in next)
Also, reverting just d39046ec72 fixes the crash in
Grant Likely grant.lik...@linaro.org wrote:
On Tue, 18 Jun 2013 10:05:31 +0100, Grant Likely grant.lik...@linaro.org
wrote:
On Tue, Jun 18, 2013 at 2:25 AM, Michael Neuling mi...@neuling.org wrote:
Michael Neuling mi...@neuling.org wrote:
Grant,
In next-20130617 we are getting
On Tue, Jun 18, 2013 at 12:04 PM, Michael Neuling mi...@neuling.org wrote:
Grant Likely grant.lik...@linaro.org wrote:
On Tue, 18 Jun 2013 10:05:31 +0100, Grant Likely grant.lik...@linaro.org
wrote:
On Tue, Jun 18, 2013 at 2:25 AM, Michael Neuling mi...@neuling.org wrote:
Michael Neuling
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com writes:
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Book3E uses the hugepd at PMD level and doesn't encode pte directly
at the pmd level. So it will find the lower bits of pmd set
and the pmd_bad check throws an error. In fact the current
On Tue, 2013-06-18 at 14:38 +1000, Benjamin Herrenschmidt wrote:
On Mon, 2013-06-17 at 20:32 -0600, Alex Williamson wrote:
Right, we don't want to create dependencies across modules. I don't
have a vision for how this should work. This is effectively a complete
side-band to vfio, so
On 06/17/2013 09:49:19 PM, Lian Minghuan-b31939 wrote:
On 06/18/2013 08:42 AM, Scott Wood wrote:
On 06/17/2013 07:28:07 PM, Scott Wood wrote:
On 06/17/2013 12:07:41 AM, Lian Minghuan-b31939 wrote:
+compatible = "fsl,mpic-msi";
+reg = <0x41600 0x200 0x44140 4>;
Why 0x200?
[Minghuan] The
On 06/17/2013 10:10:17 PM, Lian Minghuan-b31939 wrote:
Hi Scott,
please see my comments inline.
On 06/18/2013 08:18 AM, Scott Wood wrote:
On 06/17/2013 12:36:50 AM, Lian Minghuan-b31939 wrote:
Hi Scott,
please see my comments inline.
On 06/15/2013 06:13 AM, Scott Wood wrote:
On 06/14/2013
On Tue, Jun 18, 2013 at 2:10 AM, Scott Wood scottw...@freescale.com wrote:
On 06/17/2013 08:15:33 AM, Rojhalat Ibrahim wrote:
On Friday 14 June 2013 15:18:03 Scott Wood wrote:
On 83xx:
cc1: warnings being treated as errors
Benjamin Herrenschmidt b...@au1.ibm.com writes:
On Wed, 2013-06-05 at 20:58 +0530, Aneesh Kumar K.V wrote:
This is the second patchset needed to support THP on ppc64. Some of
the changes included in this series are tricky in that it changes the
powerpc linux page table walk subtly. We also
On 06/17/2013 09:34:49 PM, Lian Minghuan-b31939 wrote:
Hi Scott,
please see my comments inline.
On 06/18/2013 08:15 AM, Scott Wood wrote:
On 06/16/2013 10:00:01 PM, Lian Minghuan-b31939 wrote:
Hi Scott,
please see my comments inline.
On 06/15/2013 06:09 AM, Scott Wood wrote:
On 06/14/2013
Benjamin Herrenschmidt b...@kernel.crashing.org writes:
On Sun, 2013-06-16 at 13:37 +1000, Benjamin Herrenschmidt wrote:
On Sun, 2013-06-16 at 12:00 +1000, Benjamin Herrenschmidt wrote:
So at this point, hash_page might *still* see the old pmd. Unless I
missed something, you did nothing
This fixes a regression that causes 83xx to oops on boot if a
non-express PCI bus is present.
The following changes since commit 17858ca65eef148d335ffd4cfc09228a1c1cbfb5:
Merge tag 'please-pull-fixia64' of
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux (2013-06-18 06:29:19
-1000)
On 13-06-17 04:10 PM, Paul Gortmaker wrote:
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 (x86: Fix bit corruption at CPU
On Tue, 2013-06-18 at 08:48 -0600, Alex Williamson wrote:
On Tue, 2013-06-18 at 14:38 +1000, Benjamin Herrenschmidt wrote:
On Mon, 2013-06-17 at 20:32 -0600, Alex Williamson wrote:
Right, we don't want to create dependencies across modules. I don't
have a vision for how this should
On Wed, 2013-06-19 at 00:16 +0530, Aneesh Kumar K.V wrote:
But will that be anonymous memory? I.e., will we find them suitable for
THP allocation?
The 4k pages themselves with 4k_PFN no, but the segment yes. A single of
these will demote the whole segment, ie 256M or 1T.
* If you find a
On 06/16/2013 02:39 PM, Benjamin Herrenschmidt wrote:
static pte_t kvmppc_lookup_pte(pgd_t *pgdir, unsigned long hva, bool
writing,
-unsigned long *pte_sizep)
+unsigned long *pte_sizep, bool do_get_page)
{
pte_t *ptep;
unsigned int shift
Benjamin Herrenschmidt b...@kernel.crashing.org writes:
On Wed, 2013-06-19 at 00:16 +0530, Aneesh Kumar K.V wrote:
But will that be anonymous memory? I.e., will we find them suitable for
THP allocation?
The 4k pages themselves with 4k_PFN no, but the segment yes. A single of
these will
Alex Williamson alex.william...@redhat.com writes:
On Mon, 2013-06-17 at 13:56 +1000, Benjamin Herrenschmidt wrote:
On Sun, 2013-06-16 at 21:13 -0600, Alex Williamson wrote:
IOMMU groups themselves don't provide security, they're accessed by
interfaces like VFIO, which provide the
On Mon, Jun 17, 2013 at 05:42:13PM +1000, Michael Ellerman wrote:
On Sat, Jun 15, 2013 at 12:02:21PM +1000, Benjamin Herrenschmidt wrote:
On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
I was pretty much able to reproduce this on my PA Semi PPC box. Funny
thing is, when I type on
Sukadev Bhattiprolu suka...@linux.vnet.ibm.com wrote:
From 9f1a8a16e0ef36447e343d1cd4797c2b6a81225f Mon Sep 17 00:00:00 2001
From: Sukadev Bhattiprolu suka...@linux.vnet.ibm.com
Date: Fri, 7 Jun 2013 13:26:31 -0700
Subject: [PATCH 2/2] perf: Add support for the mem_xlvl field.
A follow-on
Suka,
One of these two patches breaks pmac32_defconfig and I suspect all other
32 bit configs (against mainline)
arch/powerpc/perf/core-book3s.c: In function 'record_and_restart':
arch/powerpc/perf/core-book3s.c:1632:4: error: passing argument 1 of
'ppmu->get_mem_data_src' from incompatible
On Wed, 2013-06-19 at 13:05 +0930, Rusty Russell wrote:
symbol_get() won't try to load a module; it'll just fail. This is what
you want, since they must have vfio in the kernel to get a valid fd...
Ok, cool. I suppose what we want here Alexey is slightly higher level,
something like:
Michael Neuling [mi...@neuling.org] wrote:
| Suka,
|
| One of these two patches breaks pmac32_defconfig and I suspect all other
| 32 bit configs (against mainline)
|
| arch/powerpc/perf/core-book3s.c: In function 'record_and_restart':
| arch/powerpc/perf/core-book3s.c:1632:4: error: passing
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:
From: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
Book3E uses the hugepd at PMD level and doesn't encode pte directly
at the pmd level. So it will find the lower bits of pmd set
and the pmd_bad check throws an error. In fact the current
54 matches