Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Benjamin Herrenschmidt
On Thu, 2013-07-25 at 15:51 +1000, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 13:24 +0800, Chen Gang wrote:
  For an extern function, if the performance is not sensible, better to
  have the return value which can indicate the failure with the negative
  number.
 
 The return value is meaningless.
 
 We don't have a good way to handle it. It has no defined semantics. What
 does failure means in that case ? Nothing !
 
 So just remove it.

Note: If you want to create a concept of smp_ops->probe() failing, then
not only do you need to check all the implementations, but *also* add
something sensible to do when it fails ... such as disabling bringup of
CPUs.

In this case however, we have put the burden of doing whatever makes
sense in the probe() function itself. It can adjust the possible map if
it fails.
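
A minimal userspace sketch of that arrangement, not the kernel code: `possible_mask`, `platform_setup_ipis()` and `example_smp_probe()` are hypothetical stand-ins, and a plain bitmask is assumed in place of the kernel's possible-CPU map. It shows a probe that returns nothing and, on failure, shrinks the possible map to the boot CPU so that the caller has nothing to check.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the possible-CPU map: bit N set means CPU N may be brought up. */
static unsigned long possible_mask = 0xfUL;	/* CPUs 0..3 */
static const int boot_cpu = 0;

/* Hypothetical platform step that can fail, e.g. no IPI hardware found. */
static bool platform_setup_ipis(bool hw_present)
{
	return hw_present;
}

/*
 * The probe returns void. On failure it degrades the system itself by
 * leaving only the boot CPU in the possible map.
 */
void example_smp_probe(bool hw_present)
{
	if (!platform_setup_ipis(hw_present)) {
		fprintf(stderr, "smp: probe failed, staying on boot CPU\n");
		possible_mask = 1UL << boot_cpu;
	}
}

/* Number of CPUs that later bring-up code would still consider. */
int count_possible_cpus(void)
{
	int n = 0;

	for (unsigned long m = possible_mask; m; m >>= 1)
		n += (int)(m & 1UL);
	return n;
}
```

After `example_smp_probe(false)`, `count_possible_cpus()` is 1, so later bring-up code simply finds no secondary CPUs to start and the system keeps running on the boot CPU.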

Cheers,
Ben.


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: Inbound PCI and Memory Corruption

2013-07-25 Thread Peter LaDow
On Wed, Jul 24, 2013 at 3:08 PM, Benjamin Herrenschmidt
b...@au1.ibm.com wrote:
 No, they resolve to the same thing under the hood. Did you do other
 changes ? Could it be another unrelated kernel bug causing something
 like use-after-free of network buffer or similar oddity unrelated to the
 network driver ?

There are other items, such as drivers for our custom hardware modules
implemented on the FPGA.  Perhaps I'll pull our drivers and run a
stock kernel.  Maybe a stock 83xx configuration (such as the
MPC8349E-MITX).  If we have problems even on a stock configuration...

 Have you tried with different kernel versions ?

Funny you mention it.  I just tried 3.10.2 today and we still get the
same memory corruption.  I was hoping that perhaps something had
changed between 3.0 and 3.10 that might clear up the problem, and then
I could bisect to find where it failed.  But unfortunately, 3.10.2
exhibits the same issue.

So clearly this isn't an issue specific to the kernel version.  Though
the e1000 driver looks largely unchanged in 3.10.  So if the problem
is driver related, it would still be there.

Thanks,
Pete


Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Chen Gang
On 07/25/2013 01:51 PM, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 13:24 +0800, Chen Gang wrote:
 For an extern function, if the performance is not sensible, better to
 have the return value which can indicate the failure with the negative
 number.
 
 The return value is meaningless.
 
 We don't have a good way to handle it. It has no defined semantics. What
 does failure means in that case ? Nothing !
 
 So just remove it.
 

Hmm... for an extern function (especially one that has been implemented in
various modules), normally we can assume it may fail in some cases
(although right now we don't know which cases can cause it to fail).

If we don't have a good way to handle the failure, printing a
warning message is a workable choice (or BUG_ON(), if it is critical).

So, if performance is not a concern, I still suggest letting the extern
function have a return value.


Thanks.
-- 
Chen Gang


Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Chen Gang
On 07/25/2013 02:03 PM, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 15:51 +1000, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 13:24 +0800, Chen Gang wrote:
 For an extern function, if the performance is not sensible, better to
 have the return value which can indicate the failure with the negative
 number.

 The return value is meaningless.

 We don't have a good way to handle it. It has no defined semantics. What
 does failure means in that case ? Nothing !

 So just remove it.
 
 Note: If you want to create a concept of smp_ops->probe() failing, then
 not only you need to check all the implementations, but *also* add
 something sensible to do when it fails ... such as disabling bringup of
 CPUs.
 

Hmm... if critical, use BUG(), else (non-critical), just print a
warning message ?

 In this case however, we have put the burden of doing whatever makes
 sense in the probe() function itself. If can adjust the possible map if
 it fails.
 

Excuse me, my English is not very good. I guess your meaning is: it can
fail in its internal implementation, but that has no effect on the final
result for the caller. Is that correct ?

If my understanding is correct, the caller need not know about it.


Thanks.
-- 
Chen Gang


Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Benjamin Herrenschmidt
On Thu, 2013-07-25 at 14:17 +0800, Chen Gang wrote:
 
 Hmm... for an extern function (espeically have been implemented in
 various modules), normally, we can assume it may fail in some cases
 (although now, we don't know what cases can cause its failure).
 
 If we don't have a good way to handle the failure, print the related
 warning message is an executable choice (or BUG_ON(), if it is critical).
 
 So, if the performance is not sensible, I still suggest to let extern
 function have return value.

This is not a module function. We are not doing a uni course on how to
write C code here. Be real.

Ben.




Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Chen Gang
On 07/25/2013 03:33 PM, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 14:17 +0800, Chen Gang wrote:
  
  Hmm... for an extern function (espeically have been implemented in
  various modules), normally, we can assume it may fail in some cases
  (although now, we don't know what cases can cause its failure).
  
  If we don't have a good way to handle the failure, print the related
  warning message is an executable choice (or BUG_ON(), if it is 
  critical).
  
  So, if the performance is not sensible, I still suggest to let extern
  function have return value.
 This is not a module function. We are not doing a uni course on how to
 write C code here. Be real.

In our case, 'module' refers to the various sub-directories of arch/powerpc
(maybe 'module' is not quite precise; it is easy to misunderstand).

The real world does not conflict with how to write C code.

In my opinion, one fix may look like the one below (assuming max_cpus has been
removed), which is more reasonable for code readers.

-diff begin--

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 7edbd5b..53155f4 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -347,7 +347,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	cpumask_set_cpu(boot_cpuid, cpu_core_mask(boot_cpuid));
 
 	if (smp_ops && smp_ops->probe)
-		smp_ops->probe();
+		BUG_ON(smp_ops->probe() < 0);
 }
 
 void smp_prepare_boot_cpu(void)

-diff end


Thanks
-- 
Chen Gang


Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Benjamin Herrenschmidt
On Thu, 2013-07-25 at 15:59 +0800, Chen Gang wrote:
 
 For my opinion: one fix may like below (assume have removed max_cpus)
 which is more reasonable for code readers.

So instead of just failing to bring the secondary CPUs, but potentially
still having a working system, you crash during boot potentially
before a console is even visible. And this is good how ?

Ben.




Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Chen Gang
On 07/25/2013 04:06 PM, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 15:59 +0800, Chen Gang wrote:

 For my opinion: one fix may like below (assume have removed max_cpus)
 which is more reasonable for code readers.
 
 So instead of just failing to bring the secondary CPUs, but potentially
 still having a working system, you crash during boot potentially
 before a console is even visible. And this is good how ?
 

Hmm... how about the DBG(...) above, within this function ?

One implementation of BUG_ON() uses printk() and a coredump; if it is a
critical failure, I suggest using it (if the console is really invisible, I
guess it can still generate the coredump).

Hmm... But do you mean it really can fail, but the failure is not critical ?
If so, we need to print a warning message instead.

Thanks.
-- 
Chen Gang


Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Benjamin Herrenschmidt
On Thu, 2013-07-25 at 16:22 +0800, Chen Gang wrote:
 On 07/25/2013 04:06 PM, Benjamin Herrenschmidt wrote:
  On Thu, 2013-07-25 at 15:59 +0800, Chen Gang wrote:
 
  For my opinion: one fix may like below (assume have removed max_cpus)
  which is more reasonable for code readers.
  
  So instead of just failing to bring the secondary CPUs, but potentially
  still having a working system, you crash during boot potentially
  before a console is even visible. And this is good how ?
  
 
 Hmm... how about the above DBG(...) within this function ?
 
 One implementation of BUG_ON() is use printk() and coredump, if it is a
 critical failure, I suggest to use it (if console is really invisible, I
 guess still can generate the coredump).

Whatever ... looks like you don't feel like listening so I'm not going
to waste my breath anymore, nor will I accept your patches.

Ben.




Re: [PATCH v2] powerpc: kernel: remove useless code which related with 'max_cpus'

2013-07-25 Thread Chen Gang
On 07/25/2013 04:28 PM, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 16:22 +0800, Chen Gang wrote:
  On 07/25/2013 04:06 PM, Benjamin Herrenschmidt wrote:
   On Thu, 2013-07-25 at 15:59 +0800, Chen Gang wrote:
  
   For my opinion: one fix may like below (assume have removed max_cpus)
   which is more reasonable for code readers.
   
   So instead of just failing to bring the secondary CPUs, but potentially
   still having a working system, you crash during boot potentially
   before a console is even visible. And this is good how ?
   
  
  Hmm... how about the above DBG(...) within this function ?
  
  One implementation of BUG_ON() is use printk() and coredump, if it is a
  critical failure, I suggest to use it (if console is really invisible, I
  guess still can generate the coredump).
 Whatever ... looks like you don't feel like listening so I'm not going
 to waste my breath anymore, nor will I accept your patches.

I can understand,

But 'patch' or 'patches' ?  ;-)


Thanks.
-- 
Chen Gang


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-25 Thread Gleb Natapov
On Wed, Jul 24, 2013 at 03:32:49PM -0500, Scott Wood wrote:
 On 07/24/2013 04:39:59 AM, Alexander Graf wrote:
 
 On 24.07.2013, at 11:35, Gleb Natapov wrote:
 
  On Wed, Jul 24, 2013 at 11:21:11AM +0200, Alexander Graf wrote:
  Are not we going to use page_is_ram() from
 e500_shadow_mas2_attrib() as Scott commented?
 
  Why aren't we using page_is_ram() in kvm_is_mmio_pfn()?
 
 
  Because it is much slower and, IIRC, actually used to build pfn
 map that allow
  us to check quickly for valid pfn.
 
 Then why should we use page_is_ram()? :)
 
 I really don't want the e500 code to diverge too much from what
 the rest of the kvm code is doing.
 
 I don't understand actually used to build pfn map  What code
 is this?  I don't see any calls to page_is_ram() in the KVM code, or
 in generic mm code.  Is this a statement about what x86 does?
It may not be page_is_ram() directly, but the same information page_is_ram()
is using. On power both page_is_ram() and do_init_bootmem() walk some kind
of memblock_region data structure. What is important is that pfn_valid()
does not mean that there is memory behind the page structure. See Andrea's
reply.

 
 On PPC page_is_ram() is only called (AFAICT) for determining what
 attributes to set on mmaps.  We want to be sure that KVM always
 makes the same decision.  While pfn_valid() seems like it should be
 equivalent, it's not obvious from the PPC code that it is.
 
Again pfn_valid() is not enough.

 If pfn_valid() is better, why is that not used for mmap?  Why are
 there two different names for the same thing?
 
They are not the same thing. page_is_ram() tells you if a phys address is
ram backed. pfn_valid() tells you if there is a struct page behind the
pfn. PageReserved() tells you if a pfn is marked as reserved. All non
ram pfns should be reserved, but ram pfns can be reserved too. Again,
see Andrea's reply.
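
A toy model of the three questions, as a hedged sketch only: the table, the `toy_*` names and the four example pfns are hypothetical, and `toy_is_mmio_pfn()` is merely an assumed shape for a check built on pfn_valid() plus PageReserved(), not a quote of kernel code. It illustrates that such a check and a page_is_ram()-style check disagree on reserved RAM.

```c
#include <stdbool.h>
#include <stddef.h>

/* One row per example pfn; the fields mirror the three predicates. */
struct toy_pfn {
	bool ram_backed;	/* what page_is_ram() would answer */
	bool has_struct_page;	/* what pfn_valid() would answer */
	bool reserved;		/* what PageReserved() would answer */
};

enum {
	PFN_NORMAL_RAM,
	PFN_RESERVED_RAM,
	PFN_MMIO_IN_MEMMAP,
	PFN_MMIO_NO_PAGE,
	PFN_COUNT
};

static const struct toy_pfn toy_map[PFN_COUNT] = {
	[PFN_NORMAL_RAM]     = { true,  true,  false },
	[PFN_RESERVED_RAM]   = { true,  true,  true  },
	[PFN_MMIO_IN_MEMMAP] = { false, true,  true  },
	[PFN_MMIO_NO_PAGE]   = { false, false, false },
};

/* Callers must pass pfn < PFN_COUNT; there is no bounds checking here. */
bool toy_page_is_ram(size_t pfn)   { return toy_map[pfn].ram_backed; }
bool toy_pfn_valid(size_t pfn)     { return toy_map[pfn].has_struct_page; }
bool toy_page_reserved(size_t pfn) { return toy_map[pfn].reserved; }

/* Assumed check: no struct page, or a reserved one, is treated as MMIO. */
bool toy_is_mmio_pfn(size_t pfn)
{
	if (toy_pfn_valid(pfn))
		return toy_page_reserved(pfn);
	return true;
}
```

For `PFN_RESERVED_RAM`, `toy_is_mmio_pfn()` returns true while `toy_page_is_ram()` also returns true, which is the case where the two approaches give different caching decisions.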

Why does ppc use page_is_ram() for mmap? How should I know? But looking at
the function, it does so only as a fallback if
ppc_md.phys_mem_access_prot() is not provided. Making access to MMIO
noncached as a safe fallback makes sense. It also makes sense to allow
noncached access to reserved ram sometimes.

--
Gleb.


[RFC PATCH 1/5] powerpc: Free up the IPI message slot of ipi call function (PPC_MSG_CALL_FUNC)

2013-07-25 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE
map to a common implementation - generic_smp_call_function_single_interrupt().
So, we can consolidate them and save one of the IPI message slots, (which are
precious, since only 4 of those slots are available).

So, implement the functionality of PPC_MSG_CALL_FUNC using
PPC_MSG_CALL_FUNC_SINGLE itself and release its IPI message slot, so that it
can be used for something else in the future, if desired.
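
The consolidation can be pictured with a small userspace model. This is a sketch under assumptions, not the kernel implementation: the names are hypothetical, C11 atomics stand in for the kernel's xchg(), and a sender ORs a bit into the word rather than storing a byte. Message N is taken to occupy byte N of a 32-bit word viewed big-endian, so its low bit sits at position 24 - 8 * N, matching the tests in smp_ipi_demux().

```c
#include <stdatomic.h>
#include <stdint.h>

enum ipi_msg {
	MSG_UNUSED = 0,		/* slot freed by the consolidation */
	MSG_RESCHEDULE = 1,
	MSG_CALL_FUNC_SINGLE = 2,
	MSG_DEBUGGER_BREAK = 3,
};

static _Atomic uint32_t pending_messages;
int call_func_single_runs;
int reschedule_runs;

/* Bit that represents message 'msg' in the 32-bit pending word. */
uint32_t msg_bit(enum ipi_msg msg)
{
	return UINT32_C(1) << (24 - 8 * (int)msg);
}

static void send_message(enum ipi_msg msg)
{
	atomic_fetch_or(&pending_messages, msg_bit(msg));
}

/* Both flavours of "call function" now raise the same message. */
void send_call_function_single(void) { send_message(MSG_CALL_FUNC_SINGLE); }
void send_call_function_mask(void)   { send_message(MSG_CALL_FUNC_SINGLE); }

/* Receiver side: grab all pending messages at once and run each handler. */
void demux_messages(void)
{
	uint32_t all;

	do {
		all = atomic_exchange(&pending_messages, 0);
		if (all & msg_bit(MSG_RESCHEDULE))
			reschedule_runs++;
		if (all & msg_bit(MSG_CALL_FUNC_SINGLE))
			call_func_single_runs++;
	} while (atomic_load(&pending_messages) != 0);
}
```

Two sends that arrive before one demux pass set the same bit and are handled by a single handler run; that is only safe when the shared handler drains everything queued for the CPU, as the common implementation named above is described as doing.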

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/kernel/smp.c   |   12 +---
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ffbaabe..51bf017 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -117,7 +117,7 @@ extern int cpu_to_core_id(int cpu);
  *
  * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
  * in /proc/interrupts will be wrong!!! --Troy */
-#define PPC_MSG_CALL_FUNCTION   0
+#define PPC_MSG_UNUSED 0
 #define PPC_MSG_RESCHEDULE  1
 #define PPC_MSG_CALL_FUNC_SINGLE   2
 #define PPC_MSG_DEBUGGER_BREAK  3
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 38b0ba6..bc41e9f 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -111,9 +111,9 @@ int smp_generic_kick_cpu(int nr)
 }
 #endif /* CONFIG_PPC64 */
 
-static irqreturn_t call_function_action(int irq, void *data)
+static irqreturn_t unused_action(int irq, void *data)
 {
-   generic_smp_call_function_interrupt();
+   /* This slot is unused and hence available for use, if needed */
return IRQ_HANDLED;
 }
 
@@ -144,14 +144,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 }
 
 static irq_handler_t smp_ipi_action[] = {
-	[PPC_MSG_CALL_FUNCTION] =  call_function_action,
+	[PPC_MSG_UNUSED] =  unused_action, /* Slot available for future use */
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
 	[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
-	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
+	[PPC_MSG_UNUSED] =  "ipi unused",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
 	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
@@ -221,8 +221,6 @@ irqreturn_t smp_ipi_demux(void)
 		all = xchg(&info->messages, 0);
 
 #ifdef __BIG_ENDIAN
-		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNCTION)))
-			generic_smp_call_function_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
@@ -265,7 +263,7 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 	unsigned int cpu;
 
 	for_each_cpu(cpu, mask)
-		do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
+		do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
 }
 
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 2d42f3b..28166e4 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -213,7 +213,7 @@ static void iic_request_ipi(int msg)
 
 void iic_request_IPIs(void)
 {
-   iic_request_ipi(PPC_MSG_CALL_FUNCTION);
+   iic_request_ipi(PPC_MSG_UNUSED);
iic_request_ipi(PPC_MSG_RESCHEDULE);
iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 4b35166..488f069 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -74,7 +74,7 @@ static int __init ps3_smp_probe(void)
* to index needs to be setup.
*/
 
-   BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION!= 0);
+   BUILD_BUG_ON(PPC_MSG_UNUSED   != 0);
BUILD_BUG_ON(PPC_MSG_RESCHEDULE   != 1);
BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2);
BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK   != 3);



[RFC PATCH 0/5] cpuidle/ppc: Timer offload framework to support deep idle states

2013-07-25 Thread Preeti U Murthy
On PowerPC, when CPUs enter deep idle states, their local timers are
switched off. The responsibility for waking them up at their next timer event
needs to be handed over to an external device. PowerPC has no external device
equivalent to the HPET, which architectures like x86 currently use for this.
Instead, we assign the local timer of one of the CPUs to do this job.

This patchset is an attempt to make use of the existing timer broadcast
framework in the kernel to meet the above requirement, except that the tick
broadcast device is the local timer of the boot CPU.

This patch series is ported on top of 3.11-rc1 + the cpuidle driver backend
for powernv posted by Deepthi Dharwar recently. The current design and
implementation supports the ONESHOT tick mode. It does not yet support
the PERIODIC tick mode. This patch is tested with NOHZ_FULL off.

Patch[1/5], Patch[2/5]: optimize the broadcast mechanism on ppc.
Patch[3/5]: Introduces the core of the timer offload framework on powerpc.
Patch[4/5]: The cpu doing the broadcast should not go into tickless idle.
Patch[5/5]: Add a deep idle state to the cpuidle state table on powernv.

Patch[5/5] is the patch that ultimately makes use of the timer offload
framework that the patches Patch[1/5] to Patch[4/5] build.

---

Preeti U Murthy (3):
  cpuidle/ppc: Add timer offload framework to support deep idle states
  cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints
  cpuidle/ppc: Add longnap state to the idle states on powernv

Srivatsa S. Bhat (2):
  powerpc: Free up the IPI message slot of ipi call function (PPC_MSG_CALL_FUNC)
  powerpc: Implement broadcast timer interrupt as an IPI message


 arch/powerpc/include/asm/smp.h  |3 +
 arch/powerpc/include/asm/time.h |3 +
 arch/powerpc/kernel/smp.c   |   23 --
 arch/powerpc/kernel/time.c  |   84 +++
 arch/powerpc/platforms/cell/interrupt.c |2 -
 arch/powerpc/platforms/powernv/Kconfig  |1 
 arch/powerpc/platforms/powernv/processor_idle.c |   48 +
 arch/powerpc/platforms/ps3/smp.c|2 -
 kernel/time/tick-sched.c|7 ++
 9 files changed, 161 insertions(+), 12 deletions(-)

-- 
Signature



[RFC PATCH 2/5] powerpc: Implement broadcast timer interrupt as an IPI message

2013-07-25 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

For scalability and performance reasons, we want the broadcast timer
interrupts to be handled as efficiently as possible. Fixed IPI messages
are one of the most efficient mechanisms available - they are faster
than the smp_call_function mechanism because the IPI handlers are fixed
and hence they don't involve costly operations such as adding IPI handlers
to the target CPU's function queue, acquiring locks for synchronization etc.

Luckily we have an unused IPI message slot, so use that to implement
broadcast timer interrupts efficiently.
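
Why a fixed slot is cheap can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel's dispatch path: the enum values, `ipi_action[]` and `handle_ipi()` are hypothetical names. The handler is found by indexing a constant table with the message number, so delivering a timer message involves no per-call queueing, allocation or locking.

```c
#include <stddef.h>

enum {
	MSG_TIMER,
	MSG_RESCHEDULE,
	MSG_CALL_FUNC_SINGLE,
	MSG_DEBUGGER_BREAK,
	MSG_COUNT
};

typedef void (*ipi_handler_t)(void);

int timer_ticks;

static void timer_action(void) { timer_ticks++; }
static void other_action(void) { /* placeholder for the remaining slots */ }

/* Fixed at build time: slot number -> handler. */
static const ipi_handler_t ipi_action[MSG_COUNT] = {
	[MSG_TIMER]            = timer_action,
	[MSG_RESCHEDULE]       = other_action,
	[MSG_CALL_FUNC_SINGLE] = other_action,
	[MSG_DEBUGGER_BREAK]   = other_action,
};

/* Returns 0 after running the handler, -1 for an unknown message number. */
int handle_ipi(int msg)
{
	if (msg < 0 || msg >= MSG_COUNT)
		return -1;
	ipi_action[msg]();
	return 0;
}
```

A function-call style mechanism would instead have to record which function and argument to run for each request before raising the interrupt, which is the extra work the changelog refers to.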

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/smp.h  |3 ++-
 arch/powerpc/kernel/smp.c   |   19 +++
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 51bf017..d877b69 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -117,7 +117,7 @@ extern int cpu_to_core_id(int cpu);
  *
  * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
  * in /proc/interrupts will be wrong!!! --Troy */
-#define PPC_MSG_UNUSED 0
+#define PPC_MSG_TIMER  0
 #define PPC_MSG_RESCHEDULE  1
 #define PPC_MSG_CALL_FUNC_SINGLE   2
 #define PPC_MSG_DEBUGGER_BREAK  3
@@ -190,6 +190,7 @@ extern struct smp_ops_t *smp_ops;
 
 extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
+extern void arch_send_tick_broadcast(const struct cpumask *mask);
 
 /* Definitions relative to the secondary CPU spin loop
  * and entry point. Not all of them exist on both 32 and
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index bc41e9f..6a68ca4 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
 #include <asm/irq.h>
+#include <asm/hw_irq.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/prom.h>
@@ -111,9 +112,9 @@ int smp_generic_kick_cpu(int nr)
 }
 #endif /* CONFIG_PPC64 */
 
-static irqreturn_t unused_action(int irq, void *data)
+static irqreturn_t timer_action(int irq, void *data)
 {
-   /* This slot is unused and hence available for use, if needed */
+   timer_interrupt();
return IRQ_HANDLED;
 }
 
@@ -144,14 +145,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 }
 
 static irq_handler_t smp_ipi_action[] = {
-	[PPC_MSG_UNUSED] =  unused_action, /* Slot available for future use */
+	[PPC_MSG_TIMER] =  timer_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
 	[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
-	[PPC_MSG_UNUSED] =  "ipi unused",
+	[PPC_MSG_TIMER] =  "ipi timer",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
 	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
@@ -221,6 +222,8 @@ irqreturn_t smp_ipi_demux(void)
 		all = xchg(&info->messages, 0);
 
 #ifdef __BIG_ENDIAN
+		if (all & (1 << (24 - 8 * PPC_MSG_TIMER)))
+			timer_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
@@ -266,6 +269,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 		do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
 }
 
+void arch_send_tick_broadcast(const struct cpumask *mask)
+{
+   unsigned int cpu;
+
+   for_each_cpu(cpu, mask)
+   do_message_pass(cpu, PPC_MSG_TIMER);
+}
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 void smp_send_debugger_break(void)
 {
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 28166e4..1359113 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -213,7 +213,7 @@ static void iic_request_ipi(int msg)
 
 void iic_request_IPIs(void)
 {
-   iic_request_ipi(PPC_MSG_UNUSED);
+   iic_request_ipi(PPC_MSG_TIMER);
iic_request_ipi(PPC_MSG_RESCHEDULE);
iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 488f069..5cb742a 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -74,7 +74,7 @@ static int __init ps3_smp_probe(void)
* to index needs to be setup.
*/
 
-   BUILD_BUG_ON(PPC_MSG_UNUSED 

[RFC PATCH 3/5] cpuidle/ppc: Add timer offload framework to support deep idle states

2013-07-25 Thread Preeti U Murthy
On ppc, in deep idle states, the lapic of the cpus gets switched off.
Hence make use of the broadcast framework to wakeup cpus in sleep state,
except that on ppc, we do not have an external device such as HPET, but
we use the lapic of a cpu itself as the broadcast device.

Instantiate two different clock event devices, one representing the
lapic and another representing the broadcast device for each cpu.
Such a cpu is forbidden to enter the deep idle state. The cpu which hosts
the broadcast device will be referred to as the broadcast cpu in the
changelogs of this patchset for convenience.

For now, only the boot cpu's broadcast device gets registered as a clock event
device along with the lapic. Hence this is the broadcast cpu.

On the broadcast cpu, on each timer interrupt, apart from the regular lapic
event handler, the broadcast handler is also called. We avoid the overhead of
programming the lapic for a broadcast event specifically. The reason is to
prevent multiple cpus from sending IPIs to program the lapic of the broadcast
cpu for their next local event each time they go to deep idle state.

Apart from this there is no change in the way broadcast is handled today. On
a broadcast ipi the event handler for a timer interrupt is called on the cpu
in deep idle state to handle the local events.
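
The scheme described above can be modelled in a few lines of userspace C. This is a sketch under assumptions rather than the patch's code: the function names are hypothetical, time is a plain integer counter, and an unsigned bitmask stands in for a cpumask. A cpu entering deep idle only records its next event; the broadcast cpu's timer is never reprogrammed on its behalf, and expired sleepers are collected on each tick of the broadcast cpu.

```c
#include <stdint.h>

#define SKETCH_NR_CPUS 4
#define NO_EVENT UINT64_MAX

/* Next local event of each cpu that is currently in deep idle. */
static uint64_t next_event[SKETCH_NR_CPUS] = {
	NO_EVENT, NO_EVENT, NO_EVENT, NO_EVENT
};

/* A cpu going to deep idle records its wakeup time and nothing else. */
void enter_deep_idle(int cpu, uint64_t wakeup_time)
{
	next_event[cpu] = wakeup_time;
}

/*
 * Runs on every timer interrupt of the broadcast cpu, after its own
 * event handler. Returns the mask of cpus that must receive the
 * broadcast IPI because their recorded event has expired.
 */
unsigned int broadcast_cpu_tick(uint64_t now)
{
	unsigned int wake_mask = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (next_event[cpu] != NO_EVENT && next_event[cpu] <= now) {
			wake_mask |= 1u << cpu;
			next_event[cpu] = NO_EVENT;
		}
	}
	return wake_mask;
}
```

In this model a sleeper whose event falls between two ticks is woken at the later tick, so its wakeup can be late by up to one tick interval of the broadcast cpu.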

The current design and implementation of the timer offload framework supports
the ONESHOT tick mode but not the PERIODIC mode.

Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/time.h|3 +
 arch/powerpc/kernel/smp.c  |4 +-
 arch/powerpc/kernel/time.c |   79 
 arch/powerpc/platforms/powernv/Kconfig |1 
 4 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c1f2676..936be0d 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -24,14 +24,17 @@ extern unsigned long tb_ticks_per_jiffy;
 extern unsigned long tb_ticks_per_usec;
 extern unsigned long tb_ticks_per_sec;
 extern struct clock_event_device decrementer_clockevent;
+extern struct clock_event_device broadcast_clockevent;
 
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
 extern void GregorianDay(struct rtc_time *tm);
+extern void decrementer_timer_interrupt(void);
 
 extern void generic_calibrate_decr(void);
 
 extern void set_dec_cpu6(unsigned int val);
+extern int bc_cpu;
 
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6a68ca4..d3b7014 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -114,7 +114,7 @@ int smp_generic_kick_cpu(int nr)
 
 static irqreturn_t timer_action(int irq, void *data)
 {
-   timer_interrupt();
+   decrementer_timer_interrupt();
return IRQ_HANDLED;
 }
 
@@ -223,7 +223,7 @@ irqreturn_t smp_ipi_demux(void)
 
 #ifdef __BIG_ENDIAN
 		if (all & (1 << (24 - 8 * PPC_MSG_TIMER)))
-			timer_interrupt();
+			decrementer_timer_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 65ab9e9..8ed0fb3 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -42,6 +42,7 @@
 #include <linux/timex.h>
 #include <linux/kernel_stat.h>
 #include <linux/time.h>
+#include <linux/timer.h>
 #include <linux/init.h>
 #include <linux/profile.h>
 #include <linux/cpu.h>
@@ -97,8 +98,11 @@ static struct clocksource clocksource_timebase = {
 
 static int decrementer_set_next_event(unsigned long evt,
  struct clock_event_device *dev);
+static int broadcast_set_next_event(unsigned long evt,
+ struct clock_event_device *dev);
 static void decrementer_set_mode(enum clock_event_mode mode,
 struct clock_event_device *dev);
+static void decrementer_timer_broadcast(const struct cpumask *mask);
 
 struct clock_event_device decrementer_clockevent = {
 	.name           = "decrementer",
@@ -106,13 +110,26 @@ struct clock_event_device decrementer_clockevent = {
 	.irq            = 0,
 	.set_next_event = decrementer_set_next_event,
 	.set_mode       = decrementer_set_mode,
-	.features       = CLOCK_EVT_FEAT_ONESHOT,
+	.broadcast      = decrementer_timer_broadcast,
+	.features       = CLOCK_EVT_FEAT_C3STOP | CLOCK_EVT_FEAT_ONESHOT,
 };
 EXPORT_SYMBOL(decrementer_clockevent);
 
+struct clock_event_device broadcast_clockevent = {
+	.name           = "broadcast",
+	.rating         = 200,
+	.irq            = 0,
+	.set_next_event = broadcast_set_next_event,
+   

[RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
In the current design of the timer offload framework, the broadcast cpu should
*not* go into tickless idle, so as to avoid missed wakeups on CPUs in deep idle
states.

Since we prevent the CPUs entering deep idle states from programming the lapic
of the broadcast cpu for their respective next local events, for reasons
mentioned in PATCH[3/5], the broadcast CPU checks if there are any CPUs to be
woken up during each of its timer interrupts programmed for its local events.

With tickless idle, the broadcast CPU might not get a timer interrupt until
after many ticks, which can result in missed wakeups on CPUs in deep idle
states. By disabling tickless idle, in the worst case the tick_sched hrtimer
will trigger a timer interrupt every period to check for broadcast.

However, the current setup of tickless idle does not let us make the choice
of tickless on individual cpus. NOHZ_MODE_INACTIVE, which disables tickless
idle, is a system-wide setting. Hence resort to an arch-specific call to check
if a cpu can go into tickless idle.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/kernel/time.c |5 +
 kernel/time/tick-sched.c   |7 +++
 2 files changed, 12 insertions(+)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 8ed0fb3..68a636f 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -862,6 +862,11 @@ static void decrementer_timer_broadcast(const struct cpumask *mask)
 	arch_send_tick_broadcast(mask);
 }
 
+int arch_can_stop_idle_tick(int cpu)
+{
+	return cpu != bc_cpu;
+}
+
 static void register_decrementer_clockevent(int cpu)
 {
 	struct clock_event_device *dec = &per_cpu(decrementers, cpu);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6960172..e9ffa84 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,8 +700,15 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 #endif
 }
 
+int __weak arch_can_stop_idle_tick(int cpu)
+{
+   return 1;
+}
+
 static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 {
+   if (!arch_can_stop_idle_tick(cpu))
+   return false;
/*
 * If this cpu is offline and it is the one which updates
 * jiffies, then give up the assignment and let it be taken by

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
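
For readers following the 4/5 patch above, the control flow it adds can be
modelled in plain userspace C. This is only a sketch under stated assumptions:
the function-pointer parameter stands in for the kernel's __weak linkage, the
bc_cpu variable is a stand-in for the broadcast CPU id from the patch, and the
names generic_can_stop_idle_tick, ppc_can_stop_idle_tick and arch_check_fn are
hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stand-in for the patch's bc_cpu (the broadcast CPU). */
static int bc_cpu = 0;

/* Models the __weak generic default: no arch-specific constraint. */
static int generic_can_stop_idle_tick(int cpu)
{
    (void)cpu;
    return 1;
}

/* Models the powerpc override: every CPU except the broadcast CPU. */
static int ppc_can_stop_idle_tick(int cpu)
{
    return cpu != bc_cpu;
}

typedef int (*arch_check_fn)(int cpu);

/* Models the early exit the patch adds to can_stop_idle_tick(). */
static bool can_stop_idle_tick(int cpu, arch_check_fn arch_check)
{
    assert(cpu >= 0);
    if (!arch_check(cpu))
        return false;
    /* The remaining generic checks are omitted in this sketch. */
    return true;
}
```

With the powerpc-style check, only the broadcast CPU is refused; with the
generic default, every CPU is allowed to stop its tick.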


[RFC PATCH 5/5] cpuidle/ppc: Add longnap state to the idle states on powernv

2013-07-25 Thread Preeti U Murthy
This patch hooks into the existing broadcast framework, using the support that
this patchset introduces for ppc and the cpuidle driver backend for powernv
(posted recently by Deepthi Dharwar), to add sleep as one of the deep idle
states, in which the decrementer is switched off.

However, in this patch we only emulate sleep by going into a state which does
a nap with the decrementer interrupts disabled, termed longnap. This keeps the
focus on the timer broadcast framework for ppc in this series of patches,
which is required as a first step to enable sleep on ppc.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/platforms/powernv/processor_idle.c |   48 +++
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/processor_idle.c b/arch/powerpc/platforms/powernv/processor_idle.c
index f43ad91a..9aca502 100644
--- a/arch/powerpc/platforms/powernv/processor_idle.c
+++ b/arch/powerpc/platforms/powernv/processor_idle.c
@@ -9,16 +9,18 @@
 #include <linux/cpuidle.h>
 #include <linux/cpu.h>
 #include <linux/notifier.h>
+#include <linux/clockchips.h>
 
 #include <asm/machdep.h>
 #include <asm/runlatch.h>
+#include <asm/time.h>
 
 struct cpuidle_driver powernv_idle_driver = {
.name = "powernv_idle",
.owner =THIS_MODULE,
 };
 
-#define MAX_IDLE_STATE_COUNT   2
+#define MAX_IDLE_STATE_COUNT   3
 
 static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
 static struct cpuidle_device __percpu *powernv_cpuidle_devices;
@@ -54,6 +56,43 @@ static int nap_loop(struct cpuidle_device *dev,
return index;
 }
 
+/* Emulate sleep, with long nap.
+ * During sleep, the core does not receive decrementer interrupts.
+ * Emulate sleep using long nap with decrementers interrupts disabled.
+ * This is an initial prototype to test the timer offload framework for ppc.
+ * We will eventually introduce the sleep state once the timer offload framework
+ * for ppc is stable.
+ */
+static int longnap_loop(struct cpuidle_device *dev,
+   struct cpuidle_driver *drv,
+   int index)
+{
+   int cpu = dev->cpu;
+
+   unsigned long lpcr = mfspr(SPRN_LPCR);
+
+   lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */
+
+   /* exit powersave upon external interrupt, but not decrementer
+* interrupt, Emulate sleep.
+*/
+   lpcr |= LPCR_PECE0;
+
+   if (cpu != bc_cpu) {
+   mtspr(SPRN_LPCR, lpcr);
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
+   power7_nap();
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
+   } else {
+   /* Wakeup on a decrementer interrupt, Do a nap */
+   lpcr |= LPCR_PECE1;
+   mtspr(SPRN_LPCR, lpcr);
+   power7_nap();
+   }
+
+   return index;
+}
+
 /*
  * States for dedicated partition case.
  */
@@ -72,6 +111,13 @@ static struct cpuidle_state powernv_states[MAX_IDLE_STATE_COUNT] = {
.exit_latency = 10,
.target_residency = 100,
.enter = nap_loop },
+{ /* LongNap */
+   .name = "LongNap",
+   .desc = "LongNap",
+   .flags = CPUIDLE_FLAG_TIME_VALID,
+   .exit_latency = 10,
+   .target_residency = 100,
+   .enter = longnap_loop },
 };
 
 static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,
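
The LPCR handling in longnap_loop() above comes down to a few bit operations,
which can be checked in isolation. The sketch below is a userspace model, not
kernel code: longnap_lpcr is a hypothetical helper, and the bit values are
illustrative assumptions that should be checked against asm/reg.h for the
kernel in question.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values (assumptions; check asm/reg.h). */
#define LPCR_PECE   0x00007000UL  /* all powersave exit cause enables */
#define LPCR_PECE0  0x00004000UL  /* external interrupts exit powersave */
#define LPCR_PECE1  0x00002000UL  /* decrementer exits powersave */
#define LPCR_MER    0x00000800UL  /* mediated external request */

/*
 * Computes the LPCR value longnap would program: clear MER and every
 * exit cause, allow wakeup on external interrupts, and allow wakeup on
 * the decrementer only for the broadcast CPU.
 */
static uint64_t longnap_lpcr(uint64_t lpcr, int cpu, int bc_cpu)
{
    assert(cpu >= 0 && bc_cpu >= 0);
    lpcr &= ~(uint64_t)(LPCR_MER | LPCR_PECE);
    lpcr |= LPCR_PECE0;
    if (cpu == bc_cpu)
        lpcr |= LPCR_PECE1;
    return lpcr;
}
```

Only the broadcast CPU keeps the decrementer as a wakeup source, which is what
lets it wake the other CPUs on their behalf.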



[PATCH] ASoC: fsl: Set sdma peripheral type directly

2013-07-25 Thread Nicolin Chen
Let CPU DAI drivers set the SDMA peripheral type directly, to support more
DMA types (SPDIF, ESAI) than just the two for SSI.
This will easily allow some non-SSI drivers to use imx-pcm-dma
as well.

Signed-off-by: Nicolin Chen b42...@freescale.com
---
@Timur
Compile-passed with mpc85xx_defconfig by using gcc-4.6.3-nolibc_powerpc-linux.
---
 sound/soc/fsl/fsl_ssi.c |4 ++--
 sound/soc/fsl/imx-pcm.h |7 ++-
 sound/soc/fsl/imx-ssi.c |4 ++--
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 11469fe..4d78df7 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -775,9 +775,9 @@ static int fsl_ssi_probe(struct platform_device *pdev)
fsl,spba-bus);
 
imx_pcm_dma_params_init_data(&ssi_private->filter_data_tx,
-   dma_events[0], shared);
+   dma_events[0], shared ? IMX_DMATYPE_SSI_SP : IMX_DMATYPE_SSI);
imx_pcm_dma_params_init_data(&ssi_private->filter_data_rx,
-   dma_events[1], shared);
+   dma_events[1], shared ? IMX_DMATYPE_SSI_SP : IMX_DMATYPE_SSI);
}
 
/* Initialize the the device_attribute structure */
diff --git a/sound/soc/fsl/imx-pcm.h b/sound/soc/fsl/imx-pcm.h
index fd56cad..9136625 100644
--- a/sound/soc/fsl/imx-pcm.h
+++ b/sound/soc/fsl/imx-pcm.h
@@ -22,14 +22,11 @@
 
 static inline void
 imx_pcm_dma_params_init_data(struct imx_dma_data *dma_data,
-   int dma, bool shared)
+   int dma, enum sdma_peripheral_type peripheral_type)
 {
dma_data->dma_request = dma;
dma_data->priority = DMA_PRIO_HIGH;
-   if (shared)
-   dma_data->peripheral_type = IMX_DMATYPE_SSI_SP;
-   else
-   dma_data->peripheral_type = IMX_DMATYPE_SSI;
+   dma_data->peripheral_type = peripheral_type;
 }
 
 struct imx_pcm_fiq_params {
diff --git a/sound/soc/fsl/imx-ssi.c b/sound/soc/fsl/imx-ssi.c
index f029e27..f58bcd8 100644
--- a/sound/soc/fsl/imx-ssi.c
+++ b/sound/soc/fsl/imx-ssi.c
@@ -571,13 +571,13 @@ static int imx_ssi_probe(struct platform_device *pdev)
res = platform_get_resource_byname(pdev, IORESOURCE_DMA, "tx0");
if (res) {
imx_pcm_dma_params_init_data(&ssi->filter_data_tx, res->start,
-   false);
+   IMX_DMATYPE_SSI);
}
 
res = platform_get_resource_byname(pdev, IORESOURCE_DMA, "rx0");
if (res) {
imx_pcm_dma_params_init_data(&ssi->filter_data_rx, res->start,
-   false);
+   IMX_DMATYPE_SSI);
}
}
 
platform_set_drvdata(pdev, ssi);
-- 
1.7.1
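
The interface change in the patch above can be tried out with a small
standalone model. This is a sketch under assumptions: the enum and struct
below contain only the members needed here, their numeric values are
placeholders rather than the real SDMA definitions, and ssi_dma_type is a
hypothetical helper that just names the ternary used at the fsl_ssi.c call
sites.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: placeholder values, not the real SDMA enumerators. */
enum sdma_peripheral_type {
    IMX_DMATYPE_SSI,
    IMX_DMATYPE_SSI_SP,
    IMX_DMATYPE_SPDIF,
};

enum { DMA_PRIO_HIGH = 0 };  /* placeholder priority value */

struct imx_dma_data {
    int dma_request;
    int priority;
    enum sdma_peripheral_type peripheral_type;
};

/* New-style initializer: the caller names the peripheral type. */
static void imx_pcm_dma_params_init_data(struct imx_dma_data *dma_data,
        int dma, enum sdma_peripheral_type peripheral_type)
{
    assert(dma_data != NULL);
    dma_data->dma_request = dma;
    dma_data->priority = DMA_PRIO_HIGH;
    dma_data->peripheral_type = peripheral_type;
}

/* Hypothetical helper: what SSI callers now decide for themselves. */
static enum sdma_peripheral_type ssi_dma_type(bool shared)
{
    return shared ? IMX_DMATYPE_SSI_SP : IMX_DMATYPE_SSI;
}
```

A non-SSI caller would simply pass its own type (IMX_DMATYPE_SPDIF in this
model), which the old bool parameter could not express.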




Re: [PATCH 0/2] powerpc: allow kvm to use kerel debug framework

2013-07-25 Thread Alexander Graf

On 10.07.2013, at 09:25, Michael Neuling wrote:

 Alexander Graf ag...@suse.de wrote:
 
 
 On 09.07.2013, at 06:24, Michael Neuling wrote:
 
 Alexander Graf ag...@suse.de wrote:
 
 
 On 04.07.2013, at 08:15, Bharat Bhushan wrote:
 
 From: Bharat Bhushan bharat.bhus...@freescale.com
 
 This patchset moves the debug registers into a structure, which allows
 kvm to use the same structure for debug emulation.
 
 Note: Earlier a patchset 
 "https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-June/108132.html"
 was sent, which is a bunch of six patches. That patchset is divided into 
 two parts:
   1) powerpc specific changes (these 2 patches actually have those 
 changes)
   2) KVM specific changes (will send separate patch on agraf repository)
 
 Mikey, if you like those could you please apply them to a topic
 branch and get that one merged with Ben? I'd also pull it into my tree
 then.
 
 benh would pull these directly.  
 
 I'll have a chat with him to see if he wants my ACK before he does that.
 
 I have a bunch of patches that I need to apply on top, so I need a topic 
 branch.
 
 I've acked the PPC specific bits of the v6 version of these patches.
 
 benh said he'll open a topic branch for you sometime next week and he'll
 stick them in there.  

Ping? :)


Alex



Re: [PATCH 1/4 v6] powerpc: export debug registers save function for KVM

2013-07-25 Thread Alexander Graf

On 04.07.2013, at 08:57, Bharat Bhushan wrote:

 KVM needs this function when switching from vcpu to user-space
 thread. My subsequent patch will use this function.
 
 Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com

Ben / Michael, please ack.


Alex

 ---
 v5-v6
 - switch_booke_debug_regs() not guarded by the compiler switch
 
 arch/powerpc/include/asm/switch_to.h |1 +
 arch/powerpc/kernel/process.c|3 ++-
 2 files changed, 3 insertions(+), 1 deletions(-)
 
 diff --git a/arch/powerpc/include/asm/switch_to.h 
 b/arch/powerpc/include/asm/switch_to.h
 index 200d763..db68f1d 100644
 --- a/arch/powerpc/include/asm/switch_to.h
 +++ b/arch/powerpc/include/asm/switch_to.h
 @@ -29,6 +29,7 @@ extern void giveup_vsx(struct task_struct *);
 extern void enable_kernel_spe(void);
 extern void giveup_spe(struct task_struct *);
 extern void load_up_spe(struct task_struct *);
 +extern void switch_booke_debug_regs(struct thread_struct *new_thread);
 
 #ifndef CONFIG_SMP
 extern void discard_lazy_cpu_state(void);
 diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
 index 01ff496..da586aa 100644
 --- a/arch/powerpc/kernel/process.c
 +++ b/arch/powerpc/kernel/process.c
 @@ -362,12 +362,13 @@ static void prime_debug_regs(struct thread_struct 
 *thread)
  * debug registers, set the debug registers from the values
  * stored in the new thread.
  */
 -static void switch_booke_debug_regs(struct thread_struct *new_thread)
 +void switch_booke_debug_regs(struct thread_struct *new_thread)
 {
    if ((current->thread.debug.dbcr0 & DBCR0_IDM)
    || (new_thread->debug.dbcr0 & DBCR0_IDM))
    prime_debug_regs(new_thread);
 }
 +EXPORT_SYMBOL_GPL(switch_booke_debug_regs);
 #else /* !CONFIG_PPC_ADV_DEBUG_REGS */
 #ifndef CONFIG_HAVE_HW_BREAKPOINT
 static void set_debug_reg_defaults(struct thread_struct *thread)
 -- 
 1.7.0.4
 
 
 --
 To unsubscribe from this list: send the line unsubscribe kvm-ppc in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



[PATCH] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-25 Thread Alexey Kardashevskiy
The current VFIO-on-POWER implementation supports only user mode
driven mapping, i.e. QEMU is sending requests to map/unmap pages.
However this approach is really slow, so we want to move that to KVM.
Since H_PUT_TCE can be extremely performance sensitive (especially with
network adapters where each packet needs to be mapped/unmapped) we chose
to implement that as a fast hypercall directly in real
mode (processor still in the guest context but MMU off).

To be able to do that, we need to provide some facilities to
access the struct page count within that real mode environment as things
like the sparsemem vmemmap mappings aren't accessible.

This adds an API to get page struct when MMU is off.

This adds to MM a new function put_page_unless_one() which drops a page
if counter is bigger than 1. It is going to be used when MMU is off
(real mode on PPC64 is the first user) and we want to make sure that page
release will not happen in real mode as it may crash the kernel in
a horrible way.

CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported.

Cc: linux...@kvack.org
Reviewed-by: Paul Mackerras pau...@samba.org
Signed-off-by: Paul Mackerras pau...@samba.org
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru

---

Changes:
2013/07/25:
* removed realmode_put_page and added put_page_unless_one() instead.
The name has been chosen to conform to the already existing get_page_unless_zero().
* removed realmode_get_page. Instead, get_page_unless_zero() will be used
* realmode_pfn_to_page fixed to return NULL for compound pages

2013/07/10:
* adjusted comment (removed sentence about virtual mode)
* get_page_unless_zero replaced with atomic_inc_not_zero to minimize
effect of a possible get_page_unless_zero() rework (if it ever happens).

2013/06/27:
* realmode_get_page() fixed to use get_page_unless_zero(). If failed,
the call will be passed from real to virtual mode and safely handled.
* added comment to PageCompound() in include/linux/page-flags.h.

2013/05/20:
* PageTail() is replaced by PageCompound() in order to have the same checks
for whether the page is huge in realmode_get_page() and realmode_put_page()

Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 arch/powerpc/include/asm/pgtable-ppc64.h |  2 ++
 arch/powerpc/mm/init_64.c| 54 +++-
 include/linux/mm.h   | 14 +
 include/linux/page-flags.h   |  4 ++-
 4 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 46db094..4a191c4 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -394,6 +394,8 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
 }
 
+struct page *realmode_pfn_to_page(unsigned long pfn);
+
 static inline char *get_hpte_slot_array(pmd_t *pmdp)
 {
/*
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d0cd9e4..d2bb97c 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -300,5 +300,57 @@ void vmemmap_free(unsigned long start, unsigned long end)
 {
 }
 
-#endif /* CONFIG_SPARSEMEM_VMEMMAP */
+/*
+ * We do not have access to the sparsemem vmemmap, so we fallback to
+ * walking the list of sparsemem blocks which we already maintain for
+ * the sake of crashdump. In the long run, we might want to maintain
+ * a tree if performance of that linear walk becomes a problem.
+ *
+ * realmode_pfn_to_page functions can fail due to:
+ * 1) As real sparsemem blocks do not lay in RAM continously (they
+ * are in virtual address space which is not available in the real mode),
+ * the requested page struct can be split between blocks so get_page/put_page
+ * may fail.
+ * 2) When huge pages are used, the get_page/put_page API will fail
+ * in real mode as the linked addresses in the page struct are virtual
+ * too.
+ */
+struct page *realmode_pfn_to_page(unsigned long pfn)
+{
+   struct vmemmap_backing *vmem_back;
+   struct page *page;
+   unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
+   unsigned long pg_va = (unsigned long) pfn_to_page(pfn);
 
+   for (vmem_back = vmemmap_list; vmem_back; vmem_back = vmem_back->list) {
+   if (pg_va < vmem_back->virt_addr)
+   continue;
+
+   /* Check that page struct is not split between real pages */
+   if ((pg_va + sizeof(struct page)) >
+   (vmem_back->virt_addr + page_size))
+   return NULL;
+
+   page = (struct page *) (vmem_back->phys + pg_va -
+   vmem_back->virt_addr);
+
+   if (PageCompound(page))
+   return NULL;
+
+   return page;
+   }
+
+   return NULL;
+}
+EXPORT_SYMBOL_GPL(realmode_pfn_to_page);
+
+#elif 
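
The lookup done by realmode_pfn_to_page() in the hunk above can be modelled
outside the kernel to see which cases return NULL. This is a sketch under
assumptions: struct backing is a simplified stand-in for struct
vmemmap_backing, realmode_lookup is a hypothetical name, addresses are plain
integers, and the list is assumed to be ordered by descending virt_addr so
that the first block starting at or below the address is the one that should
contain it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: simplified model of struct vmemmap_backing. */
struct backing {
    struct backing *next;
    uintptr_t phys;       /* block start as seen with the MMU off */
    uintptr_t virt_addr;  /* block start in the virtual vmemmap */
};

/*
 * Returns the real-mode address of an object of obj_size bytes located
 * at virtual address va, or 0 if it cannot be resolved safely (no block
 * covers it, or it straddles the end of its block).
 */
static uintptr_t realmode_lookup(const struct backing *list, uintptr_t va,
                                 size_t obj_size, uintptr_t block_size)
{
    const struct backing *b;

    assert(block_size > 0);
    for (b = list; b; b = b->next) {
        if (va < b->virt_addr)
            continue;
        /* The object must not be split across two backing blocks. */
        if (va + obj_size > b->virt_addr + block_size)
            return 0;
        return b->phys + (va - b->virt_addr);
    }
    return 0;
}
```

The straddling case is the first failure mode listed in the patch comment; the
kernel version additionally rejects compound pages, which this model leaves
out.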

Re: [PATCH] powerpc: Prepare to support kernel handling of IOMMU map/unmap

2013-07-25 Thread Alexey Kardashevskiy
On 07/25/2013 08:26 PM, Alexey Kardashevskiy wrote:
 The current VFIO-on-POWER implementation supports only user mode
 driven mapping, i.e. QEMU is sending requests to map/unmap pages.
 However this approach is really slow, so we want to move that to KVM.
 Since H_PUT_TCE can be extremely performance sensitive (especially with
 network adapters where each packet needs to be mapped/unmapped) we chose
 to implement that as a fast hypercall directly in real
 mode (processor still in the guest context but MMU off).
 
 To be able to do that, we need to provide some facilities to
 access the struct page count within that real mode environment as things
 like the sparsemem vmemmap mappings aren't accessible.
 
 This adds an API to get page struct when MMU is off.
 
 This adds to MM a new function put_page_unless_one() which drops a page
 if counter is bigger than 1. It is going to be used when MMU is off
 (real mode on PPC64 is the first user) and we want to make sure that page
 release will not happen in real mode as it may crash the kernel in
 a horrible way.


Yes, my english needs to be polished, I even know where :)


-- 
Alexey
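
The semantics described for put_page_unless_one() in the patch quoted above
(drop a reference only while the counter is bigger than 1, so that the final
release never happens in real mode) can be sketched with C11 atomics. This is
an illustration, not the kernel implementation: the mm.h hunk is not shown in
this thread, so the body below is an assumption, and put_ref_unless_one is a
hypothetical name operating on a bare counter instead of a struct page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decrements *count unless it is exactly 1. Returns true if a reference
 * was dropped, false if the caller holds the last reference and must
 * leave the final release to a context where freeing is safe.
 */
static bool put_ref_unless_one(atomic_int *count)
{
    int old;

    assert(count != NULL);
    old = atomic_load(count);
    while (old != 1) {
        /* On failure the CAS reloads old, so the loop re-checks it. */
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return true;
    }
    return false;
}
```

A false return is the case the patch text wants handed over to virtual mode.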


P1021rdb-pc

2013-07-25 Thread BHARATHI KANDIMALLA
Dear Sir,

We are using the P1021rdb-pc board with the P1021 processor.

1. Compilation and the build process take a very long time. How can I
reduce the time for the Linux build process?

2. In the kernel configuration we are not able to select P1021rdb. When we
configured the board for p1021RDB-PC, all the platforms below were selected in
the default kernel config file. We want only p1021rdb; what should we
select here?

# CONFIG_PPC_CELL is not set
CONFIG_FSL_SOC_BOOKE=y
CONFIG_FSL_85XX_CACHE_SRAM=y
CONFIG_MPC8540_ADS=y
CONFIG_MPC8560_ADS=y
CONFIG_MPC85xx_CDS=y
CONFIG_MPC85xx_MDS=y
CONFIG_MPC8536_DS=y
CONFIG_MPC85xx_DS=y
CONFIG_MPC85xx_RDB=y
# CONFIG_P1010_RDB is not set
CONFIG_P1022_DS=y
CONFIG_P1023_RDS=y
CONFIG_SOCRATES=y
CONFIG_KSI8560=y
CONFIG_XES_MPC85xx=y
CONFIG_STX_GP3=y
CONFIG_TQM8540=y
CONFIG_TQM8541=y
CONFIG_TQM8548=y
CONFIG_TQM8555=y
CONFIG_TQM8560=y
CONFIG_SBC8548=y

I am attaching the .config file for reference.


3. We are using the P1021 processor, which has 36-bit support, but we have
compiled U-Boot for 32-bit only. Is there any specific use for a 36-bit build
other than addressing a large amount of memory, up to 64G? We are using only
512 MB of DDR and 128 MB of NOR flash.


4. We are currently using SDK 1.3.2 for the P1021 processor. Is there any
specific reason to switch to SDK 1.4, such as Linux drivers specially
included for the P1021 processor?

5. Is a UMCC driver available among the Linux drivers? Where can I get some
help regarding UMCC?


regards
Bharathi kandimalla




[PATCH v2] powerpc: Update compilation flags with core specific options

2013-07-25 Thread Catalin Udma
If CONFIG_E500 is enabled, the compilation flags are updated to
specify the target core: -mcpu=e5500/e500mc/8540.
Also remove -Wa,-me500, which is incompatible with -mcpu=e5500/e6500.
The assembler option is redundant if the -mcpu= flag is set.
The patch fixes the kernel compilation problem for e5500/e6500
when using the gcc option -mcpu=e5500/e6500.

Signed-off-by: Catalin Udma catalin.u...@freescale.com
---
changes for v2: 
- update also KBUILD_AFLAGS with -mcpu and -msoft-float flags
  
 arch/powerpc/Makefile |   16 +++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 0624909..cb5cbe2 100644
--- a/arch/powerpc/Makefile
+++ b/arch/powerpc/Makefile
@@ -140,6 +140,21 @@ ifeq ($(CONFIG_6xx),y)
 KBUILD_CFLAGS  += -mcpu=powerpc
 endif
 
+ifeq ($(CONFIG_E500),y)
+ifeq ($(CONFIG_64BIT),y)
+KBUILD_CFLAGS  += -mcpu=e5500
+KBUILD_AFLAGS  += -mcpu=e5500 -msoft-float
+else
+ifeq ($(CONFIG_PPC_E500MC),y)
+KBUILD_CFLAGS  += -mcpu=e500mc
+KBUILD_AFLAGS  += -mcpu=e500mc -msoft-float
+else
+KBUILD_CFLAGS  += -mcpu=8540
+KBUILD_AFLAGS  += -mcpu=8540 -msoft-float
+endif
+endif
+endif
+
 # Work around a gcc code-gen bug with -fno-omit-frame-pointer.
 ifeq ($(CONFIG_FUNCTION_TRACER),y)
 KBUILD_CFLAGS  += -mno-sched-epilog
@@ -147,7 +162,6 @@ endif
 
 cpu-as-$(CONFIG_4xx)   += -Wa,-m405
 cpu-as-$(CONFIG_ALTIVEC)   += -Wa,-maltivec
-cpu-as-$(CONFIG_E500)  += -Wa,-me500
 cpu-as-$(CONFIG_E200)  += -Wa,-me200
 
 KBUILD_AFLAGS += $(cpu-as-y)
-- 
1.7.8




Re: [PATCH 1/3] powerpc/mpc85xx: remove the unneeded pci init functions for corenet ds board

2013-07-25 Thread Kevin Hao
On Tue, Jul 23, 2013 at 05:31:16PM -0500, Scott Wood wrote:
 On 06/06/2013 09:00:20 PM, Kevin Hao wrote:
   Vector table: BAR=1 offset=

snip

   PBA: BAR=1 offset=0800
 
 
 As you can see, the only difference between these two logs is the
 io resource address and all the mem and bus address are still the
 same.
 
 I dug a bit deeper into this, and it's making my head hurt.  It
 seems that we're now getting saved by the host bridge (that for some
 reason has the class code of a PCI-to-PCI bridge rather than a host
 bridge) having I/O space of 0x1000 bytes[1], which gets allocated at
 zero.  There have been some changes in the QEMU PCI code since I saw
 the problem, including changing the class code of the bridge, so
 that's probably why it sort-of works now.

We are not just lucky here. Even without the change to the qemu pci
code we can still get the correct IO address in the current kernel.
The following is the log from running the latest kernel on a qemu which
doesn't have the PCI-to-PCI bridge change yet. As you can see, we don't
pick a primary pci bus in this case and the Ethernet controller still gets
the I/O ports at 1000.
PCI: Probing PCI hardware
fsl-pci e0008000.pci: PCI host bridge to bus :00
pci_bus :00: root bus resource [io  0xf102-0xf102] (bus address 
[0x-0x])
pci_bus :00: root bus resource [mem 0xc000-0xdfff]
pci_bus :00: root bus resource [bus 00-ff]
pci_bus :00: busn_res: [bus 00-ff] end is updated to ff
pci :00:00.0: [1957:0030] type 00 class 0x0b2000
pci :00:11.0: [1af4:1000] type 00 class 0x02
pci :00:11.0: reg 0x10: [io  0xf102-0xf102001f]
pci_bus :00: busn_res: [bus 00-ff] end is updated to 00
pci :00:11.0: BAR 0: assigned [io  0xf1021000-0xf102101f]
pci_bus :00: resource 4 [io  0xf102-0xf102]
pci_bus :00: resource 5 [mem 0xc000-0xdfff]

root@localhost:~# lspci -v
00:00.0 Power PC: Freescale Semiconductor Inc MPC8533E
Subsystem: Red Hat, Inc Device 1100
Flags: bus master, fast devsel, latency 0

00:11.0 Ethernet controller: Red Hat, Inc Virtio network device
Subsystem: Red Hat, Inc Device 0001
Flags: fast devsel, IRQ 16
I/O ports at 1000 [disabled] [size=32]

The reason is that the ppc kernel assumes that BARs starting
at 0 are unset and will reassign them later. There was a bug in previous
kernels, so the kernel may not work well for qemu in this case. But I
think this has been fixed by the commit c5df457f (powerpc/pci: Check the
bus address instead of resource address in pcibios_fixup_resources).
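
To make the distinction in that commit title concrete, here is a small
userspace model of checking the bus address rather than the resource address.
It is only a sketch: struct io_window, resource_to_bus and bar_looks_unset are
hypothetical names, and the window base used in the example is a made-up
value, not one taken from the logs in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one host-bridge I/O window, modelled as two bases. */
struct io_window {
    uint64_t cpu_base;  /* where the window sits in resource space */
    uint64_t bus_base;  /* the matching address on the PCI bus */
};

/* Translates a resource address back to the address seen on the bus. */
static uint64_t resource_to_bus(const struct io_window *w, uint64_t res_addr)
{
    assert(res_addr >= w->cpu_base);
    return res_addr - w->cpu_base + w->bus_base;
}

/*
 * A BAR counts as unset when its bus address is 0. Testing the resource
 * address instead would miss this whenever the window is offset.
 */
static bool bar_looks_unset(const struct io_window *w, uint64_t res_start)
{
    return resource_to_bus(w, res_start) == 0;
}
```

With an offset window, a resource start that is far from zero can still
correspond to bus address 0, which is the case the resource-address test got
wrong.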

 
 What QEMU is doing does not match what real hardware does, though.
 At least on mpc8536 which is similar to mpc8544 (I wasn't able to
 quickly get access to a working mpc8544 to test on), the PCI bridge
 has class code Processor, rather than bridge of any sort.  Thus, on
 real hardware we would not get the 0x1000 reservation.

This doesn't matter. We can always make sure to reassign the resources
starting at 0. I also did a test on the mpc8536ds board. Both the pci
and pcie devices work pretty well without picking a primary pci bus
on this board. The following is the log.

Found FSL PCI host bridge at 0xffe08000. Firmware bus number: 0-0
PCI host bridge /pci@ffe08000  ranges:
 MEM 0x8000..0x8fff - 0x8000 
  IO 0xffc0..0xffc0 - 0x
/pci@ffe08000: PCICSRBAR @ 0xfff0
Found FSL PCI host bridge at 0xffe09000. Firmware bus number: 0-0
PCI host bridge /pcie@ffe09000  ranges:
 MEM 0x9800..0x9fff - 0x9800 
  IO 0xffc2..0xffc2 - 0x
/pcie@ffe09000: PCICSRBAR @ 0xfff0
Found FSL PCI host bridge at 0xffe0a000. Firmware bus number: 0-1
PCI host bridge /pcie@ffe0a000  ranges:
 MEM 0x9000..0x97ff - 0x9000 
  IO 0xffc1..0xffc1 - 0x
/pcie@ffe0a000: PCICSRBAR @ 0xfff0
Found FSL PCI host bridge at 0xffe0b000. Firmware bus number: 0-0
PCI host bridge /pcie@ffe0b000  ranges:
 MEM 0xa000..0xbfff - 0xa000 
  IO 0xffc3..0xffc3 - 0x
/pcie@ffe0b000: PCICSRBAR @ 0xfff0
PCI: Probing PCI hardware
fsl-pci ffe08000.pci: PCI host bridge to bus :00
pci_bus :00: root bus resource [io  0xe102-0xe102] (bus address 
[0x-0x])
pci_bus :00: root bus resource [mem 0x8000-0x8fff]
pci_bus :00: root bus resource [bus 00-ff]
pci_bus :00: busn_res: [bus 00-ff] end is updated to ff
pci :00:00.0: [1957:0050] type 00 class 0x0b2000
pci :00:00.0: reg 0x10: [mem 

Re: [PATCH] module: ppc64 module CRC relocation fix causes perf issues

2013-07-25 Thread Neil Horman
On Thu, Jul 25, 2013 at 09:14:25AM +1000, Benjamin Herrenschmidt wrote:
 On Thu, 2013-07-25 at 08:34 +1000, Anton Blanchard wrote:
   Apart from the annoying colors, is there anything specific I should
   be looking for?  Some sort of error message, or output that actually
   makes sense?
  
  Thanks for testing! Ben, I think the patch is good to go.
 
 Sent it yesterday to Linus, it's upstream already :-)
 
 Cheers,
 Ben.
 
Sorry I'm a bit late to the thread, I've been swamped.  Has someone tested this
with kexec/kdump?  That's why the original patch was created, because when kexec
loads the kernel at a different physical address, the relocations messed with
the module CRCs, and modules couldn't load during the kexec boot.  Assuming
that kernaddr_start gets set appropriately during boot, using PHYSICAL_START
should be fine, but I wanted to check, and don't currently have access to a
powerpc system to do so.
Neil

 
 
 


Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Frederic Weisbecker
On Thu, Jul 25, 2013 at 02:33:02PM +0530, Preeti U Murthy wrote:
 In the current design of timer offload framework, the broadcast cpu should
 *not* go into tickless idle so as to avoid missed wakeups on CPUs in deep 
 idle states.
 
 Since we prevent the CPUs entering deep idle states from programming the 
 lapic of the
 broadcast cpu for their respective next local events for reasons mentioned in
 PATCH[3/5], the broadcast CPU checks if there are any CPUs to be woken up 
 during
 each of its timer interrupt programmed to its local events.
 
 With tickless idle, the broadcast CPU might not get a timer interrupt till 
 after
 many ticks which can result in missed wakeups on CPUs in deep idle states. By
 disabling tickless idle, worst case, the tick_sched hrtimer will trigger a
 timer interrupt every period to check for broadcast.
 
 However the current setup of tickless idle does not let us make the choice
 of tickless on individual cpus. NOHZ_MODE_INACTIVE which disables tickless 
 idle,
 is a system wide setting. Hence resort to an arch specific call to check if a 
 cpu
 can go into tickless idle.

Hi Preeti,

I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
mode. I read in the previous patch that it's because in dynticks idle mode the
broadcast CPU deactivates its lapic so it doesn't receive the IPI. But maybe I
misunderstood. Anyway, that's not good for powersaving.

Also, when an arch wants to prevent a CPU from entering dynticks idle mode, it
typically uses arch_needs_cpu(). Maybe that could fit for you as well?

Thanks.


Re: [PATCH] ASoC: fsl: Set sdma peripheral type directly

2013-07-25 Thread Shawn Guo
On Thu, Jul 25, 2013 at 05:41:41PM +0800, Nicolin Chen wrote:
 Let CPU DAI drivers set SDMA peripheral type directly to support more
 dma types(SPDIF, ESAI) other than only two for SSI.
 This will easily allow some non-SSI drivers to use the imx-pcm-dma
 as well.
 
 Signed-off-by: Nicolin Chen b42...@freescale.com

Acked-by: Shawn Guo shawn@linaro.org



Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-25 Thread Gleb Natapov
On Thu, Jul 25, 2013 at 06:07:55PM +0200, Alexander Graf wrote:
 
 On 25.07.2013, at 10:50, Gleb Natapov wrote:
 
  On Wed, Jul 24, 2013 at 03:32:49PM -0500, Scott Wood wrote:
  On 07/24/2013 04:39:59 AM, Alexander Graf wrote:
  
  On 24.07.2013, at 11:35, Gleb Natapov wrote:
  
  On Wed, Jul 24, 2013 at 11:21:11AM +0200, Alexander Graf wrote:
  Are not we going to use page_is_ram() from
  e500_shadow_mas2_attrib() as Scott commented?
  
  Why aren't we using page_is_ram() in kvm_is_mmio_pfn()?
  
  
  Because it is much slower and, IIRC, actually used to build pfn
  map that allow
  us to check quickly for valid pfn.
  
  Then why should we use page_is_ram()? :)
  
  I really don't want the e500 code to diverge too much from what
  the rest of the kvm code is doing.
  
  I don't understand "actually used to build pfn map"  What code
  is this?  I don't see any calls to page_is_ram() in the KVM code, or
  in generic mm code.  Is this a statement about what x86 does?
  It may not be page_is_ram() directly, but the same info page_is_ram() is
  using. On power both page_is_ram() and do_init_bootmem() walk some kind
  of memblock_region data structure. What is important is that pfn_valid()
  does not mean that there is memory behind the page structure. See Andrea's
  reply.
  
  
  On PPC page_is_ram() is only called (AFAICT) for determining what
  attributes to set on mmaps.  We want to be sure that KVM always
  makes the same decision.  While pfn_valid() seems like it should be
  equivalent, it's not obvious from the PPC code that it is.
  
  Again pfn_valid() is not enough.
  
  If pfn_valid() is better, why is that not used for mmap?  Why are
  there two different names for the same thing?
  
  They are not the same thing. page_is_ram() tells you if phys address is
  ram backed. pfn_valid() tells you if there is struct page behind the
  pfn. PageReserved() tells you if a pfn is marked as reserved. All non
  ram pfns should be reserved, but ram pfns can be reserved too. Again,
  see Andrea's reply.
  
  Why ppc uses page_is_ram() for mmap? How should I know? But looking at
 
 That one's easy. Let's just ask Ben. Ben, is there any particular reason PPC 
 uses page_is_ram() rather than what KVM does here to figure out whether a pfn 
 is RAM or not? It would be really useful to be able to run the exact same 
 logic that figures out whether we're cacheable or not in both TLB writers 
 (KVM and linux-mm).
 
KVM does not only try to figure out what is RAM or not! Look at how KVM
uses the function. Among other things, KVM tries to figure out whether
refcounting needs to be used on this page.

--
Gleb.
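
The three predicates being discussed above answer different questions, and a
toy model makes the difference visible. This is a sketch under assumptions:
struct pfn_info and is_mmio_pfn are hypothetical, the three booleans stand for
what pfn_valid(), PageReserved() and page_is_ram() would report for one pfn,
and the function body is only roughly the shape of the generic KVM check (the
real one also deals with compound pages).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: one pfn described by the answers of the three predicates. */
struct pfn_info {
    bool has_struct_page;  /* what pfn_valid() would report */
    bool reserved;         /* what PageReserved() would report */
    bool ram_backed;       /* what page_is_ram() would report */
};

/*
 * Rough shape of the generic check: a pfn with a struct page is treated
 * as MMIO only if that page is reserved; a pfn without a struct page is
 * always treated as MMIO. Note that ram_backed is never consulted.
 */
static bool is_mmio_pfn(const struct pfn_info *p)
{
    assert(p != NULL);
    if (p->has_struct_page)
        return p->reserved;
    return true;
}
```

A reserved RAM page is the interesting case: it is RAM-backed, yet this check
classifies it as MMIO, which is why the two tests cannot simply be swapped.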


Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages

2013-07-25 Thread Alexander Graf

On 25.07.2013, at 10:50, Gleb Natapov wrote:

 On Wed, Jul 24, 2013 at 03:32:49PM -0500, Scott Wood wrote:
 On 07/24/2013 04:39:59 AM, Alexander Graf wrote:
 
 On 24.07.2013, at 11:35, Gleb Natapov wrote:
 
 On Wed, Jul 24, 2013 at 11:21:11AM +0200, Alexander Graf wrote:
 Are not we going to use page_is_ram() from
 e500_shadow_mas2_attrib() as Scott commented?
 
 Why aren't we using page_is_ram() in kvm_is_mmio_pfn()?
 
 
 Because it is much slower and, IIRC, actually used to build pfn
 map that allows us to check quickly for valid pfn.
 
 Then why should we use page_is_ram()? :)
 
 I really don't want the e500 code to diverge too much from what
 the rest of the kvm code is doing.
 
 I don't understand "actually used to build pfn map".  What code
 is this?  I don't see any calls to page_is_ram() in the KVM code, or
 in generic mm code.  Is this a statement about what x86 does?
 It may not be page_is_ram() directly, but the same info page_is_ram() is
 using. On power both page_is_ram() and do_init_bootmem() walk some kind
 of memblock_region data structure. What is important is that pfn_valid()
 does not mean that there is memory behind the page structure. See Andrea's
 reply.
 
 
 On PPC page_is_ram() is only called (AFAICT) for determining what
 attributes to set on mmaps.  We want to be sure that KVM always
 makes the same decision.  While pfn_valid() seems like it should be
 equivalent, it's not obvious from the PPC code that it is.
 
 Again pfn_valid() is not enough.
 
 If pfn_valid() is better, why is that not used for mmap?  Why are
 there two different names for the same thing?
 
 They are not the same thing. page_is_ram() tells you if phys address is
 ram backed. pfn_valid() tells you if there is struct page behind the
 pfn. PageReserved() tells you if a pfn is marked as reserved. All non
 ram pfns should be reserved, but ram pfns can be reserved too. Again,
 see Andrea's reply.
 
 Why ppc uses page_is_ram() for mmap? How should I know? But looking at

That one's easy. Let's just ask Ben. Ben, is there any particular reason PPC 
uses page_is_ram() rather than what KVM does here to figure out whether a pfn 
is RAM or not? It would be really useful to be able to run the exact same logic 
that figures out whether we're cacheable or not in both TLB writers (KVM and 
linux-mm).


Alex

 the function, it does so only as a fallback if
 ppc_md.phys_mem_access_prot() is not provided. Making access to MMIO
 noncached as a safe fallback makes sense. It also makes sense to allow
 noncached access to reserved ram sometimes.
 
 --
   Gleb.



[PATCH 0/2] PCI: Convert hotplug core and pciehp to be builtin only

2013-07-25 Thread Bjorn Helgaas
Yinghai is working on a regression fix (PCI: Separate stop and remove
devices in pciehp) that needs to go into v3.11, and his fix will be
simpler if we remove the module option for pciehp in v3.11.  That will mean
he won't have to export pci_stop_bus_device() and pci_remove_bus_device()
for use by modules.

So these two patches convert CONFIG_HOTPLUG_PCI and CONFIG_HOTPLUG_PCI_PCIE
to be bool (not tristate) and update defconfig files that had
CONFIG_HOTPLUG_PCI=m.

The motivation was for CONFIG_HOTPLUG_PCI_PCIE, but I also converted
CONFIG_HOTPLUG_PCI to bool because the CONFIG_HOTPLUG_PCI=m and
CONFIG_HOTPLUG_PCI_PCIE=y combination was accepted by Kconfig and builds a
kernel, but pciehp doesn't actually work in that case (pointed out by
Yinghai, thanks!)

These are intended for v3.11.

---

Bjorn Helgaas (2):
  PCI: hotplug: Convert to be builtin only, not modular
  PCI: pciehp: Convert pciehp to be builtin only, not modular


 arch/ia64/configs/generic_defconfig    |    2 +-
 arch/ia64/configs/gensparse_defconfig  |    2 +-
 arch/ia64/configs/tiger_defconfig      |    2 +-
 arch/ia64/configs/xen_domu_defconfig   |    2 +-
 arch/powerpc/configs/ppc64_defconfig   |    2 +-
 arch/powerpc/configs/ppc64e_defconfig  |    2 +-
 arch/powerpc/configs/pseries_defconfig |    2 +-
 arch/sh/configs/sh03_defconfig         |    2 +-
 drivers/pci/hotplug/Kconfig            |    5 +----
 drivers/pci/pcie/Kconfig               |    5 +----
 10 files changed, 10 insertions(+), 16 deletions(-)


[PATCH 1/2] PCI: hotplug: Convert to be builtin only, not modular

2013-07-25 Thread Bjorn Helgaas
Convert CONFIG_HOTPLUG_PCI from tristate to bool.  This only affects
the hotplug core; several of the hotplug drivers can still be modules.

Signed-off-by: Bjorn Helgaas bhelg...@google.com
---
 arch/ia64/configs/generic_defconfig    |    2 +-
 arch/ia64/configs/gensparse_defconfig  |    2 +-
 arch/ia64/configs/tiger_defconfig      |    2 +-
 arch/ia64/configs/xen_domu_defconfig   |    2 +-
 arch/powerpc/configs/ppc64_defconfig   |    2 +-
 arch/powerpc/configs/ppc64e_defconfig  |    2 +-
 arch/powerpc/configs/pseries_defconfig |    2 +-
 arch/sh/configs/sh03_defconfig         |    2 +-
 drivers/pci/hotplug/Kconfig            |    5 +----
 9 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/arch/ia64/configs/generic_defconfig 
b/arch/ia64/configs/generic_defconfig
index 7913695..efbd292 100644
--- a/arch/ia64/configs/generic_defconfig
+++ b/arch/ia64/configs/generic_defconfig
@@ -31,7 +31,7 @@ CONFIG_ACPI_FAN=m
 CONFIG_ACPI_DOCK=y
 CONFIG_ACPI_PROCESSOR=m
 CONFIG_ACPI_CONTAINER=m
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_ACPI=m
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/ia64/configs/gensparse_defconfig 
b/arch/ia64/configs/gensparse_defconfig
index f8e9133..f64980d 100644
--- a/arch/ia64/configs/gensparse_defconfig
+++ b/arch/ia64/configs/gensparse_defconfig
@@ -25,7 +25,7 @@ CONFIG_ACPI_BUTTON=m
 CONFIG_ACPI_FAN=m
 CONFIG_ACPI_PROCESSOR=m
 CONFIG_ACPI_CONTAINER=m
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_ACPI=m
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/ia64/configs/tiger_defconfig 
b/arch/ia64/configs/tiger_defconfig
index a5a9e02..0f4e9e4 100644
--- a/arch/ia64/configs/tiger_defconfig
+++ b/arch/ia64/configs/tiger_defconfig
@@ -31,7 +31,7 @@ CONFIG_ACPI_BUTTON=m
 CONFIG_ACPI_FAN=m
 CONFIG_ACPI_PROCESSOR=m
 CONFIG_ACPI_CONTAINER=m
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_ACPI=m
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/ia64/configs/xen_domu_defconfig 
b/arch/ia64/configs/xen_domu_defconfig
index 37b9b42..b025acf 100644
--- a/arch/ia64/configs/xen_domu_defconfig
+++ b/arch/ia64/configs/xen_domu_defconfig
@@ -32,7 +32,7 @@ CONFIG_ACPI_BUTTON=m
 CONFIG_ACPI_FAN=m
 CONFIG_ACPI_PROCESSOR=m
 CONFIG_ACPI_CONTAINER=m
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_ACPI=m
 CONFIG_PACKET=y
 CONFIG_UNIX=y
diff --git a/arch/powerpc/configs/ppc64_defconfig 
b/arch/powerpc/configs/ppc64_defconfig
index c86fcb9..0e8cfd0 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -58,7 +58,7 @@ CONFIG_SCHED_SMT=y
 CONFIG_PPC_DENORMALISATION=y
 CONFIG_PCCARD=y
 CONFIG_ELECTRA_CF=y
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_RPA=m
 CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
 CONFIG_PACKET=y
diff --git a/arch/powerpc/configs/ppc64e_defconfig 
b/arch/powerpc/configs/ppc64e_defconfig
index 4b20f76..0085dc4 100644
--- a/arch/powerpc/configs/ppc64e_defconfig
+++ b/arch/powerpc/configs/ppc64e_defconfig
@@ -32,7 +32,7 @@ CONFIG_IRQ_ALL_CPUS=y
 CONFIG_SPARSEMEM_MANUAL=y
 CONFIG_PCI_MSI=y
 CONFIG_PCCARD=y
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_PACKET=y
 CONFIG_UNIX=y
 CONFIG_XFRM_USER=m
diff --git a/arch/powerpc/configs/pseries_defconfig 
b/arch/powerpc/configs/pseries_defconfig
index bea8587..1d4b976 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -53,7 +53,7 @@ CONFIG_PPC_64K_PAGES=y
 CONFIG_PPC_SUBPAGE_PROT=y
 CONFIG_SCHED_SMT=y
 CONFIG_PPC_DENORMALISATION=y
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_RPA=m
 CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
 CONFIG_PACKET=y
diff --git a/arch/sh/configs/sh03_defconfig b/arch/sh/configs/sh03_defconfig
index 2051821..0cf4097 100644
--- a/arch/sh/configs/sh03_defconfig
+++ b/arch/sh/configs/sh03_defconfig
@@ -22,7 +22,7 @@ CONFIG_PREEMPT=y
 CONFIG_CMDLINE_OVERWRITE=y
 CONFIG_CMDLINE="console=ttySC1,115200 mem=64M root=/dev/nfs"
 CONFIG_PCI=y
-CONFIG_HOTPLUG_PCI=m
+CONFIG_HOTPLUG_PCI=y
 CONFIG_BINFMT_MISC=y
 CONFIG_NET=y
 CONFIG_PACKET=y
diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
index bb7ebb2..d85009d 100644
--- a/drivers/pci/hotplug/Kconfig
+++ b/drivers/pci/hotplug/Kconfig
@@ -3,16 +3,13 @@
 #
 
 menuconfig HOTPLUG_PCI
-	tristate "Support for PCI Hotplug"
+	bool "Support for PCI Hotplug"
 	depends on PCI && SYSFS
 	---help---
 	  Say Y here if you have a motherboard with a PCI Hotplug controller.
 	  This allows you to add and remove PCI cards while the machine is
 	  powered up and running.
 
-	  To compile this driver as a module, choose M here: the
-	  module will be called pci_hotplug.
-
 	  When in doubt, say N.
 
 if HOTPLUG_PCI



[PATCH 2/2] PCI: pciehp: Convert pciehp to be builtin only, not modular

2013-07-25 Thread Bjorn Helgaas
Convert pciehp to be builtin only, with no module option.

Signed-off-by: Bjorn Helgaas bhelg...@google.com
Acked-by: Rafael J. Wysocki rafael.j.wyso...@intel.com
---
 drivers/pci/pcie/Kconfig |    5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/pci/pcie/Kconfig b/drivers/pci/pcie/Kconfig
index 569f82f..3b94cfc 100644
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -14,15 +14,12 @@ config PCIEPORTBUS
 # Include service Kconfig here
 #
 config HOTPLUG_PCI_PCIE
-	tristate "PCI Express Hotplug driver"
+	bool "PCI Express Hotplug driver"
 	depends on HOTPLUG_PCI && PCIEPORTBUS
 	help
 	  Say Y here if you have a motherboard that supports PCI Express Native
 	  Hotplug
 
-	  To compile this driver as a module, choose M here: the
-	  module will be called pciehp.
-
 	  When in doubt, say N.
 
 source "drivers/pci/pcie/aer/Kconfig"



Re: [PATCH] ASoC: fsl: Set sdma peripheral type directly

2013-07-25 Thread Mark Brown
On Thu, Jul 25, 2013 at 05:41:41PM +0800, Nicolin Chen wrote:
 Let CPU DAI drivers set SDMA peripheral type directly to support more
 dma types(SPDIF, ESAI) other than only two for SSI.
 This will easily allow some non-SSI drivers to use the imx-pcm-dma
 as well.

Applied, thanks.



Re: [PATCH 1/2] PCI: hotplug: Convert to be builtin only, not modular

2013-07-25 Thread Rafael J. Wysocki
On Thursday, July 25, 2013 11:57:20 AM Bjorn Helgaas wrote:
 Convert CONFIG_HOTPLUG_PCI from tristate to bool.  This only affects
 the hotplug core; several of the hotplug drivers can still be modules.
 
 Signed-off-by: Bjorn Helgaas bhelg...@google.com

Acked-by: Rafael J. Wysocki rafael.j.wyso...@intel.com

 ---
  arch/ia64/configs/generic_defconfig    |    2 +-
  arch/ia64/configs/gensparse_defconfig  |    2 +-
  arch/ia64/configs/tiger_defconfig      |    2 +-
  arch/ia64/configs/xen_domu_defconfig   |    2 +-
  arch/powerpc/configs/ppc64_defconfig   |    2 +-
  arch/powerpc/configs/ppc64e_defconfig  |    2 +-
  arch/powerpc/configs/pseries_defconfig |    2 +-
  arch/sh/configs/sh03_defconfig         |    2 +-
  drivers/pci/hotplug/Kconfig            |    5 +----
  9 files changed, 9 insertions(+), 12 deletions(-)
 
 diff --git a/arch/ia64/configs/generic_defconfig 
 b/arch/ia64/configs/generic_defconfig
 index 7913695..efbd292 100644
 --- a/arch/ia64/configs/generic_defconfig
 +++ b/arch/ia64/configs/generic_defconfig
 @@ -31,7 +31,7 @@ CONFIG_ACPI_FAN=m
  CONFIG_ACPI_DOCK=y
  CONFIG_ACPI_PROCESSOR=m
  CONFIG_ACPI_CONTAINER=m
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_ACPI=m
  CONFIG_PACKET=y
  CONFIG_UNIX=y
 diff --git a/arch/ia64/configs/gensparse_defconfig 
 b/arch/ia64/configs/gensparse_defconfig
 index f8e9133..f64980d 100644
 --- a/arch/ia64/configs/gensparse_defconfig
 +++ b/arch/ia64/configs/gensparse_defconfig
 @@ -25,7 +25,7 @@ CONFIG_ACPI_BUTTON=m
  CONFIG_ACPI_FAN=m
  CONFIG_ACPI_PROCESSOR=m
  CONFIG_ACPI_CONTAINER=m
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_ACPI=m
  CONFIG_PACKET=y
  CONFIG_UNIX=y
 diff --git a/arch/ia64/configs/tiger_defconfig 
 b/arch/ia64/configs/tiger_defconfig
 index a5a9e02..0f4e9e4 100644
 --- a/arch/ia64/configs/tiger_defconfig
 +++ b/arch/ia64/configs/tiger_defconfig
 @@ -31,7 +31,7 @@ CONFIG_ACPI_BUTTON=m
  CONFIG_ACPI_FAN=m
  CONFIG_ACPI_PROCESSOR=m
  CONFIG_ACPI_CONTAINER=m
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_ACPI=m
  CONFIG_PACKET=y
  CONFIG_UNIX=y
 diff --git a/arch/ia64/configs/xen_domu_defconfig 
 b/arch/ia64/configs/xen_domu_defconfig
 index 37b9b42..b025acf 100644
 --- a/arch/ia64/configs/xen_domu_defconfig
 +++ b/arch/ia64/configs/xen_domu_defconfig
 @@ -32,7 +32,7 @@ CONFIG_ACPI_BUTTON=m
  CONFIG_ACPI_FAN=m
  CONFIG_ACPI_PROCESSOR=m
  CONFIG_ACPI_CONTAINER=m
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_ACPI=m
  CONFIG_PACKET=y
  CONFIG_UNIX=y
 diff --git a/arch/powerpc/configs/ppc64_defconfig 
 b/arch/powerpc/configs/ppc64_defconfig
 index c86fcb9..0e8cfd0 100644
 --- a/arch/powerpc/configs/ppc64_defconfig
 +++ b/arch/powerpc/configs/ppc64_defconfig
 @@ -58,7 +58,7 @@ CONFIG_SCHED_SMT=y
  CONFIG_PPC_DENORMALISATION=y
  CONFIG_PCCARD=y
  CONFIG_ELECTRA_CF=y
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_RPA=m
  CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
  CONFIG_PACKET=y
 diff --git a/arch/powerpc/configs/ppc64e_defconfig 
 b/arch/powerpc/configs/ppc64e_defconfig
 index 4b20f76..0085dc4 100644
 --- a/arch/powerpc/configs/ppc64e_defconfig
 +++ b/arch/powerpc/configs/ppc64e_defconfig
 @@ -32,7 +32,7 @@ CONFIG_IRQ_ALL_CPUS=y
  CONFIG_SPARSEMEM_MANUAL=y
  CONFIG_PCI_MSI=y
  CONFIG_PCCARD=y
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_PACKET=y
  CONFIG_UNIX=y
  CONFIG_XFRM_USER=m
 diff --git a/arch/powerpc/configs/pseries_defconfig 
 b/arch/powerpc/configs/pseries_defconfig
 index bea8587..1d4b976 100644
 --- a/arch/powerpc/configs/pseries_defconfig
 +++ b/arch/powerpc/configs/pseries_defconfig
 @@ -53,7 +53,7 @@ CONFIG_PPC_64K_PAGES=y
  CONFIG_PPC_SUBPAGE_PROT=y
  CONFIG_SCHED_SMT=y
  CONFIG_PPC_DENORMALISATION=y
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_HOTPLUG_PCI_RPA=m
  CONFIG_HOTPLUG_PCI_RPA_DLPAR=m
  CONFIG_PACKET=y
 diff --git a/arch/sh/configs/sh03_defconfig b/arch/sh/configs/sh03_defconfig
 index 2051821..0cf4097 100644
 --- a/arch/sh/configs/sh03_defconfig
 +++ b/arch/sh/configs/sh03_defconfig
 @@ -22,7 +22,7 @@ CONFIG_PREEMPT=y
  CONFIG_CMDLINE_OVERWRITE=y
  CONFIG_CMDLINE="console=ttySC1,115200 mem=64M root=/dev/nfs"
  CONFIG_PCI=y
 -CONFIG_HOTPLUG_PCI=m
 +CONFIG_HOTPLUG_PCI=y
  CONFIG_BINFMT_MISC=y
  CONFIG_NET=y
  CONFIG_PACKET=y
 diff --git a/drivers/pci/hotplug/Kconfig b/drivers/pci/hotplug/Kconfig
 index bb7ebb2..d85009d 100644
 --- a/drivers/pci/hotplug/Kconfig
 +++ b/drivers/pci/hotplug/Kconfig
 @@ -3,16 +3,13 @@
  #
  
  menuconfig HOTPLUG_PCI
 -	tristate "Support for PCI Hotplug"
 +	bool "Support for PCI Hotplug"
  	depends on PCI && SYSFS
  	---help---
  	  Say Y here if you have a motherboard with a PCI Hotplug controller.
  	  This allows you to add and remove PCI cards while the machine is
  	  powered up and running.
  
 -	  To compile this driver as a module, choose M here: the
 -	  module will be called pci_hotplug.
 -
  	  When in doubt, say N.
  
  if HOTPLUG_PCI
 

Re: [PATCHv5 02/11] PCI: use weak functions for MSI arch-specific functions

2013-07-25 Thread Bjorn Helgaas
On Mon, Jul 15, 2013 at 5:52 AM, Thomas Petazzoni
thomas.petazz...@free-electrons.com wrote:
 Until now, the MSI architecture-specific functions could be overloaded
 using a fairly complex set of #define and compile-time
 conditionals. In order to prepare for the introduction of the msi_chip
 infrastructure, it is desirable to switch all those functions to use
 the 'weak' mechanism. This commit converts all the architectures that
 were overriding those MSI functions to use the new strategy.

 Note that we keep a separate, non-weak, function
 default_teardown_msi_irqs() for the default behavior of the
 arch_teardown_msi_irqs(), as the default behavior is needed by the Xen
 x86 PCI code.
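
For readers unfamiliar with the 'weak' mechanism mentioned above, here is a minimal sketch of the general idea, assuming a GCC- or Clang-compatible toolchain. The names toy_arch_setup_irqs() and toy_enable_irqs() and the -1 return value are hypothetical and are not taken from drivers/pci/msi.c.

```c
/*
 * Default implementation, marked weak (GCC/Clang attribute).  If another
 * object file in the same link provides a normal (strong) definition of
 * toy_arch_setup_irqs(), the linker picks that one and drops this one.
 * Hypothetical name and return value, for illustration only.
 */
__attribute__((weak)) int toy_arch_setup_irqs(int nvec)
{
	(void)nvec;
	return -1;	/* toy default: "not supported" */
}

/* Generic code calls the hook without any #ifdef or #define games. */
int toy_enable_irqs(int nvec)
{
	return toy_arch_setup_irqs(nvec);
}
```

With no strong override linked in, toy_enable_irqs() ends up in the weak default. An architecture that wants different behavior only has to define a non-weak toy_arch_setup_irqs() in its own source file, which is what replaces the old "#define arch_foo arch_foo" pattern.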

 Signed-off-by: Thomas Petazzoni thomas.petazz...@free-electrons.com
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 Cc: Paul Mackerras pau...@samba.org
 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: Martin Schwidefsky schwidef...@de.ibm.com
 Cc: Heiko Carstens heiko.carst...@de.ibm.com
 Cc: linux...@de.ibm.com
 Cc: linux-s...@vger.kernel.org
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Ingo Molnar mi...@redhat.com
 Cc: H. Peter Anvin h...@zytor.com
 Cc: x...@kernel.org
 Cc: Russell King li...@arm.linux.org.uk
 Cc: Tony Luck tony.l...@intel.com
 Cc: Fenghua Yu fenghua...@intel.com
 Cc: linux-i...@vger.kernel.org
 Cc: Ralf Baechle r...@linux-mips.org
 Cc: linux-m...@linux-mips.org
 Cc: David S. Miller da...@davemloft.net
 Cc: sparcli...@vger.kernel.org
 Cc: Chris Metcalf cmetc...@tilera.com

Acked-by: Bjorn Helgaas bhelg...@google.com

 ---
  arch/mips/include/asm/pci.h|  5 -
  arch/powerpc/include/asm/pci.h |  5 -
  arch/s390/include/asm/pci.h|  4 
  arch/x86/include/asm/pci.h | 28 --
  arch/x86/kernel/x86_init.c | 21 
  drivers/pci/msi.c  | 45 +++---
  include/linux/msi.h|  7 ++-
  7 files changed, 47 insertions(+), 68 deletions(-)

 diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
 index fa8e0aa..f194c08 100644
 --- a/arch/mips/include/asm/pci.h
 +++ b/arch/mips/include/asm/pci.h
 @@ -136,11 +136,6 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev 
 *dev, int channel)
 return channel ? 15 : 14;
  }

 -#ifdef CONFIG_CPU_CAVIUM_OCTEON
 -/* MSI arch hook for OCTEON */
 -#define arch_setup_msi_irqs arch_setup_msi_irqs
 -#endif
 -
  extern char * (*pcibios_plat_setup)(char *str);

  #ifdef CONFIG_OF
 diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
 index 6653f27..95145a1 100644
 --- a/arch/powerpc/include/asm/pci.h
 +++ b/arch/powerpc/include/asm/pci.h
 @@ -113,11 +113,6 @@ extern int pci_domain_nr(struct pci_bus *bus);
  /* Decide whether to display the domain number in /proc */
  extern int pci_proc_domain(struct pci_bus *bus);

 -/* MSI arch hooks */
 -#define arch_setup_msi_irqs arch_setup_msi_irqs
 -#define arch_teardown_msi_irqs arch_teardown_msi_irqs
 -#define arch_msi_check_device arch_msi_check_device
 -
  struct vm_area_struct;
  /* Map a range of PCI memory or I/O space for a device into user space */
  int pci_mmap_page_range(struct pci_dev *pdev, struct vm_area_struct *vma,
 diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
 index 6e577ba..262b91b 100644
 --- a/arch/s390/include/asm/pci.h
 +++ b/arch/s390/include/asm/pci.h
 @@ -21,10 +21,6 @@ void pci_iounmap(struct pci_dev *, void __iomem *);
  int pci_domain_nr(struct pci_bus *);
  int pci_proc_domain(struct pci_bus *);

 -/* MSI arch hooks */
 -#define arch_setup_msi_irqsarch_setup_msi_irqs
 -#define arch_teardown_msi_irqs arch_teardown_msi_irqs
 -
  #define ZPCI_BUS_NR0   /* default bus number */
  #define ZPCI_DEVFN 0   /* default device number */

 diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
 index d9e9e6c..8c61de0 100644
 --- a/arch/x86/include/asm/pci.h
 +++ b/arch/x86/include/asm/pci.h
 @@ -100,29 +100,6 @@ static inline void early_quirks(void) { }
  extern void pci_iommu_alloc(void);

  #ifdef CONFIG_PCI_MSI
 -/* MSI arch specific hooks */
 -static inline int x86_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 -{
 -   return x86_msi.setup_msi_irqs(dev, nvec, type);
 -}
 -
 -static inline void x86_teardown_msi_irqs(struct pci_dev *dev)
 -{
 -   x86_msi.teardown_msi_irqs(dev);
 -}
 -
 -static inline void x86_teardown_msi_irq(unsigned int irq)
 -{
 -   x86_msi.teardown_msi_irq(irq);
 -}
 -static inline void x86_restore_msi_irqs(struct pci_dev *dev, int irq)
 -{
 -   x86_msi.restore_msi_irqs(dev, irq);
 -}
 -#define arch_setup_msi_irqs x86_setup_msi_irqs
 -#define arch_teardown_msi_irqs x86_teardown_msi_irqs
 -#define arch_teardown_msi_irq x86_teardown_msi_irq
 -#define arch_restore_msi_irqs x86_restore_msi_irqs
  /* implemented in arch/x86/kernel/apic/io_apic. */
  struct msi_desc;
  int native_setup_msi_irqs(struct pci_dev *dev, int 

Re: [PATCH 0/7] rapidio: modularize rapidio subsystem

2013-07-25 Thread Jean Delvare
Hi Alexandre,

Le Friday 28 June 2013 à 15:18 -0400, Alexandre Bounine a écrit :
 The following set of patches modifies kernel RapidIO subsystem to enable build
 and use of its components as loadable kernel modules. Combinations of 
 statically
 linked and modular components are also supported.
 (...)

Thanks to this patch set, I was finally able to make all RapidIO support
modular in openSUSE kernels. This is very appreciated, thanks a lot!

-- 
Jean Delvare
Suse L3


Re: [PATCHv5 02/11] PCI: use weak functions for MSI arch-specific functions

2013-07-25 Thread Thierry Reding
On Mon, Jul 15, 2013 at 01:52:38PM +0200, Thomas Petazzoni wrote:
 Until now, the MSI architecture-specific functions could be overloaded
 using a fairly complex set of #define and compile-time
 conditionals. In order to prepare for the introduction of the msi_chip
 infrastructure, it is desirable to switch all those functions to use
 the 'weak' mechanism. This commit converts all the architectures that
 were overriding those MSI functions to use the new strategy.
 
 Note that we keep a separate, non-weak, function
 default_teardown_msi_irqs() for the default behavior of the
 arch_teardown_msi_irqs(), as the default behavior is needed by the Xen
 x86 PCI code.
 
 Signed-off-by: Thomas Petazzoni thomas.petazz...@free-electrons.com
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 Cc: Paul Mackerras pau...@samba.org
 Cc: linuxppc-dev@lists.ozlabs.org
 Cc: Martin Schwidefsky schwidef...@de.ibm.com
 Cc: Heiko Carstens heiko.carst...@de.ibm.com
 Cc: linux...@de.ibm.com
 Cc: linux-s...@vger.kernel.org
 Cc: Thomas Gleixner t...@linutronix.de
 Cc: Ingo Molnar mi...@redhat.com
 Cc: H. Peter Anvin h...@zytor.com
 Cc: x...@kernel.org
 Cc: Russell King li...@arm.linux.org.uk
 Cc: Tony Luck tony.l...@intel.com
 Cc: Fenghua Yu fenghua...@intel.com
 Cc: linux-i...@vger.kernel.org
 Cc: Ralf Baechle r...@linux-mips.org
 Cc: linux-m...@linux-mips.org
 Cc: David S. Miller da...@davemloft.net
 Cc: sparcli...@vger.kernel.org
 Cc: Chris Metcalf cmetc...@tilera.com
 ---
  arch/mips/include/asm/pci.h|  5 -
  arch/powerpc/include/asm/pci.h |  5 -
  arch/s390/include/asm/pci.h|  4 
  arch/x86/include/asm/pci.h | 28 --
  arch/x86/kernel/x86_init.c | 21 
  drivers/pci/msi.c  | 45 +++---
  include/linux/msi.h|  7 ++-
  7 files changed, 47 insertions(+), 68 deletions(-)

Bjorn,

any chance that we can get your Acked-by on this?

Thierry

 
 diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
 index fa8e0aa..f194c08 100644
 --- a/arch/mips/include/asm/pci.h
 +++ b/arch/mips/include/asm/pci.h
 @@ -136,11 +136,6 @@ static inline int pci_get_legacy_ide_irq(struct pci_dev 
 *dev, int channel)
   return channel ? 15 : 14;
  }
  
 -#ifdef CONFIG_CPU_CAVIUM_OCTEON
 -/* MSI arch hook for OCTEON */
 -#define arch_setup_msi_irqs arch_setup_msi_irqs
 -#endif
 -
  extern char * (*pcibios_plat_setup)(char *str);
  
  #ifdef CONFIG_OF
 diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
 index 6653f27..95145a1 100644
 --- a/arch/powerpc/include/asm/pci.h
 +++ b/arch/powerpc/include/asm/pci.h
 @@ -113,11 +113,6 @@ extern int pci_domain_nr(struct pci_bus *bus);
  /* Decide whether to display the domain number in /proc */
  extern int pci_proc_domain(struct pci_bus *bus);
  
 -/* MSI arch hooks */
 -#define arch_setup_msi_irqs arch_setup_msi_irqs
 -#define arch_teardown_msi_irqs arch_teardown_msi_irqs
 -#define arch_msi_check_device arch_msi_check_device
 -
  struct vm_area_struct;
  /* Map a range of PCI memory or I/O space for a device into user space */
  int pci_mmap_page_range(struct pci_dev *pdev, struct vm_area_struct *vma,
 diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
 index 6e577ba..262b91b 100644
 --- a/arch/s390/include/asm/pci.h
 +++ b/arch/s390/include/asm/pci.h
 @@ -21,10 +21,6 @@ void pci_iounmap(struct pci_dev *, void __iomem *);
  int pci_domain_nr(struct pci_bus *);
  int pci_proc_domain(struct pci_bus *);
  
 -/* MSI arch hooks */
 -#define arch_setup_msi_irqs  arch_setup_msi_irqs
 -#define arch_teardown_msi_irqs   arch_teardown_msi_irqs
 -
  #define ZPCI_BUS_NR  0   /* default bus number */
  #define ZPCI_DEVFN   0   /* default device number */
  
 diff --git a/arch/x86/include/asm/pci.h b/arch/x86/include/asm/pci.h
 index d9e9e6c..8c61de0 100644
 --- a/arch/x86/include/asm/pci.h
 +++ b/arch/x86/include/asm/pci.h
 @@ -100,29 +100,6 @@ static inline void early_quirks(void) { }
  extern void pci_iommu_alloc(void);
  
  #ifdef CONFIG_PCI_MSI
 -/* MSI arch specific hooks */
 -static inline int x86_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 -{
 - return x86_msi.setup_msi_irqs(dev, nvec, type);
 -}
 -
 -static inline void x86_teardown_msi_irqs(struct pci_dev *dev)
 -{
 - x86_msi.teardown_msi_irqs(dev);
 -}
 -
 -static inline void x86_teardown_msi_irq(unsigned int irq)
 -{
 - x86_msi.teardown_msi_irq(irq);
 -}
 -static inline void x86_restore_msi_irqs(struct pci_dev *dev, int irq)
 -{
 - x86_msi.restore_msi_irqs(dev, irq);
 -}
 -#define arch_setup_msi_irqs x86_setup_msi_irqs
 -#define arch_teardown_msi_irqs x86_teardown_msi_irqs
 -#define arch_teardown_msi_irq x86_teardown_msi_irq
 -#define arch_restore_msi_irqs x86_restore_msi_irqs
  /* implemented in arch/x86/kernel/apic/io_apic. */
  struct msi_desc;
  int native_setup_msi_irqs(struct pci_dev *dev, int 

Re: [RFC 11/14] powerpc: Eliminate NO_IRQ usage

2013-07-25 Thread Geert Uytterhoeven
On Wed, Jan 11, 2012 at 9:22 PM, Grant Likely grant.lik...@secretlab.ca wrote:
 NO_IRQ is evil.  Stop using it in arch/powerpc and powerpc device drivers

 diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
 index 3e06696..55c6ff9 100644
 --- a/sound/soc/fsl/fsl_ssi.c
 +++ b/sound/soc/fsl/fsl_ssi.c
 @@ -666,7 +666,7 @@ static int __devinit fsl_ssi_probe(struct platform_device 
 *pdev)
         ssi_private->ssi_phys = res.start;

         ssi_private->irq = irq_of_parse_and_map(np, 0);
 -       if (ssi_private->irq == NO_IRQ) {
 +       if (!ssi_private->irq) {
                 dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
                 ret = -ENXIO;
                 goto error_iomap;

What's the plan with this patch?

This is now failing on xtensa, as it's one of the architectures that doesn't
define NO_IRQ. Only arm, c6x, mn10300, openrisc, parisc, powerpc, and sparc
define it.

sound/soc/fsl/fsl_ssi.c:705:26: error: 'NO_IRQ' undeclared (first use
in this function)
make[4]: *** [sound/soc/fsl/fsl_ssi.o] Error 1

http://kisskb.ellerman.id.au/kisskb/buildresult/9187959/
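
As a rough illustration of the portable pattern the quoted hunk moves to, here is a small standalone sketch. It assumes that the mapping helper returns 0 when no interrupt could be mapped, as the "!ssi_private->irq" test in the patch implies. toy_parse_and_map() and toy_probe() are hypothetical stand-ins, not kernel functions.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for irq_of_parse_and_map(): returns a non-zero
 * interrupt number on success and 0 when there is nothing to map.
 */
unsigned int toy_parse_and_map(int has_irq)
{
	return has_irq ? 42u : 0u;
}

/*
 * Probe-style check that does not depend on NO_IRQ being defined by the
 * architecture: 0 is treated as "no interrupt".
 */
int toy_probe(int has_irq)
{
	unsigned int irq = toy_parse_and_map(has_irq);

	if (!irq)
		return -ENXIO;

	return 0;
}
```

Because the test is against 0 rather than an arch-provided macro, the same source builds on architectures such as xtensa that never define NO_IRQ.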

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds


[PATCH 1/2] powerpc: Add smp_generic_cpu_bootable

2013-07-25 Thread Andy Fleming
Cell and PSeries both implemented their own versions of a
cpu_bootable smp_op which do the same thing (well, the PSeries
one has support for more than 2 threads). Copy the PSeries one
to generic code, and rename it smp_generic_cpu_bootable.

Signed-off-by: Andy Fleming aflem...@freescale.com
---
 arch/powerpc/include/asm/smp.h |    2 ++
 arch/powerpc/kernel/smp.c      |   23 +++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ffbaabe..f2b5d41 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -182,6 +182,8 @@ extern int smt_enabled_at_boot;
 extern int smp_mpic_probe(void);
 extern void smp_mpic_setup_cpu(int cpu);
 extern int smp_generic_kick_cpu(int nr);
+extern int smp_generic_cpu_bootable(unsigned int nr);
+
 
 extern void smp_generic_give_timebase(void);
 extern void smp_generic_take_timebase(void);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 38b0ba6..3cd42aa 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -81,6 +81,29 @@ int smt_enabled_at_boot = 1;
 
 static void (*crash_ipi_function_ptr)(struct pt_regs *) = NULL;
 
+/*
+ * Returns 1 if the specified cpu should be brought up during boot.
+ * Used to inhibit booting threads if they've been disabled or
+ * limited on the command line
+ */
+int smp_generic_cpu_bootable(unsigned int nr)
+{
+	/* Special case - we inhibit secondary thread startup
+	 * during boot if the user requests it.
+	 */
+	if (system_state == SYSTEM_BOOTING && cpu_has_feature(CPU_FTR_SMT)) {
+		if (!smt_enabled_at_boot && cpu_thread_in_core(nr) != 0)
+			return 0;
+		if (smt_enabled_at_boot
+		    && cpu_thread_in_core(nr) >= smt_enabled_at_boot)
+			return 0;
+	}
+
+	return 1;
+}
+EXPORT_SYMBOL(smp_generic_cpu_bootable);
+
+
 #ifdef CONFIG_PPC64
 int smp_generic_kick_cpu(int nr)
 {
-- 
1.7.9.7
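
For anyone who wants to poke at the decision logic outside the kernel, here is a standalone sketch of the same checks. It is an approximation under stated assumptions: system_state, cpu_has_feature(CPU_FTR_SMT), smt_enabled_at_boot and cpu_thread_in_core(nr) are replaced by plain parameters, and toy_cpu_bootable() is a hypothetical name, not part of the patch.

```c
#include <stdbool.h>

/*
 * Standalone model of the cpu_bootable decision.  All kernel state is
 * passed in as parameters (hypothetical signature):
 *   booting             - stands in for system_state == SYSTEM_BOOTING
 *   has_smt             - stands in for cpu_has_feature(CPU_FTR_SMT)
 *   smt_enabled_at_boot - 0 means SMT off, N means allow N threads per core
 *   thread_in_core      - stands in for cpu_thread_in_core(nr)
 * Returns 1 if the cpu should be brought up, 0 otherwise.
 */
int toy_cpu_bootable(bool booting, bool has_smt,
		     int smt_enabled_at_boot, int thread_in_core)
{
	if (booting && has_smt) {
		/* SMT disabled: only thread 0 of each core may boot. */
		if (!smt_enabled_at_boot && thread_in_core != 0)
			return 0;
		/* SMT limited to N threads: threads N and above stay down. */
		if (smt_enabled_at_boot &&
		    thread_in_core >= smt_enabled_at_boot)
			return 0;
	}

	return 1;
}
```

For example, with a limit of 2 threads per core, thread 1 is allowed to boot and thread 2 is held back, while after boot (booting == false) nothing is inhibited.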




[PATCH 2/2] powerpc: Convert platforms to smp_generic_cpu_bootable

2013-07-25 Thread Andy Fleming
T4, Cell, powernv, and pseries had the same implementation, so switch
them to use a generic version. A2 apparently had a version, but
removed it at some point, so we remove the declaration, too.

Signed-off-by: Andy Fleming aflem...@freescale.com

Conflicts:

arch/powerpc/platforms/cell/smp.c
arch/powerpc/platforms/powernv/smp.c
arch/powerpc/platforms/pseries/smp.c

Change-Id: If12e2f83f7187ee5982dca9f89c68e0600f0cc49
---
 arch/powerpc/platforms/85xx/smp.c    |    1 +
 arch/powerpc/platforms/cell/smp.c    |   15 +--------------
 arch/powerpc/platforms/powernv/smp.c |   18 +-----------------
 arch/powerpc/platforms/pseries/smp.c |   18 +-----------------
 arch/powerpc/platforms/wsp/wsp.h     |    1 -
 5 files changed, 4 insertions(+), 49 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/smp.c 
b/arch/powerpc/platforms/85xx/smp.c
index 5ced4f5..ea9c626 100644
--- a/arch/powerpc/platforms/85xx/smp.c
+++ b/arch/powerpc/platforms/85xx/smp.c
@@ -255,6 +255,7 @@ out:
 
 struct smp_ops_t smp_85xx_ops = {
.kick_cpu = smp_85xx_kick_cpu,
+   .cpu_bootable = smp_generic_cpu_bootable,
 #ifdef CONFIG_HOTPLUG_CPU
.cpu_disable= generic_cpu_disable,
.cpu_die= generic_cpu_die,
diff --git a/arch/powerpc/platforms/cell/smp.c 
b/arch/powerpc/platforms/cell/smp.c
index f75f6fc..90745ea 100644
--- a/arch/powerpc/platforms/cell/smp.c
+++ b/arch/powerpc/platforms/cell/smp.c
@@ -136,25 +136,12 @@ static int smp_cell_kick_cpu(int nr)
return 0;
 }
 
-static int smp_cell_cpu_bootable(unsigned int nr)
-{
-	/* Special case - we inhibit secondary thread startup
-	 * during boot if the user requests it.  Odd-numbered
-	 * cpus are assumed to be secondary threads.
-	 */
-	if (system_state == SYSTEM_BOOTING &&
-	    cpu_has_feature(CPU_FTR_SMT) &&
-	    !smt_enabled_at_boot && cpu_thread_in_core(nr) != 0)
-		return 0;
-
-	return 1;
-}
 static struct smp_ops_t bpa_iic_smp_ops = {
.message_pass   = iic_message_pass,
.probe  = smp_iic_probe,
.kick_cpu   = smp_cell_kick_cpu,
.setup_cpu  = smp_cell_setup_cpu,
-   .cpu_bootable   = smp_cell_cpu_bootable,
+   .cpu_bootable   = smp_generic_cpu_bootable,
 };
 
 /* This is called very early */
diff --git a/arch/powerpc/platforms/powernv/smp.c 
b/arch/powerpc/platforms/powernv/smp.c
index 89e3857..908672b 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -46,22 +46,6 @@ static void pnv_smp_setup_cpu(int cpu)
xics_setup_cpu();
 }
 
-static int pnv_smp_cpu_bootable(unsigned int nr)
-{
-	/* Special case - we inhibit secondary thread startup
-	 * during boot if the user requests it.
-	 */
-	if (system_state == SYSTEM_BOOTING && cpu_has_feature(CPU_FTR_SMT)) {
-		if (!smt_enabled_at_boot && cpu_thread_in_core(nr) != 0)
-			return 0;
-		if (smt_enabled_at_boot
-		    && cpu_thread_in_core(nr) >= smt_enabled_at_boot)
-			return 0;
-	}
-
-	return 1;
-}
-
 int pnv_smp_kick_cpu(int nr)
 {
unsigned int pcpu = get_hard_smp_processor_id(nr);
@@ -195,7 +179,7 @@ static struct smp_ops_t pnv_smp_ops = {
.probe  = xics_smp_probe,
.kick_cpu   = pnv_smp_kick_cpu,
.setup_cpu  = pnv_smp_setup_cpu,
-   .cpu_bootable   = pnv_smp_cpu_bootable,
+   .cpu_bootable   = smp_generic_cpu_bootable,
 #ifdef CONFIG_HOTPLUG_CPU
.cpu_disable= pnv_smp_cpu_disable,
.cpu_die= generic_cpu_die,
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
index 306643c..ca2d1f6 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -187,22 +187,6 @@ static int smp_pSeries_kick_cpu(int nr)
return 0;
 }
 
-static int smp_pSeries_cpu_bootable(unsigned int nr)
-{
-   /* Special case - we inhibit secondary thread startup
-* during boot if the user requests it.
-*/
-   if (system_state == SYSTEM_BOOTING && cpu_has_feature(CPU_FTR_SMT)) {
-   if (!smt_enabled_at_boot && cpu_thread_in_core(nr) != 0)
-   return 0;
-   if (smt_enabled_at_boot
-       && cpu_thread_in_core(nr) >= smt_enabled_at_boot)
-   return 0;
-   }
-
-   return 1;
-}
-
 /* Only used on systems that support multiple IPI mechanisms */
 static void pSeries_cause_ipi_mux(int cpu, unsigned long data)
 {
@@ -237,7 +221,7 @@ static struct smp_ops_t pSeries_xics_smp_ops = {
.probe  = pSeries_smp_probe,
.kick_cpu   = smp_pSeries_kick_cpu,
.setup_cpu  = smp_xics_setup_cpu,
-   .cpu_bootable   = smp_pSeries_cpu_bootable,
+   .cpu_bootable   = smp_generic_cpu_bootable,
 };
 
 /* This is called very early */
diff --git 

Re: [PATCH] module: ppc64 module CRC relocation fix causes perf issues

2013-07-25 Thread Anton Blanchard

Hi Neil,

 Sorry I'm a bit late to the thread, I've been swamped.  Has someone
 tested this with kexec/kdump?  That's why the original patch was
 created, because when kexec loads the kernel at a different physical
 address, the relocations messed with the module CRCs, and modules
 couldn't load during the kexec boot.  Assuming that kernaddr_start
 gets set appropriately during boot, using PHYSICAL_START should be
 fine, but I wanted to check, and don't currently have access to a
 powerpc system to do so. Neil

I tested a relocatable kernel forced to run at a non zero physical
address (ie basically kdump). I verified CRCs were bad with your
original patch backed out, and were good with this patch applied.

Anton
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


SIGSTKSZ/MINSIGSTKSZ too small on 64bit

2013-07-25 Thread Anton Blanchard

Hi,

Alan has been looking at a glibc test failure. His analysis shows SEGVs
in signal handlers using sigaltstack, and that MINSIGSTKSZ and SIGSTKSZ
are too small.

We increased the size of rt_sigframe in commit 2b0a576d15e0
(powerpc: Add new transactional memory state to the signal context) but
didn't bump either SIGSTKSZ or MINSIGSTKSZ. We need to do that in both
the kernel and glibc, but I'm a bit worried we could have broken
existing applications that use sigaltstack.
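
For applications that want to be robust against a growing signal frame, one
defensive option is to size the alternate stack with some headroom instead
of relying on the bare constants. The following is a minimal sketch using
POSIX sigaltstack() and sigaction(); the helper names and the 64 KiB floor
are assumptions made for illustration, not values taken from the kernel or
glibc.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaltstack() and stack_t */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical floor, chosen only for illustration. */
#define ALT_STACK_FLOOR ((size_t)64 * 1024)

static char *alt_base;      /* start of the alternate stack */
static size_t alt_size;     /* its size in bytes */
volatile sig_atomic_t handler_on_altstack;

/* Use the larger of SIGSTKSZ and our own floor. */
size_t alt_stack_size(void)
{
	size_t sz = (size_t)SIGSTKSZ;

	return sz > ALT_STACK_FLOOR ? sz : ALT_STACK_FLOOR;
}

/* Record whether the handler's frame lies inside the alternate stack. */
static void on_signal(int sig)
{
	char probe;
	uintptr_t p = (uintptr_t)&probe;

	(void)sig;
	if (p >= (uintptr_t)alt_base && p < (uintptr_t)alt_base + alt_size)
		handler_on_altstack = 1;
}

/* Install the alternate stack and a SIGUSR1 handler that runs on it.
 * Returns 0 on success, -1 on failure. */
int install_alt_stack_handler(void)
{
	stack_t ss;
	struct sigaction sa;

	alt_size = alt_stack_size();
	alt_base = malloc(alt_size);
	if (alt_base == NULL)
		return -1;

	ss.ss_sp = alt_base;
	ss.ss_size = alt_size;
	ss.ss_flags = 0;
	if (sigaltstack(&ss, NULL) != 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;
	sa.sa_flags = SA_ONSTACK;   /* deliver on the alternate stack */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}
```

This does not help binaries that already hard-code MINSIGSTKSZ, which is
the compatibility concern raised above.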

Anton


Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
Hi Frederic,

On 07/25/2013 07:00 PM, Frederic Weisbecker wrote:
 On Thu, Jul 25, 2013 at 02:33:02PM +0530, Preeti U Murthy wrote:
 In the current design of timer offload framework, the broadcast cpu should
 *not* go into tickless idle so as to avoid missed wakeups on CPUs in deep
 idle states.

 Since we prevent the CPUs entering deep idle states from programming the
 lapic of the broadcast cpu for their respective next local events for
 reasons mentioned in PATCH[3/5], the broadcast CPU checks if there are any
 CPUs to be woken up during each of its timer interrupt programmed to its
 local events.

 With tickless idle, the broadcast CPU might not get a timer interrupt till
 after many ticks which can result in missed wakeups on CPUs in deep idle
 states. By disabling tickless idle, worst case, the tick_sched hrtimer will
 trigger a timer interrupt every period to check for broadcast.

 However the current setup of tickless idle does not let us make the choice
 of tickless on individual cpus. NOHZ_MODE_INACTIVE which disables tickless
 idle, is a system wide setting. Hence resort to an arch specific call to
 check if a cpu can go into tickless idle.
 
 Hi Preeti,
 
 I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
 mode. I read in the previous patch that's because in dynticks idle mode the
 broadcast CPU deactivates its lapic so it doesn't receive the IPI. But may be
 I misunderstood. Anyway that's not good for powersaving.

Let me elaborate. The CPUs in deep idle states have their lapics
deactivated. This means the next timer event which would typically have
been taken care of by a lapic firing at the appropriate moment does not
get taken care of in deep idle states, due to the lapic being switched off.

Hence such CPUs offload their next timer event to the broadcast CPU,
which should *not* enter deep idle states. The broadcast CPU has the
responsibility of waking the CPUs in deep idle states.

*The lapic of a broadcast CPU is active always*. Say CPUX, wants the
broadcast CPU to wake it up at timeX.  Since we cannot program the lapic
of a remote CPU, CPUX will need to send an IPI to the broadcast CPU,
asking it to program its lapic to fire at timeX so as to wake up CPUX.
*With multiple CPUs the overhead of sending IPI, could result in
performance bottlenecks and may not scale well.*

Hence the workaround is that the broadcast CPU, on each of its timer
interrupts, checks whether the next timer event of any CPU in deep idle
state has expired, which can be found from dev->next_event of that CPU.
For example, the timeX mentioned above may have expired. If so, the
broadcast handler is called to send an IPI to the idling CPU to wake it
up.
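
The check described in this paragraph can be sketched as a small user-space
model. This is an illustration only: struct cpu_model, broadcast_tick() and
the ipis_received counter are hypothetical names, and the code is not the
kernel's tick broadcast implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
	bool in_deep_idle;          /* local timer is switched off */
	uint64_t next_event;        /* absolute time (ns) of next timer event */
	unsigned int ipis_received; /* stands in for a delivered wakeup IPI */
};

/*
 * Called from the broadcast CPU's periodic timer interrupt at time `now`.
 * Every deep-idle CPU whose next event has expired gets an "IPI".
 * Returns the number of CPUs woken on this tick.
 */
int broadcast_tick(struct cpu_model cpus[], int ncpus, uint64_t now)
{
	int woken = 0;

	for (int i = 0; i < ncpus; i++) {
		if (cpus[i].in_deep_idle && cpus[i].next_event <= now) {
			cpus[i].ipis_received++;
			cpus[i].in_deep_idle = false;
			woken++;
		}
	}
	return woken;
}
```

In this model a wakeup is late by at most one tick period as long as
broadcast_tick() runs every period; if the broadcast CPU itself skips
ticks, the delay is no longer bounded.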

*If the broadcast CPU is in tickless idle, its timer interrupt could be
many ticks away. It could miss waking up a CPU in deep idle*, if that
CPU's wakeup is due well before this timer interrupt of the broadcast
CPU. Without tickless idle, we are assured of a timer interrupt at least
once every period, at which time broadcast handling is done as stated in
the previous paragraph, so we will not miss wakeups of CPUs in deep idle
states.

Yeah it is true that not allowing the broadcast CPU to enter tickless
idle is bad for power savings, but for the use case that we are aiming
at in this patch series, the current approach seems to be the best, with
minimal trade-offs in performance, power savings, scalability and no
change in the broadcast framework that exists today in the kernel.

 
 Also when an arch wants to prevent a CPU from entering dynticks idle mode,
 it typically use arch_needs_cpu(). May be that could fit for you as well?

Oh ok thanks :) I will look into this and get back on if we can use it.

Regards
Preeti U Murthy



Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
Hi Frederic,

On 07/25/2013 07:00 PM, Frederic Weisbecker wrote:
 On Thu, Jul 25, 2013 at 02:33:02PM +0530, Preeti U Murthy wrote:
 In the current design of timer offload framework, the broadcast cpu should
 *not* go into tickless idle so as to avoid missed wakeups on CPUs in deep
 idle states.

 Since we prevent the CPUs entering deep idle states from programming the
 lapic of the broadcast cpu for their respective next local events for
 reasons mentioned in PATCH[3/5], the broadcast CPU checks if there are any
 CPUs to be woken up during each of its timer interrupt programmed to its
 local events.

 With tickless idle, the broadcast CPU might not get a timer interrupt till
 after many ticks which can result in missed wakeups on CPUs in deep idle
 states. By disabling tickless idle, worst case, the tick_sched hrtimer will
 trigger a timer interrupt every period to check for broadcast.

 However the current setup of tickless idle does not let us make the choice
 of tickless on individual cpus. NOHZ_MODE_INACTIVE which disables tickless
 idle, is a system wide setting. Hence resort to an arch specific call to
 check if a cpu can go into tickless idle.
 
 Hi Preeti,
 
 I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
 mode. I read in the previous patch that's because in dynticks idle mode the
 broadcast CPU deactivates its lapic so it doesn't receive the IPI. But may be
 I misunderstood. Anyway that's not good for powersaving.
 
 Also when an arch wants to prevent a CPU from entering dynticks idle mode,
 it typically use arch_needs_cpu(). May be that could fit for you as well?

Yes this will suit our requirement perfectly. I will note down this
change for the next version of this patchset. Thank you very much for
pointing this out :)
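
A rough sketch of how such a hook could express "the broadcast CPU must
keep its tick", written as a user-space model. All names here
(model_arch_needs_cpu(), model_can_stop_tick(), model_set_broadcast_cpu())
are hypothetical; only the idea that a non-zero answer from the arch hook
keeps the tick running is taken from arch_needs_cpu().

```c
#include <stdbool.h>

/* Hypothetical: id of the CPU currently acting as the broadcast CPU. */
static int broadcast_cpu;

void model_set_broadcast_cpu(int cpu)
{
	broadcast_cpu = cpu;
}

/* Model of the arch hook: non-zero means "this CPU must keep its tick". */
int model_arch_needs_cpu(int cpu)
{
	return cpu == broadcast_cpu;
}

/* Model of the idle path: the tick may stop only if the arch agrees. */
bool model_can_stop_tick(int cpu)
{
	return !model_arch_needs_cpu(cpu);
}
```

With this shape the decision is per CPU, unlike the system-wide
NOHZ_MODE_INACTIVE setting mentioned in the patch description.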

Regards
Preeti U Murthy



Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Paul Mackerras
On Fri, Jul 26, 2013 at 08:09:23AM +0530, Preeti U Murthy wrote:
 Hi Frederic,
 
 On 07/25/2013 07:00 PM, Frederic Weisbecker wrote:
  Hi Preeti,
  
  I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
  mode. I read in the previous patch that's because in dynticks idle mode the
  broadcast CPU deactivates its lapic so it doesn't receive the IPI. But may be
  I misunderstood. Anyway that's not good for powersaving.
 
 Let me elaborate. The CPUs in deep idle states have their lapics
 deactivated. This means the next timer event which would typically have
 been taken care of by a lapic firing at the appropriate moment does not
 get taken care of in deep idle states, due to the lapic being switched off.

I really don't think it's helpful to use the term lapic in
connection with Power systems.  There is nothing that is called a
lapic in a Power machine.  The nearest equivalent of the LAPIC on
x86 machines is the ICP, the interrupt-controller presentation
element, of which there is one per CPU thread.

However, I don't believe the ICP gets disabled in deep sleep modes.
What does get disabled is the decrementer, which is a register that
normally counts down (at 512MHz) and generates an exception when it is
negative.  The decrementer *is* part of the CPU core, unlike the ICP.
That's why we can still get IPIs but not timer interrupts.
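
As a rough illustration of the behaviour described above, here is a small
user-space model of a decrementer. The 32-bit signed width, the clamping at
INT32_MIN and all names (struct dec_model, dec_advance(),
ns_to_dec_ticks()) are assumptions made for the sketch; only the 512MHz
rate and the "exception when negative" rule come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decrementer model. */
struct dec_model {
	int32_t value;  /* counts down; exception condition is value < 0 */
	bool enabled;   /* false models a deep idle state */
};

/* Convert nanoseconds to ticks at 512 MHz: 512e6 / 1e9 == 64 / 125. */
uint64_t ns_to_dec_ticks(uint64_t ns)
{
	return ns * 64 / 125;
}

/*
 * Advance the model by `ticks`. Returns true if a decrementer exception
 * would be pending afterwards. A disabled decrementer never fires, which
 * is why a CPU in that state needs an IPI from another CPU instead.
 */
bool dec_advance(struct dec_model *dec, uint32_t ticks)
{
	int64_t v;

	if (!dec->enabled)
		return false;

	v = (int64_t)dec->value - (int64_t)ticks;
	if (v < INT32_MIN)
		v = INT32_MIN;  /* model simplification: clamp, do not wrap */
	dec->value = (int32_t)v;
	return dec->value < 0;
}
```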

Please reword your patch description to not use the term lapic,
which is not defined in the Power context and is therefore just
causing confusion.

Paul.


[PATCH v2 3/3] powerpc/85xx: Add C293PCIE board support

2013-07-25 Thread Po Liu
From: Mingkai Hu mingkai...@freescale.com

C293PCIE board is a series of Freescale PCIe add-in cards to perform
as public key crypto accelerator or secure key management module.

 - 512KB platform SRAM in addition to 512K L2 Cache/SRAM
 - 512MB soldered DDR3 32bit memory
 - CPLD System Logic
 - 64MB x16 NOR flash and 4GB x8 NAND flash
 - 16MB SPI flash

Signed-off-by: Mingkai Hu mingkai...@freescale.com
Signed-off-by: Po Liu po@freescale.com
---
Changes for v2:
- Remove the JFFS2 partitions in NOR/NAND/SPI flash;
- Implement the NAND partitions;
- Remove the no use descriptions for cpld node;
- Add mpc85xx_smp_defconfig and mpc85xx_defconfig for C293;
- Remove the no use includes in c293pcie.c


 arch/powerpc/boot/dts/c293pcie.dts | 243 +
 arch/powerpc/configs/mpc85xx_defconfig |   1 +
 arch/powerpc/configs/mpc85xx_smp_defconfig |   1 +
 arch/powerpc/platforms/85xx/Kconfig|   6 +
 arch/powerpc/platforms/85xx/Makefile   |   1 +
 arch/powerpc/platforms/85xx/c293pcie.c |  75 +
 6 files changed, 327 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/c293pcie.dts
 create mode 100644 arch/powerpc/platforms/85xx/c293pcie.c

diff --git a/arch/powerpc/boot/dts/c293pcie.dts b/arch/powerpc/boot/dts/c293pcie.dts
new file mode 100644
index 000..dc91c47
--- /dev/null
+++ b/arch/powerpc/boot/dts/c293pcie.dts
@@ -0,0 +1,243 @@
+/*
+ * C293 PCIE Device Tree Source
+ *
+ * Copyright 2013 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/include/ fsl/c293si-pre.dtsi
+
+/ {
+   model = fsl,C293PCIE;
+   compatible = fsl,C293PCIE;
+
+   memory {
+   device_type = memory;
+   };
+
+   ifc: ifc@fffe1e000 {
+   reg = 0xf 0xffe1e000 0 0x2000;
+   ranges = 0x0 0x0 0xf 0xec00 0x0400
+ 0x2 0x0 0xf 0xffdf 0x0001;
+
+   };
+
+   soc: soc@fffe0 {
+   ranges = 0x0 0xf 0xffe0 0x10;
+   };
+
+   pci0: pcie@fffe0a000 {
+   reg = 0xf 0xffe0a000 0 0x1000;
+   ranges = 0x200 0x0 0x8000 0xc 0x 0x0 0x2000
+ 0x100 0x0 0x 0xf 0xffc0 0x0 0x1;
+   pcie@0 {
+   ranges = 0x200 0x0 0x8000
+ 0x200 0x0 0x8000
+ 0x0 0x2000
+
+ 0x100 0x0 0x0
+ 0x100 0x0 0x0
+ 0x0 0x10;
+   };
+   };
+};
+
+ifc {
+   nor@0,0 {
+   #address-cells = 1;
+   #size-cells = 1;
+   compatible = cfi-flash;
+   reg = 0x0 0x0 0x400;
+   bank-width = 2;
+   device-width = 1;
+
+   partition@0 {
+   /* 1MB for DTB Image */
+   reg = 0x0 0x0010;
+   label = NOR DTB Image;
+   };
+
+   partition@10 {
+   /* 8 MB for Linux Kernel Image */
+   reg = 0x0010 0x0080;
+ 

[PATCH v2 1/3] powerpc/85xx: Add SEC6.0 device tree

2013-07-25 Thread Po Liu
From: Mingkai Hu mingkai...@freescale.com

Add device tree for SEC 6.0 used on C29x silicon.

Signed-off-by: Mingkai Hu mingkai...@freescale.com
Signed-off-by: Po Liu po@freescale.com
---
Changes for v2:
- Remove the compatible sec v4.0/v4.4/v5.0;
- Add the device tree binding file fsl-sec6.txt;

 .../devicetree/bindings/crypto/fsl-sec6.txt| 162 +
 arch/powerpc/boot/dts/fsl/qoriq-sec6.0-0.dtsi  |  56 +++
 2 files changed, 218 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/crypto/fsl-sec6.txt
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-sec6.0-0.dtsi

diff --git a/Documentation/devicetree/bindings/crypto/fsl-sec6.txt b/Documentation/devicetree/bindings/crypto/fsl-sec6.txt
new file mode 100644
index 000..f6d2a69
--- /dev/null
+++ b/Documentation/devicetree/bindings/crypto/fsl-sec6.txt
@@ -0,0 +1,162 @@
+SEC 6 is Freescale's Cryptographic Accelerator and Assurance Module (CAAM).
+Currently the Freescale powerpc chip C29X has SEC 6 embedded.
+The SEC 6 device tree binding includes:
+   -SEC 6 Node
+   -Job Ring Node
+   -Full Example
+
+=
+SEC 6 Node
+
+Description
+
+Node defines the base address of the SEC 6 block.
+This block specifies the address range of all global
+configuration registers for the SEC 6 block.
+For example, In C293, we could see three SEC 6 node.
+
+PROPERTIES
+
+   - compatible
+  Usage: required
+  Value type: string
+  Definition: Must include fsl,sec-v6.0
+
+   - fsl,sec-era
+  Usage: optional
+  Value type: u32
+  Definition: A standard property. Define the 'ERA' of the SEC
+  device.
+
+   - #address-cells
+   Usage: required
+   Value type: u32
+   Definition: A standard property.  Defines the number of cells
+   for representing physical addresses in child nodes.
+
+   - #size-cells
+   Usage: required
+   Value type: u32
+   Definition: A standard property.  Defines the number of cells
+   for representing the size of physical addresses in
+   child nodes.
+
+   - reg
+  Usage: required
+  Value type: prop-encoded-array
+  Definition: A standard property.  Specifies the physical
+  address and length of the SEC 6 configuration registers.
+  registers
+
+   - ranges
+   Usage: required
+   Value type: prop-encoded-array
+   Definition: A standard property.  Specifies the physical address
+   range of the SEC 6.0 register space (-SNVS not included).  A
+   triplet that includes the child address, parent address, 
+   length.
+
+   Note: All other standard properties (see the ePAPR) are allowed
+   but are optional.
+
+
+EXAMPLE
+   crypto@a {
+   compatible = fsl,sec-v6.0;
+   fsl,sec-era = 6;
+   #address-cells = 1;
+   #size-cells = 1;
+   reg = 0xa 0x2;
+   ranges = 0 0xa 0x2;
+   };
+
+=
+Job Ring (JR) Node
+
+Child of the crypto node defines data processing interface to SEC 6 
+across the peripheral bus for purposes of processing
+cryptographic descriptors. The specified address
+range can be made visible to one (or more) cores.
+The interrupt defined for this node is controlled within
+the address range of this node.
+
+  - compatible
+  Usage: required
+  Value type: string
+  Definition: Must include fsl,sec-v6.0-job-ring, if it is
+  back compatible with old version, better add them all.
+
+  - reg
+  Usage: required
+  Value type: prop-encoded-array
+  Definition: Specifies a two JR parameters:  an offset from
+  the parent physical address and the length the JR registers.
+
+   - interrupts
+  Usage: required
+  Value type: prop_encoded-array
+  Definition:  Specifies the interrupts generated by this
+   device.  The value of the interrupts property
+   consists of one interrupt specifier. The format
+   of the specifier is defined by the binding document
+   describing the node's interrupt parent.
+
+EXAMPLE
+   jr@1000 {
+   compatible = fsl,sec-v6.0-job-ring;
+   reg = 0x1000 0x1000;
+   interrupts = 49 2 0 0;
+   };
+
+===
+Full Example
+
+Since some chips may embeded with more than one SEC 6, we abstract
+all the same properties into one file qoriq-sec6.0-0.dtsi. Each chip
+want to binding the node could simply include it in its own device
+node tree. Below is full example in C293PCIE:
+
+In qoriq-sec6.0-0.dtsi:
+
+   compatible = fsl,sec-v6.0;
+   fsl,sec-era = 6;
+   #address-cells = 1;
+   #size-cells = 1;
+
+   jr@1000 {
+   compatible = 

[PATCH v2 2/3] powerpc/85xx: Add silicon device tree for C293

2013-07-25 Thread Po Liu
From: Mingkai Hu mingkai...@freescale.com

Signed-off-by: Mingkai Hu mingkai...@freescale.com
Signed-off-by: Po Liu po@freescale.com
---
Changes for v2:
- None

 arch/powerpc/boot/dts/fsl/c293si-post.dtsi | 193 +
 arch/powerpc/boot/dts/fsl/c293si-pre.dtsi  |  63 ++
 2 files changed, 256 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/fsl/c293si-post.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/c293si-pre.dtsi

diff --git a/arch/powerpc/boot/dts/fsl/c293si-post.dtsi b/arch/powerpc/boot/dts/fsl/c293si-post.dtsi
new file mode 100644
index 000..bd20832
--- /dev/null
+++ b/arch/powerpc/boot/dts/fsl/c293si-post.dtsi
@@ -0,0 +1,193 @@
+/*
+ * C293 Silicon/SoC Device Tree Source (post include)
+ *
+ * Copyright 2012 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in the
+ *   documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *   names of its contributors may be used to endorse or promote products
+ *   derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License (GPL) as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+ifc {
+   #address-cells = 2;
+   #size-cells = 1;
+   compatible = fsl,ifc, simple-bus;
+   interrupts = 19 2 0 0;
+};
+
+/* controller at 0xa000 */
+pci0 {
+   compatible = fsl,qoriq-pcie-v2.2, fsl,qoriq-pcie;
+   device_type = pci;
+   #size-cells = 2;
+   #address-cells = 3;
+   bus-range = 0 255;
+   clock-frequency = ;
+   interrupts = 16 2 0 0;
+
+   pcie@0 {
+   reg = 0 0 0 0 0;
+   #interrupt-cells = 1;
+   #size-cells = 2;
+   #address-cells = 3;
+   device_type = pci;
+   interrupts = 16 2 0 0;
+   interrupt-map-mask = 0xf800 0 0 7;
+   interrupt-map = 
+   /* IDSEL 0x0 */
+    0x0 0x0 0x1 mpic 0x0 0x1 0x0 0x0
+    0x0 0x0 0x2 mpic 0x1 0x1 0x0 0x0
+    0x0 0x0 0x3 mpic 0x2 0x1 0x0 0x0
+    0x0 0x0 0x4 mpic 0x3 0x1 0x0 0x0
+   ;
+   };
+};
+
+soc {
+   #address-cells = 1;
+   #size-cells = 1;
+   device_type = soc;
+   compatible = simple-bus;
+   bus-frequency = 0;// Filled out by uboot.
+
+   ecm-law@0 {
+   compatible = fsl,ecm-law;
+   reg = 0x0 0x1000;
+   fsl,num-laws = 12;
+   };
+
+   ecm@1000 {
+   compatible = fsl,c293-ecm, fsl,ecm;
+   reg = 0x1000 0x1000;
+   interrupts = 16 2 0 0;
+   };
+
+   memory-controller@2000 {
+   compatible = fsl,c293-memory-controller;
+   reg = 0x2000 0x1000;
+   interrupts = 16 2 0 0;
+   };
+
+/include/ pq3-i2c-0.dtsi
+/include/ pq3-i2c-1.dtsi
+/include/ pq3-duart-0.dtsi
+/include/ pq3-espi-0.dtsi
+   spi0: spi@7000 {
+   fsl,espi-num-chipselects = 1;
+   };
+
+/include/ pq3-gpio-0.dtsi
+   L2: l2-cache-controller@2 {
+   compatible = fsl,c293-l2-cache-controller;
+   reg = 0x2 0x1000;
+   cache-line-size = 32; // 32 bytes
+   cache-size = 0x8; // L2,512K
+   interrupts = 16 2 0 0;
+   };
+
+/include/ pq3-dma-0.dtsi
+/include/ pq3-esdhc-0.dtsi
+   sdhc@2e000 {
+   compatible = fsl,c293-esdhc, fsl,esdhc;
+   sdhci,auto-cmd12;
+   

Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
Hi Paul,

On 07/26/2013 08:49 AM, Paul Mackerras wrote:
 On Fri, Jul 26, 2013 at 08:09:23AM +0530, Preeti U Murthy wrote:
 Hi Frederic,

 On 07/25/2013 07:00 PM, Frederic Weisbecker wrote:
 Hi Preeti,

 I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
 mode. I read in the previous patch that's because in dynticks idle mode the
 broadcast CPU deactivates its lapic so it doesn't receive the IPI. But may be
 I misunderstood. Anyway that's not good for powersaving.

 Let me elaborate. The CPUs in deep idle states have their lapics
 deactivated. This means the next timer event which would typically have
 been taken care of by a lapic firing at the appropriate moment does not
 get taken care of in deep idle states, due to the lapic being switched off.
 
 I really don't think it's helpful to use the term lapic in
 connection with Power systems.  There is nothing that is called a
 lapic in a Power machine.  The nearest equivalent of the LAPIC on
 x86 machines is the ICP, the interrupt-controller presentation
 element, of which there is one per CPU thread.
 
 However, I don't believe the ICP gets disabled in deep sleep modes.
 What does get disabled is the decrementer, which is a register that
 normally counts down (at 512MHz) and generates an exception when it is
 negative.  The decrementer *is* part of the CPU core, unlike the ICP.
 That's why we can still get IPIs but not timer interrupts.
 
 Please reword your patch description to not use the term lapic,
 which is not defined in the Power context and is therefore just
 causing confusion.

Noted. Thank you :) I will probably send out a fresh patchset with the
appropriate changelog to avoid this confusion ?
 
 Paul.
 
Regards
Preeti U Murthy



Re: [RFC 11/14] powerpc: Eliminate NO_IRQ usage

2013-07-25 Thread Grant Likely
On Thu, Jul 25, 2013 at 3:58 PM, Geert Uytterhoeven
ge...@linux-m68k.org wrote:
 On Wed, Jan 11, 2012 at 9:22 PM, Grant Likely grant.lik...@secretlab.ca 
 wrote:
 NO_IRQ is evil.  Stop using it in arch/powerpc and powerpc device drivers

 diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
 index 3e06696..55c6ff9 100644
 --- a/sound/soc/fsl/fsl_ssi.c
 +++ b/sound/soc/fsl/fsl_ssi.c
 @@ -666,7 +666,7 @@ static int __devinit fsl_ssi_probe(struct platform_device *pdev)
 ssi_private->ssi_phys = res.start;

 ssi_private->irq = irq_of_parse_and_map(np, 0);
 -   if (ssi_private->irq == NO_IRQ) {
 +   if (!ssi_private->irq) {
 dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
 ret = -ENXIO;
 goto error_iomap;

 What's the plan with this patch?

 This is now failing on xtensa, as it's one of the architectures that doesn't
 define NO_IRQ. Only arm, c6x, mn10300, openrisc, parisc, powerpc, and sparc
 define it.

Wow. I'd pretty much dropped that patch because I didn't have time to
chase it down. It should be pursued though.

In that particular case it is safe I think to apply the change. PPC
defines NO_IRQ to be 0 anyway.
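
The portable form of the check can be sketched as follows. This is a
user-space model: model_parse_and_map() is a hypothetical stand-in for
irq_of_parse_and_map(), which returns 0 when no interrupt could be mapped,
and the value 42 is arbitrary.

```c
#include <errno.h>

/* Hypothetical stand-in for irq_of_parse_and_map(): 0 means "no irq". */
unsigned int model_parse_and_map(int has_irq)
{
	return has_irq ? 42u : 0u;
}

/*
 * Probe-style check that does not depend on NO_IRQ being defined, so it
 * also builds on architectures that do not provide that macro.
 */
int model_probe(int has_irq)
{
	unsigned int irq = model_parse_and_map(has_irq);

	if (!irq)
		return -ENXIO;
	return 0;
}
```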

g.


Re: [Suggestion] powerpc: xmon: about 'longjmp' related warning.

2013-07-25 Thread Chen Gang
On 07/24/2013 08:38 AM, Chen Gang wrote:
 On 07/23/2013 09:58 PM, Michael Ellerman wrote:
 On Mon, Jul 22, 2013 at 03:02:53PM +0800, Chen Gang wrote:
 Hello Maintainers:

 With allmodconfig and EXTRA_CFLAGS=-W, it reports warnings below:


 arch/powerpc/xmon/xmon.c:3027:6: warning: variable ‘i’ might be clobbered 
 by ‘longjmp’ or ‘vfork’ [-Wclobbered]
 arch/powerpc/xmon/xmon.c:3068:6: warning: variable ‘i’ might be clobbered 
 by ‘longjmp’ or ‘vfork’ [-Wclobbered]

 In both these cases we are inside the body of a for loop and we do a
 if (setjmp) / else block. Although looking at the source the value of i
 is not modified by the setjmp, I guess it's possible that the compiler
 might reorder the increment of i inside the setjmp and lose the value
 when we longjmp.

 
 I should continue to confirm the details based on your valuable
 information, thanks.
 
 


For stop_spus() and restart_spus(), at least for now, the related warnings
are not a real issue: the variable 'i' is kept on the stack at 120(r1).
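
If the warning should be silenced rather than just analysed, the usual
defensive pattern is to make the loop variable volatile, so that its value
is always read from and written to memory around setjmp(). A minimal
user-space sketch of that pattern follows; risky_access() and
count_faults() are hypothetical names and this is not the xmon code.

```c
#include <setjmp.h>

static jmp_buf fault_jmp;

/* Stand-in for an access that may fault: "faults" on odd indices. */
static void risky_access(int idx)
{
	if (idx % 2)
		longjmp(fault_jmp, 1);
}

/* Returns how many indices in [0, n) took the longjmp() path. */
int count_faults(int n)
{
	volatile int i;          /* volatile: not cached in a register */
	volatile int faults = 0;

	for (i = 0; i < n; i++) {
		if (setjmp(fault_jmp) == 0)
			risky_access(i);
		else
			faults++;    /* reached via longjmp() */
	}
	return faults;
}
```

The cost is that every access to i goes through memory, which is what the
compiler already did for 'i' in the disassembly below.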

The related warning:

  arch/powerpc/xmon/xmon.c:3027:6: warning: variable ‘i’ might be clobbered by 
‘longjmp’ or ‘vfork’ [-Wclobbered]
  arch/powerpc/xmon/xmon.c:3068:6: warning: variable ‘i’ might be clobbered by 
‘longjmp’ or ‘vfork’ [-Wclobbered]

The related source code:

3024 static void stop_spus(void)
3025 {
3026 struct spu *spu;
3027 int i;
3028 u64 tmp;
3029 
3030 for (i = 0; i < XMON_NUM_SPUS; i++) { /* XMON_NUM_SPUS == 16 */
3031 if (!spu_info[i].spu)
3032 continue;
3033 
3034 if (setjmp(bus_error_jmp) == 0) {
3035 catch_memory_errors = 1;
3036 sync();
3037 
3038 spu = spu_info[i].spu;
3039 
3040 spu_info[i].saved_spu_runcntl_RW =
3041 in_be32(&spu->problem->spu_runcntl_RW);
3042 
3043 tmp = spu_mfc_sr1_get(spu);
3044 spu_info[i].saved_mfc_sr1_RW = tmp;
3045 
3046 tmp &= ~MFC_STATE1_MASTER_RUN_CONTROL_MASK;
3047 spu_mfc_sr1_set(spu, tmp);
3048 
3049 sync();
3050 __delay(200);
3051 
3052 spu_info[i].stopped_ok = 1;
3053 
3054 printf("Stopped spu %.2d (was %s)\n", i,
3055 spu_info[i].saved_spu_runcntl_RW ?
3056 "running" : "stopped");
3057 } else {
3058 catch_memory_errors = 0;
3059 printf("*** Error stopping spu %.2d\n", i);
3060 }
3061 catch_memory_errors = 0;
3062 }
3063 }
3064 



The related disassembly code:

  make ARCH=powerpc EXTRA_CFLAGS=-W
  powerpc64-linux-gnu-objdump -d vmlinux.o > vmlinux.S
  gcc version 4.7.1 20120606 (Red Hat 4.7.1-0.1.20120606) (GCC)
  GNU objdump version 2.23.51.0.3-1.fc16 20120918

c007cfd0 <.stop_spus>:
/* { */
c007cfd0:   7c 08 02 a6 mflr    r0
c007cfd4:   fb c1 ff f0 std r30,-16(r1)
c007cfd8:   fb e1 ff f8 std r31,-8(r1)
c007cfdc:   3d 22 00 0f addis   r9,r2,15
c007cfe0:   39 29 3e 10 addi    r9,r9,15888
c007cfe4:   3d 02 ff d4 addis   r8,r2,-44
c007cfe8:   39 29 21 50 addi    r9,r9,8528
c007cfec:   3d 42 ff d4 addis   r10,r2,-44
c007cff0:   39 08 83 f8 addi    r8,r8,-31752
c007cff4:   39 4a 83 d8 addi    r10,r10,-31784
c007cff8:   f8 01 00 10 std r0,16(r1)
c007cffc:   f8 21 ff 51 stdu    r1,-176(r1)
c007d000:   f9 21 00 70 std r9,112(r1)
c007d004:   39 20 00 00 li  r9,0
c007d008:   f9 21 00 78 std r9,120(r1)  ; i = 0;
c007d00c:   f9 01 00 88 std r8,136(r1)
c007d010:   f9 41 00 90 std r10,144(r1)
c007d014:   48 00 01 28 b   c007d13c <.stop_spus+0x16c>
c007d018:   60 00 00 00 nop
c007d01c:   60 00 00 00 nop


/*  if (setjmp(bus_error_jmp) == 0) { */
c007d020:   3d 22 00 0f addis   r9,r2,15
c007d024:   39 40 00 01 li  r10,1
c007d028:   39 29 3e 10 addi    r9,r9,15888
c007d02c:   91 49 20 c0 stw r10,8384(r9)
c007d030:   7c 00 04 ac sync
c007d034:   4c 00 01 2c isync
c007d038:   e9 01 00 80 ld  r8,128(r1)
c007d03c:   eb e8 00 00 ld  r31,0(r8)
c007d040:   e9 3f 00 20 ld  r9,32(r31)
c007d044:   7c 00 04 ac sync
c007d048:   81 29 40 1c lwz r9,16412(r9)
c007d04c:   0c 09 00 00 twi 0,r9,0
c007d050:   

Re: [RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
Hi Frederic,

I apologise for the confusion. As Paul pointed out maybe the usage of
the term lapic is causing a large amount of confusion. So please see the
clarification below. Maybe it will help answer your question.

On 07/26/2013 08:09 AM, Preeti U Murthy wrote:
 Hi Frederic,
 
 On 07/25/2013 07:00 PM, Frederic Weisbecker wrote:
 On Thu, Jul 25, 2013 at 02:33:02PM +0530, Preeti U Murthy wrote:
 In the current design of timer offload framework, the broadcast cpu should
 *not* go into tickless idle so as to avoid missed wakeups on CPUs in deep
 idle states.

 Since we prevent the CPUs entering deep idle states from programming the
 lapic of the broadcast cpu for their respective next local events for
 reasons mentioned in PATCH[3/5], the broadcast CPU checks if there are any
 CPUs to be woken up during each of its timer interrupts programmed to its
 local events.

 With tickless idle, the broadcast CPU might not get a timer interrupt till
 after many ticks, which can result in missed wakeups on CPUs in deep idle
 states. By disabling tickless idle, worst case, the tick_sched hrtimer will
 trigger a timer interrupt every period to check for broadcast.

 However the current setup of tickless idle does not let us make the choice
 of tickless on individual cpus. NOHZ_MODE_INACTIVE, which disables tickless
 idle, is a system wide setting. Hence resort to an arch specific call to
 check if a cpu can go into tickless idle.

 Hi Preeti,

 I'm not exactly sure why you can't enter the broadcast CPU in dynticks idle
 mode. I read in the previous patch that's because in dynticks idle mode the
 broadcast CPU deactivates its lapic so it doesn't receive the IPI. But maybe
 I misunderstood. Anyway that's not good for powersaving.

Firstly, when CPUs enter deep idle states, their local clock event
devices get switched off. In the case of powerpc, local clock event
device is the decrementer. Hence such CPUs *do not get timer interrupts*
but are still *capable of taking IPIs.*

So we need to ensure that some other CPU, in this case the broadcast
CPU, makes note of when the timer interrupt of the CPU in such deep idle
states is to trigger and at that moment issue an IPI to that CPU.

*The broadcast CPU however should have its decrementer active always*,
meaning it is disallowed from entering deep idle states, where the
decrementer switches off, precisely because the other idling CPUs bank
on it for the above mentioned reason.

 *The lapic of a broadcast CPU is active always*. Say CPUX, wants the
 broadcast CPU to wake it up at timeX.  Since we cannot program the lapic
 of a remote CPU, CPUX will need to send an IPI to the broadcast CPU,
 asking it to program its lapic to fire at timeX so as to wake up CPUX.
 *With multiple CPUs the overhead of sending IPI, could result in
 performance bottlenecks and may not scale well.*

Rewording the above: the decrementer of the broadcast CPU is always
active. Since we cannot program the clock event device of a remote CPU,
CPUX will need to send an IPI to the broadcast CPU (which the broadcast
CPU is very well capable of receiving), asking it to program its
decrementer to fire at timeX so as to wake up CPUX.
*With multiple CPUs, the overhead of sending IPIs could result in
performance bottlenecks and may not scale well.*

 
 Hence the workaround is that the broadcast CPU, on each of its timer
 interrupts, checks if the next timer event of any CPU in deep idle
 state has expired, which can very well be found from dev->next_event of
 that CPU. For example the timeX that has been mentioned above has
 expired. If so the broadcast handler is called to send an IPI to the
 idling CPU to wake it up.
 
 *If the broadcast CPU is in tickless idle, its timer interrupt could be
 many ticks away. It could miss waking up a CPU in deep idle*, if its
 wakeup is much before this timer interrupt of the broadcast CPU. But
 without tickless idle, at least at each period we are assured of a timer
 interrupt, at which time broadcast handling is done as stated in the
 previous paragraph, and we will not miss wakeups of CPUs in deep idle states.
 
 Yeah it is true that not allowing the broadcast CPU to enter tickless
 idle is bad for power savings, but for the use case that we are aiming
 at in this patch series, the current approach seems to be the best, with
 minimal trade-offs in performance, power savings, scalability and no
 change in the broadcast framework that exists today in the kernel.
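
The check described above can be sketched in plain C. This is only an
illustration of the idea, not the code from the patch series: struct
cpu_timer, broadcast_check() and wakeup_delay() are hypothetical names, and
time is modelled as a bare counter. It shows which idle CPUs would get an
IPI at a given broadcast-CPU timer interrupt, and how late a wakeup can be
when the broadcast CPU's own interrupts are far apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical per-CPU state; the kernel keeps the equivalent
 * information in each CPU's clock_event_device (next_event). */
struct cpu_timer {
	bool     in_deep_idle;  /* local decrementer is switched off */
	uint64_t next_event;    /* absolute time of next local timer event */
};

/*
 * Run from the broadcast CPU's own timer interrupt at time 'now'.
 * Returns a bitmask of deep-idle CPUs whose next event has expired
 * and which therefore need a wakeup IPI.
 */
static unsigned int broadcast_check(const struct cpu_timer *cpus,
				    int bc_cpu, uint64_t now)
{
	unsigned int mask = 0;

	assert(bc_cpu >= 0 && bc_cpu < NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == bc_cpu || !cpus[cpu].in_deep_idle)
			continue;
		if (cpus[cpu].next_event <= now)
			mask |= 1u << cpu;
	}
	return mask;
}

/*
 * Delay between an event due at 'event' and the first broadcast-CPU
 * timer interrupt at or after it, when those interrupts arrive at
 * period, 2*period, 3*period, ...
 */
static uint64_t wakeup_delay(uint64_t event, uint64_t period)
{
	uint64_t next_irq;

	assert(period > 0);
	next_irq = ((event + period - 1) / period) * period;
	return next_irq - event;
}
```

With a short period the delay is bounded by one tick; with a tickless
broadcast CPU the "period" is effectively unbounded, which is the missed
wakeup case discussed above.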
 

Regards
Preeti U Murthy

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: SIGSTKSZ/MINSIGSTKSZ too small on 64bit

2013-07-25 Thread Alan Modra
On Fri, Jul 26, 2013 at 12:23:25PM +1000, Anton Blanchard wrote:
 
 Hi,
 
 Alan has been looking at a glibc test fail. His analysis shows SEGVs
 in signal handlers using sigaltstack, and that MINSIGSTKSZ and SIGSTKSZ
 are too small.
 
 We increased the size of rt_sigframe in commit 2b0a576d15e0
 (powerpc: Add new transactional memory state to the signal context) but
 didn't bump either SIGSTKSZ or MINSIGSTKSZ. We need to do that in both
 the kernel and glibc, but I'm a bit worried we could have broken
 existing applications that use sigaltstack.

Before VSX changes, struct rt_sigframe size was 1920 plus 128 for
__SIGNAL_FRAMESIZE giving ppc64 exactly the default MINSIGSTKSZ of
2048.

After VSX, ucontext increased by 256 bytes.  Oops, we're over
MINSIGSTKSZ.  Add another ucontext for TM and rt_sigframe is now at
3872, giving actual MINSIGSTKSZ of 4000.

The glibc testcase that I was looking at was tst-cancel21, which
allocates 2*SIGSTKSZ (not because the test is trying to be
conservative, but because the test actually has nested signal stack
frames).  We blew the allocation by 48 bytes when using current
mainline gcc to compile glibc (le ppc64).

The required stack depth in _dl_lookup_symbol_x from the top of the
next signal frame was 10944 bytes.  I guess you'd want to add 288 to
that, implying an actual SIGSTKSZ of 11232.

I think we want
#define MINSIGSTKSZ 4096
#define SIGSTKSZ 16384
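
The arithmetic behind those two numbers can be checked with a few lines of
C. This is a sketch using only the byte counts quoted in this message; the
enum names and the round-up-to-a-power-of-two helper are assumptions made
for illustration (the message itself only proposes the final values).

```c
#include <assert.h>
#include <stddef.h>

/* Byte counts quoted in the message; the names are illustrative. */
enum {
	SIGNAL_FRAMESIZE     = 128,    /* __SIGNAL_FRAMESIZE */
	RT_SIGFRAME_PRE_VSX  = 1920,   /* struct rt_sigframe before VSX */
	RT_SIGFRAME_WITH_TM  = 3872,   /* after the VSX and TM additions */
	STACK_DEPTH_MEASURED = 10944,  /* depth seen in _dl_lookup_symbol_x */
	EXTRA_HEADROOM       = 288     /* extra amount suggested above */
};

/* Minimum signal stack: the frame plus the fixed signal frame overhead. */
static size_t min_sigstack(size_t rt_sigframe_size)
{
	return rt_sigframe_size + SIGNAL_FRAMESIZE;
}

/* One possible way to pick a tidy constant: next power of two. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}
```

Under these assumptions min_sigstack(1920) gives the old 2048, the current
frame needs 4000 (rounded to 4096), and 10944 + 288 = 11232 rounds to 16384.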

                                                 frame size  r1
#0  0x295cdaec in _dl_lookup_symbol_x(memset)         190
#1  0x295d3c4c in _dl_fixup()                          b0    10003310160
#2  0x295dc818 in _dl_runtime_resolve()                b0    10003310210
#3  0x1f59ea8c in uw_init_context_1()                 a30    100033102c0
#4  0x1f59f560 in libc:_Unwind_ForcedUnwind()         c90    10003310cf0
#5  0x1ffb9538 in pt:_Unwind_ForcedUnwind()            90    10003311980
#6  0x1ffb6418 in __pthread_unwind()                   70    10003311a10
#7  0x1ffaaeb0 in sigcancel_handler()                  70    10003311a80
#8  signal handler called 1ffe0448 tramp              fa0    10003311af0
10003311b70 rt_sigframe
  10003311c58 sigcontext.gp_regs
  10003311dd8 sigcontext.fp_regs
  10003311ee0 sigcontext.v_regs
  10003311ef0 sigcontext.vmx
100033128d8 rt_sigframe.pinfo  offset d68
10003312968 rt_sigframe.abigap
10003312a88 end + 8 alignment
#9  0x1ffb6f9c in                                      80    10003312a90
#10 0x1ffb6f84 in                                            10003312b10
#11 0x100020f4 in delete_temp_files()                  80    10003312dc0
#12 0x10002198 in                                            10003313070
#13 signal handler called
#14 0x1ffb6f9c in ?? ()
#15 0x1ffb6f84 in ?? ()
#16 0x10002274 in ?? ()
#17 0x10002430 in ?? ()
#18 0x10002644 in ?? ()
#19 0x10001a1c in ?? ()
#20 0x1fe17f0c in ?? ()
#21 0x1fe18134 in ?? ()
#22 0x in ?? ()


-- 
Alan Modra
Australia Development Lab, IBM


[Resend RFC PATCH 0/5] cpuidle/ppc: Timer offload framework to support deep idle states

2013-07-25 Thread Preeti U Murthy
On PowerPC, when CPUs enter deep idle states, their local timers are
switched off. The responsibility of waking them up at their next timer
event needs to be handed over to an external device. On architectures
like x86 this is done by an external device such as the HPET; PowerPC has
no equivalent. Instead we assign the local timer of one of the CPUs to do
this job.

This patchset is an attempt to make use of the existing timer broadcast
framework in the kernel to meet the above requirement, except that the tick
broadcast device is the local timer of the boot CPU.

This patch series is ported on top of 3.11-rc1 + the cpuidle driver backend
for powernv posted by Deepthi Dharwar recently. The current design and
implementation supports the ONESHOT tick mode. It does not yet support
the PERIODIC tick mode. This patch is tested with NOHZ_FULL off.

Patch[1/5], Patch[2/5]: optimize the broadcast mechanism on ppc.
Patch[3/5]: Introduces the core of the timer offload framework on powerpc.
Patch[4/5]: The cpu doing the broadcast should not go into tickless idle.
Patch[5/5]: Add a deep idle state to the cpuidle state table on powernv.

Patch[5/5] is the patch that ultimately makes use of the timer offload
framework that the patches Patch[1/5] to Patch[4/5] build.

This patch series is being resent to clarify certain ambiguity in the patch
descriptions from the previous post. Discussion around this:
https://lkml.org/lkml/2013/7/25/754

---

Preeti U Murthy (3):
  cpuidle/ppc: Add timer offload framework to support deep idle states
  cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints
  cpuidle/ppc: Add longnap state to the idle states on powernv

Srivatsa S. Bhat (2):
  powerpc: Free up the IPI message slot of ipi call function (PPC_MSG_CALL_FUNC)
  powerpc: Implement broadcast timer interrupt as an IPI message


 arch/powerpc/include/asm/smp.h  |3 +
 arch/powerpc/include/asm/time.h |3 +
 arch/powerpc/kernel/smp.c   |   23 --
 arch/powerpc/kernel/time.c  |   86 +++
 arch/powerpc/platforms/cell/interrupt.c |2 -
 arch/powerpc/platforms/powernv/Kconfig  |1 
 arch/powerpc/platforms/powernv/processor_idle.c |   48 +
 arch/powerpc/platforms/ps3/smp.c|2 -
 kernel/time/tick-sched.c|7 ++
 9 files changed, 163 insertions(+), 12 deletions(-)



[Resend RFC PATCH 2/5] powerpc: Implement broadcast timer interrupt as an IPI message

2013-07-25 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

For scalability and performance reasons, we want the broadcast timer
interrupts to be handled as efficiently as possible. Fixed IPI messages
are one of the most efficient mechanisms available - they are faster
than the smp_call_function mechanism because the IPI handlers are fixed
and hence they don't involve costly operations such as adding IPI handlers
to the target CPU's function queue, acquiring locks for synchronization etc.

Luckily we have an unused IPI message slot, so use that to implement
broadcast timer interrupts efficiently.
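
To illustrate why a fixed message slot is cheap, here is a small
stand-alone C model of the muxed IPI word. It is a sketch, not the kernel
code: it assumes one byte per message in a 32-bit word viewed big-endian
(hence the 24 - 8 * msg shift seen in smp_ipi_demux), and msg_bit(),
post_message(), demux() and PPC_MSG_COUNT are hypothetical names. The real
receiver takes the pending word with an atomic xchg; the model uses a plain
read and clear.

```c
#include <assert.h>
#include <stdint.h>

/* Message numbers as in the patch; PPC_MSG_COUNT is illustrative only. */
enum {
	PPC_MSG_TIMER            = 0,
	PPC_MSG_RESCHEDULE       = 1,
	PPC_MSG_CALL_FUNC_SINGLE = 2,
	PPC_MSG_DEBUGGER_BREAK   = 3,
	PPC_MSG_COUNT            = 4
};

/* Big-endian view: message 0 sits in the most significant byte. */
static uint32_t msg_bit(int msg)
{
	assert(msg >= 0 && msg < PPC_MSG_COUNT);
	return UINT32_C(1) << (24 - 8 * msg);
}

/* Sender side: mark a message as pending for the target CPU. */
static void post_message(uint32_t *pending, int msg)
{
	*pending |= msg_bit(msg);
}

/*
 * Receiver side: take every pending message in one go and count how
 * often each fixed handler would have run. No queueing or locking is
 * needed because the set of handlers is fixed at build time.
 */
static void demux(uint32_t *pending, int handled[PPC_MSG_COUNT])
{
	uint32_t all = *pending;  /* the kernel uses an atomic xchg here */

	*pending = 0;
	for (int msg = 0; msg < PPC_MSG_COUNT; msg++)
		if (all & msg_bit(msg))
			handled[msg]++;
}
```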

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/smp.h  |3 ++-
 arch/powerpc/kernel/smp.c   |   19 +++
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 51bf017..d877b69 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -117,7 +117,7 @@ extern int cpu_to_core_id(int cpu);
  *
  * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
  * in /proc/interrupts will be wrong!!! --Troy */
-#define PPC_MSG_UNUSED 0
+#define PPC_MSG_TIMER  0
 #define PPC_MSG_RESCHEDULE  1
 #define PPC_MSG_CALL_FUNC_SINGLE   2
 #define PPC_MSG_DEBUGGER_BREAK  3
@@ -190,6 +190,7 @@ extern struct smp_ops_t *smp_ops;
 
 extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
+extern void arch_send_tick_broadcast(const struct cpumask *mask);
 
 /* Definitions relative to the secondary CPU spin loop
  * and entry point. Not all of them exist on both 32 and
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index bc41e9f..6a68ca4 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include <asm/ptrace.h>
 #include <linux/atomic.h>
 #include <asm/irq.h>
+#include <asm/hw_irq.h>
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/prom.h>
@@ -111,9 +112,9 @@ int smp_generic_kick_cpu(int nr)
 }
 #endif /* CONFIG_PPC64 */
 
-static irqreturn_t unused_action(int irq, void *data)
+static irqreturn_t timer_action(int irq, void *data)
 {
-	/* This slot is unused and hence available for use, if needed */
+	timer_interrupt();
 	return IRQ_HANDLED;
 }
 
@@ -144,14 +145,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 }
 
 static irq_handler_t smp_ipi_action[] = {
-	[PPC_MSG_UNUSED] =  unused_action, /* Slot available for future use */
+	[PPC_MSG_TIMER] =  timer_action,
 	[PPC_MSG_RESCHEDULE] = reschedule_action,
 	[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
 	[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
-	[PPC_MSG_UNUSED] =  "ipi unused",
+	[PPC_MSG_TIMER] =  "ipi timer",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
 	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
@@ -221,6 +222,8 @@ irqreturn_t smp_ipi_demux(void)
 		all = xchg(&info->messages, 0);
 
 #ifdef __BIG_ENDIAN
+		if (all & (1 << (24 - 8 * PPC_MSG_TIMER)))
+			timer_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
@@ -266,6 +269,14 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 		do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
 }
 
+void arch_send_tick_broadcast(const struct cpumask *mask)
+{
+   unsigned int cpu;
+
+   for_each_cpu(cpu, mask)
+   do_message_pass(cpu, PPC_MSG_TIMER);
+}
+
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
 void smp_send_debugger_break(void)
 {
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 28166e4..1359113 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -213,7 +213,7 @@ static void iic_request_ipi(int msg)
 
 void iic_request_IPIs(void)
 {
-   iic_request_ipi(PPC_MSG_UNUSED);
+   iic_request_ipi(PPC_MSG_TIMER);
iic_request_ipi(PPC_MSG_RESCHEDULE);
iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 488f069..5cb742a 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -74,7 +74,7 @@ static int __init ps3_smp_probe(void)
* to index needs to be setup.
*/
 
-   BUILD_BUG_ON(PPC_MSG_UNUSED 

[Resend RFC PATCH 3/5] cpuidle/ppc: Add timer offload framework to support deep idle states

2013-07-25 Thread Preeti U Murthy
On ppc, in deep idle states, the local clock event device of CPUs gets
switched off. On PowerPC, the local clock event device is called the
decrementer. Make use of the broadcast framework to issue interrupts to
cpus in deep idle states on their timer events, except that on ppc, we
do not have an external device such as HPET, but we use the decrementer
of one of the CPUs itself as the broadcast device.

Instantiate two different clock event devices, one representing the
decrementer and another representing the broadcast device for each cpu.
The cpu which registers its broadcast device will be responsible for
performing the function of issuing timer interrupts to CPUs in deep idle
states, and is referred to as the broadcast cpu in the changelogs of this
patchset for convenience. Such a CPU is not allowed to enter deep idle
states, where the decrementer is switched off.

For now, only the boot cpu's broadcast device gets registered as a clock event
device along with the decrementer. Hence this is the broadcast cpu.

On the broadcast cpu, on each timer interrupt, apart from the regular local
timer event handler the broadcast handler is also called. We avoid the overhead
of programming the decrementer specifically for a broadcast event, for
performance and scalability reasons. Say cpuX goes to deep idle state. It
has to ask the broadcast CPU to reprogram its (the broadcast CPU's) decrementer
for the next local timer event of cpuX. cpuX can do so only by sending an IPI
to the broadcast CPU. With many more cpus going to deep idle, this model of
sending IPIs each time will result in a performance bottleneck and may not
scale well.

Apart from this there is no change in the way broadcast is handled today. On
a broadcast ipi the event handler for a timer interrupt is called on the cpu
in deep idle state to handle the local events.

The current design and implementation of the timer offload framework supports
the ONESHOT tick mode but not the PERIODIC mode.

Signed-off-by: Preeti U. Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/time.h|3 +
 arch/powerpc/kernel/smp.c  |4 +-
 arch/powerpc/kernel/time.c |   81 
 arch/powerpc/platforms/powernv/Kconfig |1 
 4 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/time.h b/arch/powerpc/include/asm/time.h
index c1f2676..936be0d 100644
--- a/arch/powerpc/include/asm/time.h
+++ b/arch/powerpc/include/asm/time.h
@@ -24,14 +24,17 @@ extern unsigned long tb_ticks_per_jiffy;
 extern unsigned long tb_ticks_per_usec;
 extern unsigned long tb_ticks_per_sec;
 extern struct clock_event_device decrementer_clockevent;
+extern struct clock_event_device broadcast_clockevent;
 
 struct rtc_time;
 extern void to_tm(int tim, struct rtc_time * tm);
 extern void GregorianDay(struct rtc_time *tm);
+extern void decrementer_timer_interrupt(void);
 
 extern void generic_calibrate_decr(void);
 
 extern void set_dec_cpu6(unsigned int val);
+extern int bc_cpu;
 
 /* Some sane defaults: 125 MHz timebase, 1GHz processor */
 extern unsigned long ppc_proc_freq;
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 6a68ca4..d3b7014 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -114,7 +114,7 @@ int smp_generic_kick_cpu(int nr)
 
 static irqreturn_t timer_action(int irq, void *data)
 {
-	timer_interrupt();
+	decrementer_timer_interrupt();
 	return IRQ_HANDLED;
 }
 
@@ -223,7 +223,7 @@ irqreturn_t smp_ipi_demux(void)
 
 #ifdef __BIG_ENDIAN
 		if (all & (1 << (24 - 8 * PPC_MSG_TIMER)))
-			timer_interrupt();
+			decrementer_timer_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 65ab9e9..7e858e1 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -42,6 +42,7 @@
 #include <linux/timex.h>
 #include <linux/kernel_stat.h>
 #include <linux/time.h>
+#include <linux/timer.h>
 #include <linux/init.h>
 #include <linux/profile.h>
 #include <linux/cpu.h>
@@ -97,8 +98,11 @@ static struct clocksource clocksource_timebase = {
 
 static int decrementer_set_next_event(unsigned long evt,
  struct clock_event_device *dev);
+static int broadcast_set_next_event(unsigned long evt,
+ struct clock_event_device *dev);
 static void decrementer_set_mode(enum clock_event_mode mode,
 struct clock_event_device *dev);
+static void decrementer_timer_broadcast(const struct cpumask *mask);
 
 struct clock_event_device decrementer_clockevent = {
 	.name   = "decrementer",
@@ -106,13 +110,26 @@ struct clock_event_device decrementer_clockevent = {
.irq= 

[Resend RFC PATCH 5/5] cpuidle/ppc: Add longnap state to the idle states on powernv

2013-07-25 Thread Preeti U Murthy
This patch hooks into the existing broadcast framework, using the support
that this patchset introduces for ppc and the cpuidle driver backend for
powernv (posted recently by Deepthi Dharwar), to add the sleep state as one
of the deep idle states, in which the decrementer is switched off.

However in this patch, we only emulate sleep by going into a state which does
a nap with the decrementer interrupts disabled, termed longnap. This keeps the
focus of this series on the timer broadcast framework for ppc, which is
required as a first step to enable sleep on ppc.

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/platforms/powernv/processor_idle.c |   48 +++
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/processor_idle.c b/arch/powerpc/platforms/powernv/processor_idle.c
index f43ad91a..9aca502 100644
--- a/arch/powerpc/platforms/powernv/processor_idle.c
+++ b/arch/powerpc/platforms/powernv/processor_idle.c
@@ -9,16 +9,18 @@
 #include <linux/cpuidle.h>
 #include <linux/cpu.h>
 #include <linux/notifier.h>
+#include <linux/clockchips.h>
 
 #include <asm/machdep.h>
 #include <asm/runlatch.h>
+#include <asm/time.h>
 
 struct cpuidle_driver powernv_idle_driver = {
 	.name = "powernv_idle",
 	.owner = THIS_MODULE,
 };
 
-#define MAX_IDLE_STATE_COUNT   2
+#define MAX_IDLE_STATE_COUNT   3
 
 static int max_idle_state = MAX_IDLE_STATE_COUNT - 1;
 static struct cpuidle_device __percpu *powernv_cpuidle_devices;
@@ -54,6 +56,43 @@ static int nap_loop(struct cpuidle_device *dev,
return index;
 }
 
+/* Emulate sleep, with long nap.
+ * During sleep, the core does not receive decrementer interrupts.
+ * Emulate sleep using long nap with decrementers interrupts disabled.
+ * This is an initial prototype to test the timer offload framework for ppc.
+ * We will eventually introduce the sleep state once the timer offload framework
+ * for ppc is stable.
+ */
+static int longnap_loop(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv,
+				int index)
+{
+	int cpu = dev->cpu;
+
+	unsigned long lpcr = mfspr(SPRN_LPCR);
+
+	lpcr &= ~(LPCR_MER | LPCR_PECE); /* lpcr[mer] must be 0 */
+
+   /* exit powersave upon external interrupt, but not decrementer
+* interrupt, Emulate sleep.
+*/
+   lpcr |= LPCR_PECE0;
+
+   if (cpu != bc_cpu) {
+   mtspr(SPRN_LPCR, lpcr);
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, cpu);
+   power7_nap();
+   clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT, cpu);
+   } else {
+   /* Wakeup on a decrementer interrupt, Do a nap */
+   lpcr |= LPCR_PECE1;
+   mtspr(SPRN_LPCR, lpcr);
+   power7_nap();
+   }
+
+   return index;
+}
+
 /*
  * States for dedicated partition case.
  */
@@ -72,6 +111,13 @@ static struct cpuidle_state powernv_states[MAX_IDLE_STATE_COUNT] = {
 		.exit_latency = 10,
 		.target_residency = 100,
 		.enter = nap_loop },
+	{ /* LongNap */
+		.name = "LongNap",
+		.desc = "LongNap",
+		.flags = CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 10,
+		.target_residency = 100,
+		.enter = longnap_loop },
 };
 
 static int powernv_cpuidle_add_cpu_notifier(struct notifier_block *n,



[Resend RFC PATCH 1/5] powerpc: Free up the IPI message slot of ipi call function (PPC_MSG_CALL_FUNC)

2013-07-25 Thread Preeti U Murthy
From: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com

The IPI handlers for both PPC_MSG_CALL_FUNC and PPC_MSG_CALL_FUNC_SINGLE
map to a common implementation - generic_smp_call_function_single_interrupt().
So, we can consolidate them and save one of the IPI message slots (which are
precious, since only 4 of those slots are available).

So, implement the functionality of PPC_MSG_CALL_FUNC using
PPC_MSG_CALL_FUNC_SINGLE itself and release its IPI message slot, so that it
can be used for something else in the future, if desired.

Signed-off-by: Srivatsa S. Bhat srivatsa.b...@linux.vnet.ibm.com
Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/include/asm/smp.h  |2 +-
 arch/powerpc/kernel/smp.c   |   12 +---
 arch/powerpc/platforms/cell/interrupt.c |2 +-
 arch/powerpc/platforms/ps3/smp.c|2 +-
 4 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index ffbaabe..51bf017 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -117,7 +117,7 @@ extern int cpu_to_core_id(int cpu);
  *
  * Make sure this matches openpic_request_IPIs in open_pic.c, or what shows up
  * in /proc/interrupts will be wrong!!! --Troy */
-#define PPC_MSG_CALL_FUNCTION   0
+#define PPC_MSG_UNUSED 0
 #define PPC_MSG_RESCHEDULE  1
 #define PPC_MSG_CALL_FUNC_SINGLE   2
 #define PPC_MSG_DEBUGGER_BREAK  3
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 38b0ba6..bc41e9f 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -111,9 +111,9 @@ int smp_generic_kick_cpu(int nr)
 }
 #endif /* CONFIG_PPC64 */
 
-static irqreturn_t call_function_action(int irq, void *data)
+static irqreturn_t unused_action(int irq, void *data)
 {
-   generic_smp_call_function_interrupt();
+   /* This slot is unused and hence available for use, if needed */
return IRQ_HANDLED;
 }
 
@@ -144,14 +144,14 @@ static irqreturn_t debug_ipi_action(int irq, void *data)
 }
 
 static irq_handler_t smp_ipi_action[] = {
-   [PPC_MSG_CALL_FUNCTION] =  call_function_action,
+   [PPC_MSG_UNUSED] =  unused_action, /* Slot available for future use */
[PPC_MSG_RESCHEDULE] = reschedule_action,
[PPC_MSG_CALL_FUNC_SINGLE] = call_function_single_action,
[PPC_MSG_DEBUGGER_BREAK] = debug_ipi_action,
 };
 
 const char *smp_ipi_name[] = {
-	[PPC_MSG_CALL_FUNCTION] =  "ipi call function",
+	[PPC_MSG_UNUSED] =  "ipi unused",
 	[PPC_MSG_RESCHEDULE] = "ipi reschedule",
 	[PPC_MSG_CALL_FUNC_SINGLE] = "ipi call function single",
 	[PPC_MSG_DEBUGGER_BREAK] = "ipi debugger",
@@ -221,8 +221,6 @@ irqreturn_t smp_ipi_demux(void)
 		all = xchg(&info->messages, 0);
 
 #ifdef __BIG_ENDIAN
-		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNCTION)))
-			generic_smp_call_function_interrupt();
 		if (all & (1 << (24 - 8 * PPC_MSG_RESCHEDULE)))
 			scheduler_ipi();
 		if (all & (1 << (24 - 8 * PPC_MSG_CALL_FUNC_SINGLE)))
@@ -265,7 +263,7 @@ void arch_send_call_function_ipi_mask(const struct cpumask *mask)
unsigned int cpu;
 
for_each_cpu(cpu, mask)
-   do_message_pass(cpu, PPC_MSG_CALL_FUNCTION);
+   do_message_pass(cpu, PPC_MSG_CALL_FUNC_SINGLE);
 }
 
 #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC)
diff --git a/arch/powerpc/platforms/cell/interrupt.c b/arch/powerpc/platforms/cell/interrupt.c
index 2d42f3b..28166e4 100644
--- a/arch/powerpc/platforms/cell/interrupt.c
+++ b/arch/powerpc/platforms/cell/interrupt.c
@@ -213,7 +213,7 @@ static void iic_request_ipi(int msg)
 
 void iic_request_IPIs(void)
 {
-   iic_request_ipi(PPC_MSG_CALL_FUNCTION);
+   iic_request_ipi(PPC_MSG_UNUSED);
iic_request_ipi(PPC_MSG_RESCHEDULE);
iic_request_ipi(PPC_MSG_CALL_FUNC_SINGLE);
iic_request_ipi(PPC_MSG_DEBUGGER_BREAK);
diff --git a/arch/powerpc/platforms/ps3/smp.c b/arch/powerpc/platforms/ps3/smp.c
index 4b35166..488f069 100644
--- a/arch/powerpc/platforms/ps3/smp.c
+++ b/arch/powerpc/platforms/ps3/smp.c
@@ -74,7 +74,7 @@ static int __init ps3_smp_probe(void)
* to index needs to be setup.
*/
 
-   BUILD_BUG_ON(PPC_MSG_CALL_FUNCTION!= 0);
+   BUILD_BUG_ON(PPC_MSG_UNUSED   != 0);
BUILD_BUG_ON(PPC_MSG_RESCHEDULE   != 1);
BUILD_BUG_ON(PPC_MSG_CALL_FUNC_SINGLE != 2);
BUILD_BUG_ON(PPC_MSG_DEBUGGER_BREAK   != 3);



Re: [V2 2/2] powerpc/512x: add LocalPlus Bus FIFO device driver

2013-07-25 Thread Alexander Popov
Thanks, Gerhard.

I'll improve the code and return with the third version.

Best regards,
Alexander.


[Resend RFC PATCH 4/5] cpuidle/ppc: CPU goes tickless if there are no arch-specific constraints

2013-07-25 Thread Preeti U Murthy
In the current design of timer offload framework, the broadcast cpu should
*not* go into tickless idle so as to avoid missed wakeups on CPUs in deep idle
states.

Since we prevent the CPUs entering deep idle states from programming the
decrementer of the broadcast cpu for their respective next local events for
reasons mentioned in PATCH[3/5], the broadcast CPU checks if there are any
CPUs to be woken up during each of its timer interrupt, which is programmed
to its local events.

With tickless idle, the broadcast CPU might not have a timer interrupt
pending till after many ticks, which can result in missed wakeups on CPUs
in deep idle states. By disabling tickless idle, worst case, the tick_sched
hrtimer will trigger a timer interrupt every period.

However the current setup of tickless idle does not let us make the choice
of tickless on individual cpus. NOHZ_MODE_INACTIVE, which disables tickless
idle, is a system wide setting. Hence resort to an arch specific call to check
if a cpu can go into tickless idle.
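
The override mechanism relied on here can be sketched outside the kernel as
follows. This is a minimal illustration, assuming a GCC or Clang toolchain
for the weak attribute (the kernel spells it __weak); every sketch_* name
is hypothetical, and the arch hook is passed as a function pointer only so
that both variants can live in one file.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the broadcast CPU id (bc_cpu in the patch). */
static int sketch_bc_cpu = 0;

/*
 * Generic default: any CPU may stop its tick. Because the symbol is
 * weak, an architecture can supply a normal (strong) definition with
 * the same name and the linker will pick that one instead.
 */
__attribute__((weak)) int sketch_arch_can_stop_idle_tick(int cpu)
{
	(void)cpu;
	return 1;
}

/* What a powerpc-style override would compute. */
static int sketch_ppc_can_stop_idle_tick(int cpu)
{
	return cpu != sketch_bc_cpu;
}

/* Generic caller: consult the arch hook before any other check. */
static bool sketch_can_stop_idle_tick(int cpu, int (*arch_hook)(int))
{
	assert(arch_hook != 0);
	if (!arch_hook(cpu))
		return false;
	/* ... the generic checks would follow here ... */
	return true;
}
```

The design choice is that architectures without a broadcast CPU pay
nothing: they never define the hook and the default always answers yes.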

Signed-off-by: Preeti U Murthy pre...@linux.vnet.ibm.com
---

 arch/powerpc/kernel/time.c |5 +
 kernel/time/tick-sched.c   |7 +++
 2 files changed, 12 insertions(+)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 7e858e1..916c32f 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -864,6 +864,11 @@ static void decrementer_timer_broadcast(const struct cpumask *mask)
 	arch_send_tick_broadcast(mask);
 }
 
+int arch_can_stop_idle_tick(int cpu)
+{
+	return cpu != bc_cpu;
+}
+
 static void register_decrementer_clockevent(int cpu)
 {
 	struct clock_event_device *dec = &per_cpu(decrementers, cpu);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6960172..e9ffa84 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,8 +700,15 @@ static void tick_nohz_full_stop_tick(struct tick_sched *ts)
 #endif
 }
 
+int __weak arch_can_stop_idle_tick(int cpu)
+{
+   return 1;
+}
+
 static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 {
+   if (!arch_can_stop_idle_tick(cpu))
+   return false;
/*
 * If this cpu is offline and it is the one which updates
 * jiffies, then give up the assignment and let it be taken by



[PATCH 1/4] powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN

2013-07-25 Thread Bharat Bhushan
For book3e, _PAGE_ENDIAN is not defined. In fact what is defined
is _PAGE_LENDIAN, which is wrong; it should be _PAGE_ENDIAN.
There are no compilation errors because
arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0
when it is not defined anywhere else.
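
A minimal stand-alone sketch of how such a fallback hides the misnamed
macro (the #ifndef block is written in the style of pte-common.h, not
copied from it, and sketch_endian_flag() is a hypothetical helper):

```c
#include <assert.h>

/* What the book3e header effectively provided before the fix. */
#define _PAGE_LENDIAN 0x08

/* Fallback in the style of pte-common.h: missing flags become 0. */
#ifndef _PAGE_ENDIAN
#define _PAGE_ENDIAN 0
#endif

/* Any user of _PAGE_ENDIAN silently gets 0 instead of 0x08. */
static unsigned long sketch_endian_flag(void)
{
	return _PAGE_ENDIAN;
}
```

Everything builds cleanly, but the flag a caller ORs into a PTE is 0, so
the endian attribute is never actually set.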

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/include/asm/pte-book3e.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/pte-book3e.h
index 0156702..576ad88 100644
--- a/arch/powerpc/include/asm/pte-book3e.h
+++ b/arch/powerpc/include/asm/pte-book3e.h
@@ -40,7 +40,7 @@
 #define _PAGE_U1   0x01
 #define _PAGE_U0   0x02
 #define _PAGE_ACCESSED 0x04
-#define _PAGE_LENDIAN  0x08
+#define _PAGE_ENDIAN   0x08
 #define _PAGE_GUARDED  0x10
 #define _PAGE_COHERENT 0x20 /* M: enforce memory coherence */
 #define _PAGE_NO_CACHE 0x40 /* I: cache inhibit */
-- 
1.7.0.4




[PATCH 3/4] kvm: powerpc: allow guest control G attribute in mas2

2013-07-25 Thread Bharat Bhushan
The G bit in MAS2 indicates whether the page is Guarded.
There is no reason to stop the guest from setting G, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index 277cb18..4fd9650 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1 | MAS2_E)
+ (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4




[PATCH 2/4] kvm: powerpc: allow guest control E attribute in mas2

2013-07-25 Thread Bharat Bhushan
The E bit in MAS2 indicates whether the page is accessed
in Little-Endian or Big-Endian byte order.
There is no reason to stop the guest from setting E, so allow it.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h
index c2e5e98..277cb18 100644
--- a/arch/powerpc/kvm/e500.h
+++ b/arch/powerpc/kvm/e500.h
@@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu)
 #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW)
 #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW)
 #define MAS2_ATTRIB_MASK \
- (MAS2_X0 | MAS2_X1)
+ (MAS2_X0 | MAS2_X1 | MAS2_E)
 #define MAS3_ATTRIB_MASK \
  (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \
   | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK)
-- 
1.7.0.4
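
Note (illustrative sketch, not part of the patch): the E attribute selects the
byte order used for accesses to the page. The userspace example below only
shows what that choice means for a 32-bit load from the same four bytes; it
does not touch the MMU. The helper names load_be32(), load_le32() and
demo_byte_order() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret four bytes as a big-endian 32-bit value (most significant byte first). */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Interpret the same four bytes as a little-endian 32-bit value. */
static uint32_t load_le32(const uint8_t *p)
{
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* The same memory contents yield different values depending on byte order. */
static void demo_byte_order(void)
{
	const uint8_t bytes[4] = { 0x12, 0x34, 0x56, 0x78 };

	assert(load_be32(bytes) == 0x12345678u);
	assert(load_le32(bytes) == 0x78563412u);
}
```

A guest that maps a page with a different byte order than the host default
therefore needs the E bit to reach the shadow TLB entry unchanged.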




[PATCH 4/4] kvm: powerpc: set cache coherency only for RAM pages

2013-07-25 Thread Bharat Bhushan
If the page is RAM, map it as cacheable and coherent (set the M bit).
Otherwise treat the page as I/O and map it as cache-inhibited and
guarded (set I + G).

This helps set up the proper MMU mapping for a directly assigned device.

NOTE: Some devices may require a cacheable mapping, which is not yet
supported.

Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com
---
 arch/powerpc/kvm/e500_mmu_host.c |   24 +++++++++++++++++++-----
 1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c
index 1c6a9d7..5cbdc8f 100644
--- a/arch/powerpc/kvm/e500_mmu_host.c
+++ b/arch/powerpc/kvm/e500_mmu_host.c
@@ -64,13 +64,27 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode)
 	return mas3;
 }
 
-static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode)
+static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn)
 {
+	u32 mas2_attr;
+
+	mas2_attr = mas2 & MAS2_ATTRIB_MASK;
+
+	if (kvm_is_mmio_pfn(pfn)) {
+		/*
+		 * If page is not RAM then it is treated as I/O page.
+		 * Map it with cache inhibited and guarded (set I + G).
+		 */
+		mas2_attr |= MAS2_I | MAS2_G;
+		return mas2_attr;
+	}
+
+	/* Map RAM pages as cacheable (Not setting I in MAS2) */
 #ifdef CONFIG_SMP
-	return (mas2 & MAS2_ATTRIB_MASK) | MAS2_M;
-#else
-	return mas2 & MAS2_ATTRIB_MASK;
+	/* Also map as coherent (set M) in SMP */
+	mas2_attr |= MAS2_M;
 #endif
+	return mas2_attr;
 }
 
 /*
@@ -313,7 +327,7 @@ static void kvmppc_e500_setup_stlbe(
 	/* Force IPROT=0 for all guest mappings. */
 	stlbe->mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID;
 	stlbe->mas2 = (gvaddr & MAS2_EPN) |
-		      e500_shadow_mas2_attrib(gtlbe->mas2, pr);
+		      e500_shadow_mas2_attrib(gtlbe->mas2, pfn);
 	stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) |
 			e500_shadow_mas3_attrib(gtlbe->mas7_3, pr);
 
-- 
1.7.0.4
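
Note (illustrative sketch, not part of the patch): the attribute selection
described above can be tried out in userspace. This is a standalone
approximation under stated assumptions: the MAS2_* values are assumed to
follow the Book3E layout, the is_mmio flag is a hypothetical stand-in for the
result of kvm_is_mmio_pfn(pfn), and the smp flag replaces the CONFIG_SMP
compile-time switch. The function names shadow_mas2_attrib_sketch() and
demo_shadow_attrib() are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed MAS2 attribute bits (Book3E layout); verify before relying on them. */
#define MAS2_X0 0x00000040u
#define MAS2_X1 0x00000020u
#define MAS2_W  0x00000010u
#define MAS2_I  0x00000008u
#define MAS2_M  0x00000004u
#define MAS2_G  0x00000002u
#define MAS2_E  0x00000001u

/* Guest-controllable attributes once patches 2/4 and 3/4 are applied. */
#define MAS2_ATTRIB_MASK (MAS2_X0 | MAS2_X1 | MAS2_E | MAS2_G)

/*
 * Compute the shadow MAS2 attributes for one page.
 * is_mmio: hypothetical stand-in for kvm_is_mmio_pfn(pfn).
 * smp:     stand-in for building with CONFIG_SMP.
 */
static uint32_t shadow_mas2_attrib_sketch(uint32_t guest_mas2, bool is_mmio, bool smp)
{
	uint32_t attr = guest_mas2 & MAS2_ATTRIB_MASK;

	/* Non-RAM pages are treated as I/O: cache inhibited and guarded. */
	if (is_mmio)
		return attr | MAS2_I | MAS2_G;

	/* RAM pages stay cacheable (I not set); add coherence on SMP. */
	if (smp)
		attr |= MAS2_M;
	return attr;
}

/* Exercise the I/O path, the SMP RAM path and the non-SMP RAM path. */
static void demo_shadow_attrib(void)
{
	assert(shadow_mas2_attrib_sketch(0, true, true) == (MAS2_I | MAS2_G));
	assert(shadow_mas2_attrib_sketch(MAS2_E, false, true) == (MAS2_E | MAS2_M));
	assert(shadow_mas2_attrib_sketch(MAS2_W, false, false) == 0);
}
```

Note that the guest's own I and M requests are never passed through in this
sketch; only the host's view of the page (RAM or I/O) decides them.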

