Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 6:04 pm, Benjamin Herrenschmidt wrote:
> On Thu, 2007-04-05 at 14:55 -0500, Kevin Corry wrote:
> > First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for
> > a function called smp_call_function_single(). However, this routine is
> > only implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently
> > needs to call this to run a function on a specific CPU. Powerpc provides
> > an smp_call_function() routine to run a function on all active CPUs, so I
> > used that as a basis to add an smp_call_function_single() routine. I've
> > included the patch below and was wondering if it looked like a sane
> > approach.
>
> We should do better... it will require some backend work for the various
> supported PICs though. I've always wanted to look into doing a 
> smp_call_function_cpumask in fact :-)

I was actually wondering about that myself today. It would seem like an 
smp_call_function() that takes a CPU mask would be much more flexible than 
either the current version or the new one that I proposed. However, that was 
a little more hacking that I was willing to do today on powerpc architecture 
code. :)

> > Next, we ran into a problem related to Perfmon2 initialization and sysfs.
> > The problem turned out to be that the powerpc version of topology_init()
> > is defined as an __initcall() routine, but Perfmon2's initialization is
> > done as a subsys_initcall() routine. Thus, Perfmon2 tries to initialize
> > its sysfs information before some of the powerpc cpu information has been
> > initialized. However, on all other architectures, topology_init() is
> > defined as a subsys_initcall() routine, so this problem was not seen on
> > any other platforms. Changing the powerpc version of topology_init() to a
> > subsys_initcall() seems to have fixed the bug. However, I'm not sure if
> > that is going to cause problems elsewhere in the powerpc code. I've
> > included the patch below (after the smp-call-function-single patch). Does
> > anyone know if this change is safe, or if there was a specific reason
> > that topology_init() was left as an __initcall() on powerpc?
>
> It would make sense to follow what other archs do. Note that if both
> perfmon and topology_init are subsys_initcall, that is on the same
> level, it's still a bit hairy to expect one to be called before the
> other...

I wondered that as well, but based on what Arnd posted earlier (presumably 
about the kernel linking order), the topology_init() call, which is in the 
arch/ top-level directory, should occur before pfm_init(), which is in 
perfmon/, even if both are in the same initcall level.

Thanks,
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 3:32 pm, Kevin Corry wrote:
> On Thu April 5 2007 3:08 pm, Arnd Bergmann wrote:
> > On Thursday 05 April 2007, Kevin Corry wrote:
> > > First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h
> > > for a function called smp_call_function_single(). However, this routine
> > > is only implemented on i386, x86_64, ia64, and mips. Perfmon2
> > > apparently needs to call this to run a function on a specific CPU.
> > > Powerpc provides an smp_call_function() routine to run a function on
> > > all active CPUs, so I used that as a basis to add an
> > > smp_call_function_single() routine. I've included the patch below and
> > > was wondering if it looked like a sane approach.
> >
> > The function itself looks good, but since it's very similar to the
> > existing smp_call_function(), you should probably try to share some of
> > the code, e.g. by making a helper function that gets an argument to
> > decide whether to run on a specific CPU or on all CPUs.
>
> Ok. I'll see what I can come up with and post another patch today or
> tomorrow.

Here's a new version that adds smp_call_function_single(), and moves the
code that's shared with smp_call_function() to __smp_call_function().

Thanks,
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/


Add an smp_call_function_single() to the powerpc architecture. Since this
is very similar to the existing smp_call_function() routine, the common
portions have been split out into __smp_call_function(). Since the
spin_lock(_lock) was moved to __smp_call_function(),
smp_call_function() now explicitly calls preempt_disable() before getting
the count of online CPUs.

Signed-off-by: Kevin Corry <[EMAIL PROTECTED]>

Index: linux-2.6.20-arnd3-perfmon/arch/powerpc/kernel/smp.c
===
--- linux-2.6.20-arnd3-perfmon.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.20-arnd3-perfmon/arch/powerpc/kernel/smp.c
@@ -198,26 +198,11 @@ static struct call_data_struct {
 /* delay of at least 8 seconds */
 #define SMP_CALL_TIMEOUT   8
 
-/*
- * This function sends a 'generic call function' IPI to all other CPUs
- * in the system.
- *
- * [SUMMARY] Run a function on all other CPUs.
- *  The function to run. This must be fast and non-blocking.
- *  An arbitrary pointer to pass to the function.
- *  currently unused.
- *  If true, wait (atomically) until function has completed on other 
CPUs.
- * [RETURNS] 0 on success, else a negative status code. Does not return until
- * remote CPUs are nearly ready to execute <> or are or have executed.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
-  int wait)
-{ 
+static int __smp_call_function(void (*func)(void *info), void *info,
+  int wait, int target_cpu, int num_cpus)
+{
struct call_data_struct data;
-   int ret = -1, cpus;
+   int ret = -1;
u64 timeout;
 
/* Can deadlock when called with interrupts disabled */
@@ -234,40 +219,33 @@ int smp_call_function (void (*func) (voi
atomic_set(, 0);
 
spin_lock(_lock);
-   /* Must grab online cpu count with preempt disabled, otherwise
-* it can change. */
-   cpus = num_online_cpus() - 1;
-   if (!cpus) {
-   ret = 0;
-   goto out;
-   }
 
call_data = 
smp_wmb();
/* Send a message to all other CPUs and wait for them to respond */
-   smp_ops->message_pass(MSG_ALL_BUT_SELF, PPC_MSG_CALL_FUNCTION);
+   smp_ops->message_pass(target_cpu, PPC_MSG_CALL_FUNCTION);
 
timeout = get_tb() + (u64) SMP_CALL_TIMEOUT * tb_ticks_per_sec;
 
/* Wait for response */
-   while (atomic_read() != cpus) {
+   while (atomic_read() != num_cpus) {
HMT_low();
if (get_tb() >= timeout) {
-   printk("smp_call_function on cpu %d: other cpus not "
-  "responding (%d)\n", smp_processor_id(),
-  atomic_read());
+   printk("%s on cpu %d: other cpus not "
+  "responding (%d)\n", __FUNCTION__,
+  smp_processor_id(), atomic_read());
debugger(NULL);
goto out;
}
}
 
if (wait) {
-   while (atomic_read() != cpus) {
+   while (atomic_read() != num_cpus) {
HMT_low();
if (get_tb() >= timeout) {
-   printk("smp_call_function on cpu %d: other "
-  "cpus not finishing (%d/%d)\n",
-  smp_processor_id(),
+   printk("%s on cpu %d: other 

Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Benjamin Herrenschmidt
On Thu, 2007-04-05 at 14:55 -0500, Kevin Corry wrote:
> Hello,
> 
> Carl Love and I have been working on getting the latest perfmon2 patches 
> (http://perfmon2.sourceforge.net/) working on Cell, and on powerpc in 
> general. We've come up with some powerpc-specific questions and we're hoping 
> to get some opinions from the powerpc kernel developers.
> 
> First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
> function called smp_call_function_single(). However, this routine is only 
> implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
> call this to run a function on a specific CPU. Powerpc provides an 
> smp_call_function() routine to run a function on all active CPUs, so I used 
> that as a basis to add an smp_call_function_single() routine. I've included 
> the patch below and was wondering if it looked like a sane approach.

We should do better... it will require some backend work for the various
supported PICs though. I've always wanted to look into doing a
smp_call_function_cpumask in fact :-)

> Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
> problem turned out to be that the powerpc version of topology_init() is 
> defined as an __initcall() routine, but Perfmon2's initialization is done as 
> a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
> information before some of the powerpc cpu information has been initialized. 
> However, on all other architectures, topology_init() is defined as a 
> subsys_initcall() routine, so this problem was not seen on any other 
> platforms. Changing the powerpc version of topology_init() to a 
> subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
> is going to cause problems elsewhere in the powerpc code. I've included the 
> patch below (after the smp-call-function-single patch). Does anyone know if 
> this change is safe, or if there was a specific reason that topology_init() 
> was left as an __initcall() on powerpc?

It would make sense to follow what other archs do. Note that if both
perfmon and topology_init are subsys_initcall, that is on the same
level, it's still a bit hairy to expect one to be called before the
other...

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Arnd Bergmann
On Thursday 05 April 2007, Kevin Corry wrote:
> For the moment, I made the change to topology_init() since it was the 
> simplest 
> fix to get things working. I have considered switching the perfmon2 
> initialization to __initcall(), but there are apparently some timing issues 
> with ensuring that the perfmon2 core code is initialized before any of its 
> sub-modules. Since they could all be compiled statically in the kernel, I'm 
> not sure if there's a way to ensure the ordering of calls within a single 
> initcall level. I'll need to ask Stephane if there were any other reasons why 
> subsys_initcall() was used for perfmon2.

If they all come from the same directory, you can simply order them in
the Makefile. If a module in arch/ needs to be initialized after one in
drivers/, that's not possible though, and changing topology_init() should
be the best option.

Arnd <><
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 3:08 pm, Arnd Bergmann wrote:
> On Thursday 05 April 2007, Kevin Corry wrote:
> > First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for
> > a function called smp_call_function_single(). However, this routine is
> > only implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently
> > needs to call this to run a function on a specific CPU. Powerpc provides
> > an smp_call_function() routine to run a function on all active CPUs, so I
> > used that as a basis to add an smp_call_function_single() routine. I've
> > included the patch below and was wondering if it looked like a sane
> > approach.
>
> The function itself looks good, but since it's very similar to the existing
> smp_call_function(), you should probably try to share some of the code,
> e.g. by making a helper function that gets an argument to decide whether
> to run on a specific CPU or on all CPUs.

Ok. I'll see what I can come up with and post another patch today or tomorrow.


> > Next, we ran into a problem related to Perfmon2 initialization and sysfs.
> > The problem turned out to be that the powerpc version of topology_init()
> > is defined as an __initcall() routine, but Perfmon2's initialization is
> > done as a subsys_initcall() routine. Thus, Perfmon2 tries to initialize
> > its sysfs information before some of the powerpc cpu information has been
> > initialized. However, on all other architectures, topology_init() is
> > defined as a subsys_initcall() routine, so this problem was not seen on
> > any other platforms. Changing the powerpc version of topology_init() to a
> > subsys_initcall() seems to have fixed the bug. However, I'm not sure if
> > that is going to cause problems elsewhere in the powerpc code. I've
> > included the patch below (after the smp-call-function-single patch). Does
> > anyone know if this change is safe, or if there was a specific reason
> > that topology_init() was left as an __initcall() on powerpc?
>
> In general, it's better to do initcalls as late as possible, so
> __initcall() is preferred over subsys_initcall() if both work. Have you
> tried doing it the other way and starting perfmon2 from a regular
> __initcall()?

For the moment, I made the change to topology_init() since it was the simplest 
fix to get things working. I have considered switching the perfmon2 
initialization to __initcall(), but there are apparently some timing issues 
with ensuring that the perfmon2 core code is initialized before any of its 
sub-modules. Since they could all be compiled statically in the kernel, I'm 
not sure if there's a way to ensure the ordering of calls within a single 
initcall level. I'll need to ask Stephane if there were any other reasons why 
subsys_initcall() was used for perfmon2.

Thanks, Arnd.
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Arnd Bergmann
On Thursday 05 April 2007, Kevin Corry wrote:

> First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
> function called smp_call_function_single(). However, this routine is only 
> implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
> call this to run a function on a specific CPU. Powerpc provides an 
> smp_call_function() routine to run a function on all active CPUs, so I used 
> that as a basis to add an smp_call_function_single() routine. I've included 
> the patch below and was wondering if it looked like a sane approach.

The function itself looks good, but since it's very similar to the existing
smp_call_function(), you should probably try to share some of the code,
e.g. by making a helper function that gets an argument to decide whether
to run on a specific CPU or on all CPUs.

> Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
> problem turned out to be that the powerpc version of topology_init() is 
> defined as an __initcall() routine, but Perfmon2's initialization is done as 
> a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
> information before some of the powerpc cpu information has been initialized. 
> However, on all other architectures, topology_init() is defined as a 
> subsys_initcall() routine, so this problem was not seen on any other 
> platforms. Changing the powerpc version of topology_init() to a 
> subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
> is going to cause problems elsewhere in the powerpc code. I've included the 
> patch below (after the smp-call-function-single patch). Does anyone know if 
> this change is safe, or if there was a specific reason that topology_init() 
> was left as an __initcall() on powerpc?

In general, it's better to do initcalls as late as possible, so __initcall()
is preferred over subsys_initcall() if both work. Have you tried doing it
the other way and starting perfmon2 from a regular __initcall()?

Arnd <><
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
Hello,

Carl Love and I have been working on getting the latest perfmon2 patches 
(http://perfmon2.sourceforge.net/) working on Cell, and on powerpc in 
general. We've come up with some powerpc-specific questions and we're hoping 
to get some opinions from the powerpc kernel developers.

First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
function called smp_call_function_single(). However, this routine is only 
implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
call this to run a function on a specific CPU. Powerpc provides an 
smp_call_function() routine to run a function on all active CPUs, so I used 
that as a basis to add an smp_call_function_single() routine. I've included 
the patch below and was wondering if it looked like a sane approach.

Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
problem turned out to be that the powerpc version of topology_init() is 
defined as an __initcall() routine, but Perfmon2's initialization is done as 
a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
information before some of the powerpc cpu information has been initialized. 
However, on all other architectures, topology_init() is defined as a 
subsys_initcall() routine, so this problem was not seen on any other 
platforms. Changing the powerpc version of topology_init() to a 
subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
is going to cause problems elsewhere in the powerpc code. I've included the 
patch below (after the smp-call-function-single patch). Does anyone know if 
this change is safe, or if there was a specific reason that topology_init() 
was left as an __initcall() on powerpc?

Thanks for your help!
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/


===
Add an smp_call_function_single() to the powerpc architecture. This is mostly
a copy of smp_call_function(), but with minor modifications to call only the
specified CPU.

Index: linux-2.6.20-perfmon/arch/powerpc/kernel/smp.c
===
--- linux-2.6.20-perfmon.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.20-perfmon/arch/powerpc/kernel/smp.c
@@ -287,6 +287,92 @@ int smp_call_function (void (*func) (voi
 
 EXPORT_SYMBOL(smp_call_function);
 
+/*
+ * This function sends a 'generic call function' IPI to the specified CPU.
+ *
+ * [SUMMARY] Run a function on the specified CPUs.
+ *  The function to run. This must be fast and non-blocking.
+ *  An arbitrary pointer to pass to the function.
+ *  currently unused.
+ *  If true, wait (atomically) until function has completed on the
+ * other CPU.
+ * [RETURNS] 0 on success, else a negative status code. Does not return until
+ * remote CPU is nearly ready to execute <> or are or has executed.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.
+ */
+int smp_call_function_single(int cpuid, void (*func)(void *info), void *info,
+int nonatomic, int wait)
+{
+   struct call_data_struct data;
+   int ret = -1, cpus = 1, me;
+   u64 timeout;
+
+   /* Can deadlock when called with interrupts disabled */
+   WARN_ON(irqs_disabled());
+
+   if (unlikely(smp_ops == NULL))
+   return -1;
+
+   me = get_cpu(); /* prevent preemption and reschedule on another 
processor */
+   if (cpuid == me) {
+   printk(KERN_INFO "%s: trying to call self\n", __FUNCTION__);
+   put_cpu();
+   return -EBUSY;
+   }
+
+   data.func = func;
+   data.info = info;
+   atomic_set(, 0);
+   data.wait = wait;
+   if (wait)
+   atomic_set(, 0);
+
+   spin_lock(_lock);
+
+   call_data = 
+   smp_wmb();
+   /* Send a message to the specified CPU and wait for it to respond */
+   smp_ops->message_pass(cpuid, PPC_MSG_CALL_FUNCTION);
+
+   timeout = get_tb() + (u64) SMP_CALL_TIMEOUT * tb_ticks_per_sec;
+
+   /* Wait for response */
+   while (atomic_read() != cpus) {
+   HMT_low();
+   if (get_tb() >= timeout) {
+   printk("%s on cpu %d: cpu %d not responding\n",
+  __FUNCTION__, smp_processor_id(), cpuid);
+   debugger(NULL);
+   goto out;
+   }
+   }
+
+   if (wait) {
+   while (atomic_read() != cpus) {
+   HMT_low();
+   if (get_tb() >= timeout) {
+   printk("%s on cpu %d: cpu %d not finishing\n",
+  __FUNCTION__, smp_processor_id(), cpuid);
+   debugger(NULL);
+   goto out;
+   }
+   }
+   }
+
+

Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
Hello,

Carl Love and I have been working on getting the latest perfmon2 patches 
(http://perfmon2.sourceforge.net/) working on Cell, and on powerpc in 
general. We've come up with some powerpc-specific questions and we're hoping 
to get some opinions from the powerpc kernel developers.

First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
function called smp_call_function_single(). However, this routine is only 
implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
call this to run a function on a specific CPU. Powerpc provides an 
smp_call_function() routine to run a function on all active CPUs, so I used 
that as a basis to add an smp_call_function_single() routine. I've included 
the patch below and was wondering if it looked like a sane approach.

Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
problem turned out to be that the powerpc version of topology_init() is 
defined as an __initcall() routine, but Perfmon2's initialization is done as 
a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
information before some of the powerpc cpu information has been initialized. 
However, on all other architectures, topology_init() is defined as a 
subsys_initcall() routine, so this problem was not seen on any other 
platforms. Changing the powerpc version of topology_init() to a 
subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
is going to cause problems elsewhere in the powerpc code. I've included the 
patch below (after the smp-call-function-single patch). Does anyone know if 
this change is safe, or if there was a specific reason that topology_init() 
was left as an __initcall() on powerpc?

Thanks for your help!
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/


===
Add an smp_call_function_single() to the powerpc architecture. This is mostly
a copy of smp_call_function(), but with minor modifications to call only the
specified CPU.

Index: linux-2.6.20-perfmon/arch/powerpc/kernel/smp.c
===
--- linux-2.6.20-perfmon.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.20-perfmon/arch/powerpc/kernel/smp.c
@@ -287,6 +287,92 @@ int smp_call_function (void (*func) (voi
 
 EXPORT_SYMBOL(smp_call_function);
 
+/*
+ * This function sends a 'generic call function' IPI to the specified CPU.
+ *
+ * [SUMMARY] Run a function on the specified CPUs.
+ * func The function to run. This must be fast and non-blocking.
+ * info An arbitrary pointer to pass to the function.
+ * nonatomic currently unused.
+ * wait If true, wait (atomically) until function has completed on the
+ * other CPU.
+ * [RETURNS] 0 on success, else a negative status code. Does not return until
+ * remote CPU is nearly ready to execute func or are or has executed.
+ *
+ * You must not call this function with disabled interrupts or from a
+ * hardware interrupt handler or from a bottom half handler.
+ */
+int smp_call_function_single(int cpuid, void (*func)(void *info), void *info,
+int nonatomic, int wait)
+{
+   struct call_data_struct data;
+   int ret = -1, cpus = 1, me;
+   u64 timeout;
+
+   /* Can deadlock when called with interrupts disabled */
+   WARN_ON(irqs_disabled());
+
+   if (unlikely(smp_ops == NULL))
+   return -1;
+
+   me = get_cpu(); /* prevent preemption and reschedule on another 
processor */
+   if (cpuid == me) {
+   printk(KERN_INFO %s: trying to call self\n, __FUNCTION__);
+   put_cpu();
+   return -EBUSY;
+   }
+
+   data.func = func;
+   data.info = info;
+   atomic_set(data.started, 0);
+   data.wait = wait;
+   if (wait)
+   atomic_set(data.finished, 0);
+
+   spin_lock(call_lock);
+
+   call_data = data;
+   smp_wmb();
+   /* Send a message to the specified CPU and wait for it to respond */
+   smp_ops-message_pass(cpuid, PPC_MSG_CALL_FUNCTION);
+
+   timeout = get_tb() + (u64) SMP_CALL_TIMEOUT * tb_ticks_per_sec;
+
+   /* Wait for response */
+   while (atomic_read(data.started) != cpus) {
+   HMT_low();
+   if (get_tb() = timeout) {
+   printk(%s on cpu %d: cpu %d not responding\n,
+  __FUNCTION__, smp_processor_id(), cpuid);
+   debugger(NULL);
+   goto out;
+   }
+   }
+
+   if (wait) {
+   while (atomic_read(data.finished) != cpus) {
+   HMT_low();
+   if (get_tb() = timeout) {
+   printk(%s on cpu %d: cpu %d not finishing\n,
+  __FUNCTION__, smp_processor_id(), cpuid);
+   debugger(NULL);
+ 

Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Arnd Bergmann
On Thursday 05 April 2007, Kevin Corry wrote:

 First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
 function called smp_call_function_single(). However, this routine is only 
 implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
 call this to run a function on a specific CPU. Powerpc provides an 
 smp_call_function() routine to run a function on all active CPUs, so I used 
 that as a basis to add an smp_call_function_single() routine. I've included 
 the patch below and was wondering if it looked like a sane approach.

The function itself looks good, but since it's very similar to the existing
smp_call_function(), you should probably try to share some of the code,
e.g. by making a helper function that gets an argument to decide whether
to run on a specific CPU or on all CPUs.

 Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
 problem turned out to be that the powerpc version of topology_init() is 
 defined as an __initcall() routine, but Perfmon2's initialization is done as 
 a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
 information before some of the powerpc cpu information has been initialized. 
 However, on all other architectures, topology_init() is defined as a 
 subsys_initcall() routine, so this problem was not seen on any other 
 platforms. Changing the powerpc version of topology_init() to a 
 subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
 is going to cause problems elsewhere in the powerpc code. I've included the 
 patch below (after the smp-call-function-single patch). Does anyone know if 
 this change is safe, or if there was a specific reason that topology_init() 
 was left as an __initcall() on powerpc?

In general, it's better to do initcalls as late as possible, so __initcall()
is preferred over subsys_initcall() if both work. Have you tried doing it
the other way and starting perfmon2 from a regular __initcall()?

Arnd 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 3:08 pm, Arnd Bergmann wrote:
 On Thursday 05 April 2007, Kevin Corry wrote:
  First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for
  a function called smp_call_function_single(). However, this routine is
  only implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently
  needs to call this to run a function on a specific CPU. Powerpc provides
  an smp_call_function() routine to run a function on all active CPUs, so I
  used that as a basis to add an smp_call_function_single() routine. I've
  included the patch below and was wondering if it looked like a sane
  approach.

 The function itself looks good, but since it's very similar to the existing
 smp_call_function(), you should probably try to share some of the code,
 e.g. by making a helper function that gets an argument to decide whether
 to run on a specific CPU or on all CPUs.

Ok. I'll see what I can come up with and post another patch today or tomorrow.


  Next, we ran into a problem related to Perfmon2 initialization and sysfs.
  The problem turned out to be that the powerpc version of topology_init()
  is defined as an __initcall() routine, but Perfmon2's initialization is
  done as a subsys_initcall() routine. Thus, Perfmon2 tries to initialize
  its sysfs information before some of the powerpc cpu information has been
  initialized. However, on all other architectures, topology_init() is
  defined as a subsys_initcall() routine, so this problem was not seen on
  any other platforms. Changing the powerpc version of topology_init() to a
  subsys_initcall() seems to have fixed the bug. However, I'm not sure if
  that is going to cause problems elsewhere in the powerpc code. I've
  included the patch below (after the smp-call-function-single patch). Does
  anyone know if this change is safe, or if there was a specific reason
  that topology_init() was left as an __initcall() on powerpc?

 In general, it's better to do initcalls as late as possible, so
 __initcall() is preferred over subsys_initcall() if both work. Have you
 tried doing it the other way and starting perfmon2 from a regular
 __initcall()?

For the moment, I made the change to topology_init() since it was the simplest 
fix to get things working. I have considered switching the perfmon2 
initialization to __initcall(), but there are apparently some timing issues 
with ensuring that the perfmon2 core code is initialized before any of its 
sub-modules. Since they could all be compiled statically in the kernel, I'm 
not sure if there's a way to ensure the ordering of calls within a single 
initcall level. I'll need to ask Stephane if there were any other reasons why 
subsys_initcall() was used for perfmon2.

Thanks, Arnd.
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Arnd Bergmann
On Thursday 05 April 2007, Kevin Corry wrote:
 For the moment, I made the change to topology_init() since it was the 
 simplest 
 fix to get things working. I have considered switching the perfmon2 
 initialization to __initcall(), but there are apparently some timing issues 
 with ensuring that the perfmon2 core code is initialized before any of its 
 sub-modules. Since they could all be compiled statically in the kernel, I'm 
 not sure if there's a way to ensure the ordering of calls within a single 
 initcall level. I'll need to ask Stephane if there were any other reasons why 
 subsys_initcall() was used for perfmon2.

If they all come from the same directory, you can simply order them in
the Makefile. If a module in arch/ needs to be initialized after one in
drivers/, that's not possible though, and changing topology_init() should
be the best option.

Arnd 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Benjamin Herrenschmidt
On Thu, 2007-04-05 at 14:55 -0500, Kevin Corry wrote:
 Hello,
 
 Carl Love and I have been working on getting the latest perfmon2 patches 
 (http://perfmon2.sourceforge.net/) working on Cell, and on powerpc in 
 general. We've come up with some powerpc-specific questions and we're hoping 
 to get some opinions from the powerpc kernel developers.
 
 First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for a 
 function called smp_call_function_single(). However, this routine is only 
 implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently needs to 
 call this to run a function on a specific CPU. Powerpc provides an 
 smp_call_function() routine to run a function on all active CPUs, so I used 
 that as a basis to add an smp_call_function_single() routine. I've included 
 the patch below and was wondering if it looked like a sane approach.

We should do better... it will require some backend work for the various
supported PICs though. I've always wanted to look into doing a
smp_call_function_cpumask in fact :-)

 Next, we ran into a problem related to Perfmon2 initialization and sysfs. The 
 problem turned out to be that the powerpc version of topology_init() is 
 defined as an __initcall() routine, but Perfmon2's initialization is done as 
 a subsys_initcall() routine. Thus, Perfmon2 tries to initialize its sysfs 
 information before some of the powerpc cpu information has been initialized. 
 However, on all other architectures, topology_init() is defined as a 
 subsys_initcall() routine, so this problem was not seen on any other 
 platforms. Changing the powerpc version of topology_init() to a 
 subsys_initcall() seems to have fixed the bug. However, I'm not sure if that 
 is going to cause problems elsewhere in the powerpc code. I've included the 
 patch below (after the smp-call-function-single patch). Does anyone know if 
 this change is safe, or if there was a specific reason that topology_init() 
 was left as an __initcall() on powerpc?

It would make sense to follow what other archs do. Note that if both
perfmon and topology_init are subsys_initcall, that is on the same
level, it's still a bit hairy to expect one to be called before the
other...

Ben.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 3:32 pm, Kevin Corry wrote:
 On Thu April 5 2007 3:08 pm, Arnd Bergmann wrote:
  On Thursday 05 April 2007, Kevin Corry wrote:
   First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h
   for a function called smp_call_function_single(). However, this routine
   is only implemented on i386, x86_64, ia64, and mips. Perfmon2
   apparently needs to call this to run a function on a specific CPU.
   Powerpc provides an smp_call_function() routine to run a function on
   all active CPUs, so I used that as a basis to add an
   smp_call_function_single() routine. I've included the patch below and
   was wondering if it looked like a sane approach.
 
  The function itself looks good, but since it's very similar to the
  existing smp_call_function(), you should probably try to share some of
  the code, e.g. by making a helper function that gets an argument to
  decide whether to run on a specific CPU or on all CPUs.

 Ok. I'll see what I can come up with and post another patch today or
 tomorrow.

Here's a new version that adds smp_call_function_single(), and moves the
code that's shared with smp_call_function() to __smp_call_function().

Thanks,
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/


Add an smp_call_function_single() to the powerpc architecture. Since this
is very similar to the existing smp_call_function() routine, the common
portions have been split out into __smp_call_function(). Since the
spin_lock(call_lock) was moved to __smp_call_function(),
smp_call_function() now explicitly calls preempt_disable() before getting
the count of online CPUs.

Signed-off-by: Kevin Corry [EMAIL PROTECTED]

Index: linux-2.6.20-arnd3-perfmon/arch/powerpc/kernel/smp.c
===
--- linux-2.6.20-arnd3-perfmon.orig/arch/powerpc/kernel/smp.c
+++ linux-2.6.20-arnd3-perfmon/arch/powerpc/kernel/smp.c
@@ -198,26 +198,11 @@ static struct call_data_struct {
 /* delay of at least 8 seconds */
 #define SMP_CALL_TIMEOUT   8
 
-/*
- * This function sends a 'generic call function' IPI to all other CPUs
- * in the system.
- *
- * [SUMMARY] Run a function on all other CPUs.
- * func The function to run. This must be fast and non-blocking.
- * info An arbitrary pointer to pass to the function.
- * nonatomic currently unused.
- * wait If true, wait (atomically) until function has completed on other 
CPUs.
- * [RETURNS] 0 on success, else a negative status code. Does not return until
- * remote CPUs are nearly ready to execute func or are or have executed.
- *
- * You must not call this function with disabled interrupts or from a
- * hardware interrupt handler or from a bottom half handler.
- */
-int smp_call_function (void (*func) (void *info), void *info, int nonatomic,
-  int wait)
-{ 
+static int __smp_call_function(void (*func)(void *info), void *info,
+  int wait, int target_cpu, int num_cpus)
+{
struct call_data_struct data;
-   int ret = -1, cpus;
+   int ret = -1;
u64 timeout;
 
/* Can deadlock when called with interrupts disabled */
@@ -234,40 +219,33 @@ int smp_call_function (void (*func) (voi
atomic_set(data.finished, 0);
 
spin_lock(call_lock);
-   /* Must grab online cpu count with preempt disabled, otherwise
-* it can change. */
-   cpus = num_online_cpus() - 1;
-   if (!cpus) {
-   ret = 0;
-   goto out;
-   }
 
call_data = data;
smp_wmb();
/* Send a message to all other CPUs and wait for them to respond */
-   smp_ops-message_pass(MSG_ALL_BUT_SELF, PPC_MSG_CALL_FUNCTION);
+   smp_ops-message_pass(target_cpu, PPC_MSG_CALL_FUNCTION);
 
timeout = get_tb() + (u64) SMP_CALL_TIMEOUT * tb_ticks_per_sec;
 
/* Wait for response */
-   while (atomic_read(data.started) != cpus) {
+   while (atomic_read(data.started) != num_cpus) {
HMT_low();
if (get_tb() = timeout) {
-   printk(smp_call_function on cpu %d: other cpus not 
-  responding (%d)\n, smp_processor_id(),
-  atomic_read(data.started));
+   printk(%s on cpu %d: other cpus not 
+  responding (%d)\n, __FUNCTION__,
+  smp_processor_id(), atomic_read(data.started));
debugger(NULL);
goto out;
}
}
 
if (wait) {
-   while (atomic_read(data.finished) != cpus) {
+   while (atomic_read(data.finished) != num_cpus) {
HMT_low();
if (get_tb() = timeout) {
-   printk(smp_call_function on cpu %d: other 
-  cpus not finishing (%d/%d)\n,
-  

Re: Questions about porting perfmon2 to powerpc

2007-04-05 Thread Kevin Corry
On Thu April 5 2007 6:04 pm, Benjamin Herrenschmidt wrote:
 On Thu, 2007-04-05 at 14:55 -0500, Kevin Corry wrote:
  First, the stock 2.6.20 kernel has a prototype in include/linux/smp.h for
  a function called smp_call_function_single(). However, this routine is
  only implemented on i386, x86_64, ia64, and mips. Perfmon2 apparently
  needs to call this to run a function on a specific CPU. Powerpc provides
  an smp_call_function() routine to run a function on all active CPUs, so I
  used that as a basis to add an smp_call_function_single() routine. I've
  included the patch below and was wondering if it looked like a sane
  approach.

 We should do better... it will require some backend work for the various
 supported PICs though. I've always wanted to look into doing a 
 smp_call_function_cpumask in fact :-)

I was actually wondering about that myself today. It would seem like an 
smp_call_function() that takes a CPU mask would be much more flexible than 
either the current version or the new one that I proposed. However, that was 
a little more hacking that I was willing to do today on powerpc architecture 
code. :)

  Next, we ran into a problem related to Perfmon2 initialization and sysfs.
  The problem turned out to be that the powerpc version of topology_init()
  is defined as an __initcall() routine, but Perfmon2's initialization is
  done as a subsys_initcall() routine. Thus, Perfmon2 tries to initialize
  its sysfs information before some of the powerpc cpu information has been
  initialized. However, on all other architectures, topology_init() is
  defined as a subsys_initcall() routine, so this problem was not seen on
  any other platforms. Changing the powerpc version of topology_init() to a
  subsys_initcall() seems to have fixed the bug. However, I'm not sure if
  that is going to cause problems elsewhere in the powerpc code. I've
  included the patch below (after the smp-call-function-single patch). Does
  anyone know if this change is safe, or if there was a specific reason
  that topology_init() was left as an __initcall() on powerpc?

 It would make sense to follow what other archs do. Note that if both
 perfmon and topology_init are subsys_initcall, that is on the same
 level, it's still a bit hairy to expect one to be called before the
 other...

I wondered that as well, but based on what Arnd posted earlier (presumably 
about the kernel linking order), the topology_init() call, which is in the 
arch/ top-level directory, should occur before pfm_init(), which is in 
perfmon/, even if both are in the same initcall level.

Thanks,
-- 
Kevin Corry
[EMAIL PROTECTED]
http://www.ibm.com/linux/
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/