Hi Stephane,

"stephane eranian" <eran...@googlemail.com> wrote on 01/08/2009 12:25:56 PM:

> Corey,
> 
> 
> On Thu, Jan 8, 2009 at 8:58 PM, Corey Ashford
> <cjash...@linux.vnet.ibm.com> wrote:
> > Hello,
> >
> > It appears that I have forgotten to post a patch for this bug.  This was a
> > problem I had seen when booting the latest 2.6.28-rc6 kernel, where kernel
> > function tracing was causing r4 to be corrupted because the perfmon code was
> > called in the wrong order (pretty obscure bug!).
> >
> I remember that problem and I thought you had a patch already. Did I forget
> to apply it?

My memory is fuzzy.  When I looked in the current code, the change wasn't 
there, so I'm not sure what happened.  Maybe it's only in the perfmon3 
code?

> 
> > To elaborate just a little, this is assembler code where do_signal is
> > called with two register parameters - r3 and r4.  r4 is set up several lines
> > before the call to do_signal and is easy to miss.  The call to the perfmon
> > function, pfm_handle_work, doesn't have a second parameter, but if in the
> > process of calling pfm_handle_work, r4 changes (perhaps because it calls a
> > function with 2 or more register parameters), when the subsequent call to
> > do_signal is made, r4 will be corrupted.  With kernel tracing turned on, r4
> > was getting touched in the process of calling pfm_handle_work, causing
> > do_signal to hang the system.
> >
> > In order to fix this, the simplest thing to do was to reverse the order of
> > the two calls so that no saving of r4 was needed.  pfm_handle_work is now
> > called after do_signal, eliminating any possibility for corruption of
> > parameter registers.
> >
> >
> > Please let me know if there are any issues with this patch.
> >
> Looks like the problem is related to the calling convention and caller-
> vs. callee-saved registers. The proposed patch works around the problem;
> the real fix would be to save and restore the r4 register, which appears
> to be caller-saved. But I'll take the patch for now.

That would be a solution, but since the call order doesn't matter between 
do_signal and pfm_handle_work, the two fixes are equivalent, and since 
this one is simpler, I think it's a bit better.  The only problem is that 
the calls are made in the other order on all of the other arches, and it's 
not completely obvious that the two orders are equivalent.
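
To make the ordering argument concrete, here is a rough C model of the
problem.  It is only a sketch and not the actual assembly from the patch:
the regs struct, the model_* functions, and the values 42 and 0xdead are
all made up for illustration, and the "tracing" flag just stands in for
whatever ends up touching r4 underneath pfm_handle_work.

```c
#include <assert.h>

/* Toy model of two argument registers.  Hypothetical: real code lives in
 * assembler and uses the actual CPU registers r3 and r4. */
struct regs {
    long r3;
    long r4;
};

static struct regs cpu;

/* Stand-in for pfm_handle_work: it takes no second parameter, but
 * anything called underneath it (modeled here by the tracing flag) is
 * free to overwrite r4. */
static void model_pfm_handle_work(int tracing)
{
    if (tracing)
        cpu.r4 = 0xdead;   /* r4 reused for some other call's argument */
}

/* Stand-in for do_signal: reports the value it finds in r4. */
static long model_do_signal(void)
{
    return cpu.r4;
}

/* Original order: r4 is set up early, then the perfmon call runs,
 * then do_signal consumes whatever is left in r4. */
long run_original_order(int tracing)
{
    cpu.r4 = 42;                    /* set up several lines before the call */
    model_pfm_handle_work(tracing);
    return model_do_signal();
}

/* Swapped order: do_signal consumes r4 before anything can clobber it,
 * so no save/restore of r4 is needed. */
long run_swapped_order(int tracing)
{
    cpu.r4 = 42;
    long seen = model_do_signal();
    model_pfm_handle_work(tracing);
    return seen;
}

/* Small self-check of the three cases discussed in this thread. */
void check_orders(void)
{
    assert(run_original_order(0) == 42);      /* no tracing: r4 survives */
    assert(run_original_order(1) == 0xdead);  /* tracing: r4 is clobbered */
    assert(run_swapped_order(1) == 42);       /* swapped: r4 is intact */
}
```

In this model the original order only works when nothing under
pfm_handle_work touches r4, while the swapped order gives do_signal the
right value either way.  Saving and restoring r4 around the perfmon call
would give the same result with the original order kept.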

- Corey


> 
> 
------------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It is the best place to buy or sell services for
> just about anything Open Source.
> http://p.sf.net/sfu/Xq1LFB
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel


