Hi Stephane,

"stephane eranian" <eran...@googlemail.com> wrote on 01/08/2009 12:25:56 PM:
> Corey,
>
> On Thu, Jan 8, 2009 at 8:58 PM, Corey Ashford
> <cjash...@linux.vnet.ibm.com> wrote:
> > Hello,
> >
> > It appears that I have forgotten to post a patch for this bug. This was a
> > problem I had seen when booting the latest 2.6.28-rc6 kernel, where kernel
> > function tracing was causing r4 to be corrupted because the perfmon code was
> > called in the wrong order (pretty obscure bug!).
> >
> I remember that problem and I thought you had a patch already. Did I forget
> to apply it? My memory is fuzzy.

When I looked in the current code, the change wasn't there, so I'm not sure what happened. Maybe it's only in the perfmon3 code?

> > To elaborate just a little, this is assembler code where do_signal is
> > called with two register parameters - r3 and r4. r4 is set up several lines
> > before the call to do_signal and is easy to miss. The call to the perfmon
> > function, pfm_handle_work, doesn't have a second parameter, but if in the
> > process of calling pfm_handle_work, r4 changes (perhaps because it calls a
> > function with 2 or more register parameters), when the subsequent call to
> > do_signal is made, r4 will be corrupted. With kernel tracing turned on, r4
> > was getting touched in the process of calling pfm_handle_work, causing
> > do_signal to hang the system.
> >
> > In order to fix this, the simplest thing to do was to reverse the order of
> > the two calls so that no saving of r4 was needed. pfm_handle_work is now
> > called after do_signal, eliminating any possibility for corruption of
> > parameter registers.
> >
> > Please let me know if there are any issues with this patch.
> >
> Looks like the problem is related to the calling convention and
> caller-saved vs. callee-saved registers. The proposed patch works around
> the problem; the real fix would be to save and restore the r4 register,
> which appears to be caller-saved. But I'll take the patch for now.

That would be a solution, but since the call order doesn't matter between do_signal and pfm_handle_work, the two fixes are equivalent, and since this one is simpler, I think it's a bit better. The only problem is that the calls are made in the other order on all of the other arches, and it's not completely obvious that the two orders are equivalent.

- Corey

> ------------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It is the best place to buy or sell services for
> just about anything Open Source.
> http://p.sf.net/sfu/Xq1LFB
> _______________________________________________
> perfmon2-devel mailing list
> perfmon2-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
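
For anyone reading this thread later, here is a rough sketch of the r4 problem as a toy model in Python. It is not the PowerPC assembler from the patch. Only the names do_signal and pfm_handle_work come from the discussion above; the register dictionary, the argument values 1 and 2, the TRACE_SCRATCH value, and the exit_path helper are all made up for illustration, and r3 is assumed to stay intact so that only r4 is in play.

```python
# Toy model of two volatile (caller-saved) argument registers.
# Hypothetical illustration only; this is not the kernel code.

regs = {"r3": 0, "r4": 0}

TRACE_SCRATCH = 0xDEAD  # made-up value a tracer might leave behind in r4


def pfm_handle_work(tracing):
    """Takes a single argument, so it has no reason to preserve r4.

    With tracing on, we assume some callee uses r4 as scratch, which is
    allowed for a caller-saved register.
    """
    if tracing:
        regs["r4"] = TRACE_SCRATCH


def do_signal():
    """Takes two register arguments; returns what it actually received."""
    return regs["r3"], regs["r4"]


def exit_path(order, tracing, save_r4=False):
    # r4 is set up well before the call to do_signal.
    regs["r3"], regs["r4"] = 1, 2

    if order == "pfm_first":
        # Original order: pfm_handle_work runs between the setup of r4
        # and its use by do_signal.
        saved = regs["r4"] if save_r4 else None
        pfm_handle_work(tracing)
        if save_r4:
            regs["r4"] = saved  # the save/restore fix
        return do_signal()

    # Reordered version: do_signal consumes r4 before anything can touch it.
    result = do_signal()
    pfm_handle_work(tracing)
    return result


if __name__ == "__main__":
    print(exit_path("pfm_first", tracing=False))
    print(exit_path("pfm_first", tracing=True))
    print(exit_path("pfm_first", tracing=True, save_r4=True))
    print(exit_path("signal_first", tracing=True))
```

In this model the original order only misbehaves when tracing is on, which matches how the bug showed up. Both saving and restoring r4 and swapping the two calls give do_signal the values it expects; swapping simply removes the need to save anything.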