Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-19 Thread Ingo Molnar

* Steven Rostedt [EMAIL PROTECTED] wrote:

 On Wed, 19 Nov 2008, Paul Mackerras wrote:
 
  Steven Rostedt writes:
  
   Can I add your Acked-by: to all these patches that I submitted? I'm going 
   to recommit them with a consistent subject (all lower case ppc), but I'm 
   not going to change the patches themselves.
   
   Would you two be fine with that? Or at least one of you?
  
  My preference would be for the patches to go through the powerpc tree
  unless there is a good reason for them to go via another tree.
 
 I have no problem with that. The only thing is that we have a lot of 
 pending work still in the linux-tip tree, which you may need to pull 
 in to get these patches working. Well, there's two or three commits 
 in the generic code that I know the PPC code is dependent on.
 
 I could give you a list of commits in tip that need to go mainline 
 first before we can pull in the PPC changes. Then you could wait 
 till those changes make it into 29 and then you could push the PPC 
 modifications in from your tree.

note that this inserts a lot of (unnecessary) serialization and a 
window of non-testing - by all likelyhood this will delay ppc ftrace 
to v2.6.30 or later kernels.

Ingo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-19 Thread Paul Mackerras
Ingo Molnar writes:

 and it's all in flux (we are in the middle of the development cycle), 
 so i dont think it would be a good idea for you to pull those bits 
 into the powerpc tree.

Quite.  OK, it does sound like this stuff needs to live in your tree
for now, and from the diffstat it doesn't look like there is any
conflict with stuff in my tree yet.

 Maybe Steve could do the following trick: create a Linus -git based 
 branch that uses the new APIs but marks ppc's ftrace as depends 0 in 
 the powerpc Kconfig. (the new ftrace.c wont build)
 
 You could pull that tree into the powerpc tree and everything should 
 still work fine in PPC - sans ftrace.

Sounds like a reasonable idea, except that I think I'll delay pulling
that branch into my tree until I need to in order to resolve a
conflict - at least as far as my exported branches are concerned.

 In tip/tracing we'd merge it too (these commits will never be 
 rebased),

I do want to see the patches in their final form and have the
opportunity to give an acked-by before you declare the branch frozen.
Apart from that, sounds good.

Paul.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-19 Thread Steven Rostedt

On Wed, 19 Nov 2008, Ingo Molnar wrote:
 
 Maybe Steve could do the following trick: create a Linus -git based 
 branch that uses the new APIs but marks ppc's ftrace as depends 0 in 
 the powerpc Kconfig. (the new ftrace.c wont build)

There's only two generic commits that need to be added for the PowerPC 
code to work.

  ftrace: pass module struct to arch dynamic ftrace functions
  ftrace: allow NULL pointers in mcount_loc

I've already ported them to mainline to test PowerPC there.
Paul could use these two versions and keep ftrace in a separate branch in 
his tree. This way all the PowerPC code will be there, and actually can be 
tested. They may still hit the same bugs that we have fixed in tip, but 
those should all be minor, since any major bug is already in mainline or 
on its way.

 
 You could pull that tree into the powerpc tree and everything should 
 still work fine in PPC - sans ftrace.
 
 In tip/tracing we'd merge it too (these commits will never be 
 rebased), and we'd also remove the depends 0 from the powerpc Kconfig. 
 That sole change wont conflict with anything powerpc.
 
 It would all play out just fine in linux-next: we'd have both the 
 latest powerpc bits and the latest ftrace bits on powerpc. You could 
 test the non-ftrace impact of Steve's changes in the powerpc tree as 
 well and have it all part of your usual workflow.
 
 The 2.6.29 merge window would be trouble-free as well: since the 
 sha1's are the same, any of the trees can be merged upstream without 
 having to wait for the other one and without creating conflicts or 
 other trouble for the other one.
 
 Hm?

If Paul uses the ported two generic commits, then he would need to keep 
that in a separate branch that will never go upstream. He could pull in 
the other PowerPC patches and do as you said, keep the depend off.

I can make two branches that Paul could pull from. One would have this 
code disabled in the config, and just be the PowerPC port. Although, we 
would need to figure out the best way to keep it disabled. Disable it 
after the patches? The patches themselve enable it.

And then I could make another branch with the back port of the two commits 
that Paul would never push upstream. This would allow for the PowerPC guys 
to test the code in their tree without waiting for it. We just need to 
trust that Paul will not push those commits ;-)

Actually, I can change the subject of those commits to have at the 
beginning:

   NOT FOR UPSTREAM ftrace: 

What do you guys think?

-- Steve

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-19 Thread Paul Mackerras
Ingo Molnar writes:

 * Steven Rostedt [EMAIL PROTECTED] wrote:
 
  On Wed, 19 Nov 2008, Paul Mackerras wrote:
  
   Steven Rostedt writes:
   
Can I add your Acked-by: to all these patches that I submitted? I'm 
going 
to recommit them with a consistent subject (all lower case ppc), but 
I'm 
not going to change the patches themselves.

Would you two be fine with that? Or at least one of you?
   
   My preference would be for the patches to go through the powerpc tree
   unless there is a good reason for them to go via another tree.
  
  I have no problem with that. The only thing is that we have a lot of 
  pending work still in the linux-tip tree, which you may need to pull 
  in to get these patches working. Well, there's two or three commits 
  in the generic code that I know the PPC code is dependent on.
  
  I could give you a list of commits in tip that need to go mainline 
  first before we can pull in the PPC changes. Then you could wait 
  till those changes make it into 29 and then you could push the PPC 
  modifications in from your tree.
 
 note that this inserts a lot of (unnecessary) serialization and a 
 window of non-testing - by all likelyhood this will delay ppc ftrace 
 to v2.6.30 or later kernels.

Well, note that I said unless there is a good reason.  If it does
need to go via your tree, it can, though I don't see that it will get
much testing on powerpc there, and having it there will make it harder
to manage any conflicts with the other stuff I have queued up.

How much generic stuff that's not upstream do the powerpc ftrace
patches depend on?

Paul.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-18 Thread Steven Rostedt

Paul and Benjamin,

Can I add your Acked-by: to all these patches that I submitted? I'm going 
to recommit them with a consistent subject (all lower case ppc), but I'm 
not going to change the patches themselves.

Would you two be fine with that? Or at least one of you?

-- Steve


___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-18 Thread Paul Mackerras
Steven Rostedt writes:

 Can I add your Acked-by: to all these patches that I submitted? I'm going 
 to recommit them with a consistent subject (all lower case ppc), but I'm 
 not going to change the patches themselves.
 
 Would you two be fine with that? Or at least one of you?

My preference would be for the patches to go through the powerpc tree
unless there is a good reason for them to go via another tree.

The style we use for the headline is powerpc: Add xyz feature or
powerpc/subsystem: Fix foo bug.

As for the acked-by, I feel I first need to go through the whole
series again with the changes you have made recently.  Have you
reworked the earlier patches to avoid introducing any bugs, rather
than just fixing the bugs with later patches?  If you haven't, are you
sure that the bugs won't cause any problems in bisecting?

Also, what's the best place to find the latest patch set?

Thanks again for doing all this work.

Paul.
  
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-18 Thread Steven Rostedt

On Wed, 19 Nov 2008, Paul Mackerras wrote:

 Steven Rostedt writes:
 
  Can I add your Acked-by: to all these patches that I submitted? I'm going 
  to recommit them with a consistent subject (all lower case ppc), but I'm 
  not going to change the patches themselves.
  
  Would you two be fine with that? Or at least one of you?
 
 My preference would be for the patches to go through the powerpc tree
 unless there is a good reason for them to go via another tree.

I have no problem with that. The only thing is that we have a lot of
pending work still in the linux-tip tree, which you may need to pull
in to get these patches working. Well, there's two or three commits in the 
generic code that I know the PPC code is dependent on.

I could give you a list of commits in tip that need to go mainline first
before we can pull in the PPC changes. Then you could wait till those 
changes make it into 29 and then you could push the PPC modifications in 
from your tree.

 
 The style we use for the headline is powerpc: Add xyz feature or
 powerpc/subsystem: Fix foo bug.

We've been using the ftrace: subject format for most of our ftrace 
commits, and have been using ftrace, ppc: subject or ppc32 or ppc64 for 
those. But since this is a powerpc port, I will conform to your style.

 
 As for the acked-by, I feel I first need to go through the whole
 series again with the changes you have made recently.  Have you
 reworked the earlier patches to avoid introducing any bugs, rather
 than just fixing the bugs with later patches?  If you haven't, are you
 sure that the bugs won't cause any problems in bisecting?

Fair enough. I'll rework the patches again to fold back in the changes 
based on your comments, as well as the comments of others. When I'm done, 
I'll repost to the list. This way, if you pull them in, you can add your 
Signed-off-by yourself.

 
 Also, what's the best place to find the latest patch set?

I'll be keeping the changes in my repo:

 git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git

  branch tip/ppc

Note, this branch is based on top of Ingo's linux-tip tree. I can make a 
a branch ppc that will be based on top of mainline, and I can add the 
patches needed to get PPC working. The generic commits will have the 
ftrace:  format.

The generic ftrace patches will still need to go in through linux-tip. But 
when they do, feel free to push the PowerPC port in.  Anyway, you are the 
ones that can test it better than we can. I have two PPC boxes that I test 
on, but I'm sure you have a lot more.

 
 Thanks again for doing all this work.

Thank you for the reviews.

-- Steve

___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-17 Thread Steven Rostedt

On Mon, 17 Nov 2008, Paul Mackerras wrote:

 Steven Rostedt writes:
 
  The following patches are for my work on porting the new dynamic ftrace
  framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
  that the calls to mcount are 24 bit jumps. Since the modules are
  loaded in vmalloc address space, the call to mcount is farther than
  what a 24 bit jump can make. The way PPC solves this is with the use
  of trampolines. The trampoline is a memory space allocated within the
  24 bit region of the module. The code in the trampoline that the
  jump is made to does a far jump to the core kernel code.
 
 Thanks for doing this work.  I'll go through the patches in detail
 today, but first I'd like to clear up a couple of things for you.  The
 first is that unconditional branches on PowerPC effectively have a
 26-bit sign-extended offset, not 24-bit.  The offset field in the
 instruction is 24 bits long, but because all instructions are 4 bytes
 long, two extra 0 bits get appended to the offset field, giving a
 26-bit offset and a range of +/- 32MB from the branch instruction.

Ah yes, thanks for the clarification.

 
  PPC64, although works with 64 bit registers, the op codes are still
  32 bit in length.  PPC64 uses table of contents (TOC) fields
  to make their calls to functions. A function name is really a pointer
  into the TOC table that stores the actual address of the function
  along with the TOC of that function. The r2 register plays as the
  TOC pointer. The actual name of the function is the function name
  with a dot '.' prefix.  The reference name schedule is really
  to the TOC entry, which calls the actual code with the reference
  name .schedule. This also explains why the list of available filter
  functions on PPC64 all have a dot prefix.
 
 A little more detail: the TOC mainly stores addresses and other
 constants.  Functions have a descriptor that is stored in a .opd
 section (not the TOC, though the TOC may contain pointers to procedure
 descriptors).  Each descriptor has the address of the code, the
 address of the TOC for the function, and a static chain pointer (not
 used for C, but can used for other languages).  As you note, the
 symbol for a function will be the address of the descriptor rather
 than the address of the function code.
 
  When a funtion is called, it uses the 'bl' command which is a 24
  bit function jump (saving the return address in the link register).
  The next operation after all 'bl' calls is a nop. What the module
  load code does when one of these 'bl' calls is farther than 24 bits
  can handle, it creates a entry in the TOC and has the 'bl' call to
 
 The module loader allocates some memory for these trampolines, but
 that's a distinct area from the TOC and the OPD section.

Ah, yes, my mistake. It is a trampoline entry, not part of the TOC.

 
  that entry. The entry in the TOC will save the r2 register on the
  stack 40(r1) load the actually function into the ctrl register
 
 counter register, actually, not ctrl.

Oops, I still make that mistake :-/  I use to do a lot of PPC work several 
years ago, and I would always call that the control register, and my 
colleagues would always correct me and say its the counter register. I 
guess some things just don't change ;-)


 
  The work for PPC32 is very much the same as the PPC64 code but the 32
  version does not need to deal with TOCS. This makes the code much
  simpler. Pretty much everything as PPC64 is done, except it does not
  need to index a TOC.
 
 Right.
 
  I've tested the following patches on both PPC64 and PPC32. I will
  admit that the PPC64 does not seem that stable, but neither does the
  code when all this is not enabled ;-)  I'll debug it more to see if
  I can find the cause of my crashes, which may or may not be related
  to the dynamic ftrace code. But the use of TOCS in PPC64 make me
  a bit nervious that I did not do this correctly. Any help in reviewing
  my code for mistakes would be greatly appreciated.
 
 Hmmm.  What sort of crashes are you seeing?

This code is in tip, which is mainly used to develop for x86. I've hit a 
few crashes, and I think I hit a couple without this code. But here's an 
example:

huh, entered softirq 4 c0846ad8 preempt_count 1103, exited 
with fffefffe?
[ cut here ]
Badness at kernel/sched_fair.c:875
NIP: c004bfb8 LR: c004bf7c CTR: c00b5830
REGS: c0003929cce0 TRAP: 0700   Not tainted  (2.6.28-rc4-tip)
MSR: 90021032 ME,IR,DR  CR: 28822842  XER: 2000
TASK = c0003d93cd10[2061] 'remove-trailing' THREAD: c0003929c000 
CPU: 1
GPR00: 0001 c0003929cf60 c0887070 c0ae2d00 
GPR04: c004c2c0 3320 c0003929cb70 080d 
GPR08: c079333c 0002 c0903380 c0903380 
GPR12: 48822848 c0903580 c0794000  
GPR16: c0903380 0001 

Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-17 Thread Steven Rostedt

On Mon, 17 Nov 2008, Steven Rostedt wrote:
 On Mon, 17 Nov 2008, Paul Mackerras wrote:
  
   I've tested the following patches on both PPC64 and PPC32. I will
   admit that the PPC64 does not seem that stable, but neither does the
   code when all this is not enabled ;-)  I'll debug it more to see if
   I can find the cause of my crashes, which may or may not be related
   to the dynamic ftrace code. But the use of TOCS in PPC64 make me
   a bit nervious that I did not do this correctly. Any help in reviewing
   my code for mistakes would be greatly appreciated.
  
  Hmmm.  What sort of crashes are you seeing?
 
 This code is in tip, which is mainly used to develop for x86. I've hit a 
 few crashes, and I think I hit a couple without this code. But here's an 
 example:
 
 huh, entered softirq 4 c0846ad8 preempt_count 1103, exited 
 with fffefffe?
 [ cut here ]
 Badness at kernel/sched_fair.c:875
 NIP: c004bfb8 LR: c004bf7c CTR: c00b5830
 REGS: c0003929cce0 TRAP: 0700   Not tainted  (2.6.28-rc4-tip)
 MSR: 90021032 ME,IR,DR  CR: 28822842  XER: 2000
 TASK = c0003d93cd10[2061] 'remove-trailing' THREAD: c0003929c000 
 CPU: 1
 GPR00: 0001 c0003929cf60 c0887070 c0ae2d00 
 GPR04: c004c2c0 3320 c0003929cb70 080d 
 GPR08: c079333c 0002 c0903380 c0903380 
 GPR12: 48822848 c0903580 c0794000  
 GPR16: c0903380 0001 c0909f7c 7fff 
 GPR20: c0003929d8e0 c0ae2f20 0086b6e84cc0 0001 
 GPR24: 0001 c0794000 c0003d93cd10 c0003d934f20 
 GPR28: c0ae4000 c0003d93cd48 c0803550 c0003929cf60 
 cpu 0x1: Vector: 400 (Instruction Access) at [c0003929be1f]
 pc: 01c00ae8
 lr: 01c00aeb
 sp: c0003929c09f
msr: 900040001032
   current = 0xc0003d93cd10
   paca= 0xc0903580
 pid   = 2061, comm = remove-trailing
 
 
 Then it went into the monitor that is loaded. When I fix the rest of my 
 patches, I'll see if it is not my code that is crashing this, and then 
 I'll see if I can figure out what is causing some of these crashes.
 


Paul,

I'm thinking that I'm hitting stack overflows. I just got this crash:

Badness at net/core/skbuff.c:393
NIP: c04c805c LR: c04c800c CTR: c00b57d0
[ cut here ]
kernel BUG at kernel/sched.c:1155!
cpu 0x1: Vector: 700 (Program Check) at [c0002ad18500]
pc: c0049884: .resched_task+0x54/0xe0
lr: c0049858: .resched_task+0x28/0xe0
sp: c0002ad18780
   msr: 90021032
  current = 0xc0002a516c30
  paca= 0xc0903580
pid   = 17578, comm = fixdep
kernel BUG at kernel/sched.c:1155!
enter ? for help
[c0002ad18810] c0054494 .task_tick_fair+0xc4/0x120
[c0002ad188a0] c0056188 .scheduler_tick+0x108/0x2d0
[c0002ad18940] c006b1d4 .update_process_times+0x74/0xb0
[c0002ad189e0] c008bb4c .tick_sched_timer+0x8c/0x120
[c0002ad18a90] c0080f88 .__run_hrtimer+0xd8/0x130
[c0002ad18b30] c0081fac .hrtimer_interrupt+0x16c/0x220
[c0002ad18c20] c0023a0c .timer_interrupt+0xcc/0x110
[c0002ad18cc0] c00034e0 decrementer_common+0xe0/0x100
--- Exception: 901 (Decrementer) at c000b978 
.raw_local_irq_restore+0x58/0x60
[link register   ] c005b8d8 .vprintk+0x318/0x4b0
[c0002ad18fb0] c005b7e0 .vprintk+0x220/0x4b0 (unreliable)
[c0002ad19110] c005bac4 .printk+0x54/0x70
[c0002ad191b0] c00119d0 .show_regs+0x50/0x380
[c0002ad19260] c0244150 .report_bug+0xb0/0x130
[c0002ad19300] c00253c0 .program_check_exception+0x1e0/0x610
[c0002ad19390] c00043e0 program_check_common+0xe0/0x100
--- Exception: 700 (Program Check) at c04c805c 
.skb_release_head_state+0x7c/0xe0
[c0002ad19710] c04c8f0c .skb_release_all+0x2c/0x50
[c0002ad197a0] c04c80f4 .__kfree_skb+0x34/0x120
[c0002ad19830] c04c8224 .kfree_skb+0x44/0x90
[c0002ad198c0] c04d31f4 .dev_hard_start_xmit+0x224/0x390
[c0002ad19980] c04ed1d0 .__qdisc_run+0x240/0x340
[c0002ad19a50] c04d7778 .dev_queue_xmit+0x328/0x630
[c0002ad19b00] c0501940 .ip_finish_output+0x160/0x3d0
[c0002ad19bc0] c05022c4 .ip_output+0xc4/0xf0
[c0002ad19c60] c0501c88 .ip_local_out+0x58/0x80
[c0002ad19cf0] c0502824 .ip_queue_xmit+0x254/0x480
[c0002ad19e00] c051ace8 .tcp_transmit_skb+0x498/0x970
[c0002ad19ef0] c051cf98 .__tcp_push_pending_frames+0x248/0x9a0
[c0002ad19fe0] c0518ec0 .tcp_rcv_established+0x180/0x710
[c0002ad1a0a0] c05219cc .tcp_v4_do_rcv+0x11c/0x2b0
[c0002ad1a160] 

[PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-16 Thread Steven Rostedt

The following patches are for my work on porting the new dynamic ftrace
framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
that the calls to mcount are 24 bit jumps. Since the modules are
loaded in vmalloc address space, the call to mcount is farther than
what a 24 bit jump can make. The way PPC solves this is with the use
of trampolines. The trampoline is a memory space allocated within the
24 bit region of the module. The code in the trampoline that the
jump is made to does a far jump to the core kernel code.

The way PPC64 implements this is slightly different than the way
PPC32 does. Since my PPC64 box has a serial port it makes developing
and debugging easier, so my first patches port to PPC64, and then
the later patches include the work to get PPC32 working.

I'm describing what both PPC archs do in a bit of detail so that the
PPC exports CC'd can tell me if I'm incorrect. I did not read any
PPC specs to find out what was happening, I only reviewed the existing
PPC code that was in Linux.

The PPC64 dynamic ftrace:

PPC64, although works with 64 bit registers, the op codes are still
32 bit in length.  PPC64 uses table of contents (TOC) fields
to make their calls to functions. A function name is really a pointer
into the TOC table that stores the actual address of the function
along with the TOC of that function. The r2 register plays as the
TOC pointer. The actual name of the function is the function name
with a dot '.' prefix.  The reference name schedule is really
to the TOC entry, which calls the actual code with the reference
name .schedule. This also explains why the list of available filter
functions on PPC64 all have a dot prefix.

When a funtion is called, it uses the 'bl' command which is a 24
bit function jump (saving the return address in the link register).
The next operation after all 'bl' calls is a nop. What the module
load code does when one of these 'bl' calls is farther than 24 bits
can handle, it creates a entry in the TOC and has the 'bl' call to
that entry. The entry in the TOC will save the r2 register on the
stack 40(r1) load the actually function into the ctrl register
and make the far jump using that register (I'm using the term
'far' to mean more than 24 bits, nothing to do with the x86 far jumps
that deal with segments).  The module linker also modifies the
nop after the 'bl' call in the module into an op code that will restore
the r2 register 'ld r2,40(r1)'.

Now for what we need to do with dynamic ftrace on PPC64:

Dynamic ftrace needs to convert these calls to mcount into nops. It
also needs to be able to convert them back into a call to the
ftrace_caller (mcount is just a stub, the actual function recording
is done by a different function called ftrace_caller).

Before the dynamic ftrace modifies any code, it first makes sure
what it is changing is indeed what it expects it to be. This means
the dynamic ftrace code for PPC64 must be fully aware of the module
trampolines. When a mcount call is farther than 24 bits, it
takes a look at where that mcount call is at. The call should be into
an entry in the TOC, and the dynamic ftrace code reads the entry
that the call points to. It makes sure that the entry will make a
call to mcount (otherwise it returns failure).

After verifying that the 'bl' call calls into the TOC that calls
mcount, it converts that 'bl' call into a nop. It also converts
the following op code (the load of r2) into a nop since the TOC
is no longer saved. It does test that the following op is a load
of r2 before making any of the above changes. It also stores the
module structure pointer into the dyn_ftrace record field, for later
use.

On enabling the call back to ftrace_caller, the dynamic ftrace code
first verifys that the two op codes are two nops. It then reads
the dyn_ftrace structure module pointer to find the TOC and the entry
for the ftrace_caller (the ftrace_caller is added to the module
TOC on module load).  It then changes the call to the ftrace_caller
and the op that reloads the r2 register.

Note, to disable the ftrace_caller, the same as the disabling of
mcount is done, but instead it verifies that the TOC entry calls
the ftrace_caller and not mcount.


The PPC32 dynamic ftrace:

The work for PPC32 is very much the same as the PPC64 code but the 32
version does not need to deal with TOCS. This makes the code much
simpler. Pretty much everything as PPC64 is done, except it does not
need to index a TOC.

To disable mcount (or ftrace_caller):

If the call is greater than 24 bits, it looks to see where the 'bl'
jumps to. It verifies that the trampoline that it jumps to makes the
call to 'mcount' (or ftrace_caller if that is what is expected).
It then simply converts the 'bl' to a nop.

To enable ftrace_caller:

The 'bl' is converted to jump to the ftrace_caller trampoline entry
that was created on module load.

I've tested the following patches on both PPC64 and PPC32. I will
admit that the PPC64 does not seem that stable, 

Re: [PATCH 0/7] Porting dynmaic ftrace to PowerPC

2008-11-16 Thread Paul Mackerras
Steven Rostedt writes:

 The following patches are for my work on porting the new dynamic ftrace
 framework to PowerPC.  The issue I had with both PPC64 and PPC32 is
 that the calls to mcount are 24 bit jumps. Since the modules are
 loaded in vmalloc address space, the call to mcount is farther than
 what a 24 bit jump can make. The way PPC solves this is with the use
 of trampolines. The trampoline is a memory space allocated within the
 24 bit region of the module. The code in the trampoline that the
 jump is made to does a far jump to the core kernel code.

Thanks for doing this work.  I'll go through the patches in detail
today, but first I'd like to clear up a couple of things for you.  The
first is that unconditional branches on PowerPC effectively have a
26-bit sign-extended offset, not 24-bit.  The offset field in the
instruction is 24 bits long, but because all instructions are 4 bytes
long, two extra 0 bits get appended to the offset field, giving a
26-bit offset and a range of +/- 32MB from the branch instruction.

 PPC64, although works with 64 bit registers, the op codes are still
 32 bit in length.  PPC64 uses table of contents (TOC) fields
 to make their calls to functions. A function name is really a pointer
 into the TOC table that stores the actual address of the function
 along with the TOC of that function. The r2 register plays as the
 TOC pointer. The actual name of the function is the function name
 with a dot '.' prefix.  The reference name schedule is really
 to the TOC entry, which calls the actual code with the reference
 name .schedule. This also explains why the list of available filter
 functions on PPC64 all have a dot prefix.

A little more detail: the TOC mainly stores addresses and other
constants.  Functions have a descriptor that is stored in a .opd
section (not the TOC, though the TOC may contain pointers to procedure
descriptors).  Each descriptor has the address of the code, the
address of the TOC for the function, and a static chain pointer (not
used for C, but can used for other languages).  As you note, the
symbol for a function will be the address of the descriptor rather
than the address of the function code.

 When a funtion is called, it uses the 'bl' command which is a 24
 bit function jump (saving the return address in the link register).
 The next operation after all 'bl' calls is a nop. What the module
 load code does when one of these 'bl' calls is farther than 24 bits
 can handle, it creates a entry in the TOC and has the 'bl' call to

The module loader allocates some memory for these trampolines, but
that's a distinct area from the TOC and the OPD section.

 that entry. The entry in the TOC will save the r2 register on the
 stack 40(r1) load the actually function into the ctrl register

counter register, actually, not ctrl.

 The work for PPC32 is very much the same as the PPC64 code but the 32
 version does not need to deal with TOCS. This makes the code much
 simpler. Pretty much everything as PPC64 is done, except it does not
 need to index a TOC.

Right.

 I've tested the following patches on both PPC64 and PPC32. I will
 admit that the PPC64 does not seem that stable, but neither does the
 code when all this is not enabled ;-)  I'll debug it more to see if
 I can find the cause of my crashes, which may or may not be related
 to the dynamic ftrace code. But the use of TOCS in PPC64 make me
 a bit nervious that I did not do this correctly. Any help in reviewing
 my code for mistakes would be greatly appreciated.

Hmmm.  What sort of crashes are you seeing?

Regards,
Paul.
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev