Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-09 Thread Naveen N. Rao

Michael Ellerman wrote:

"Naveen N. Rao"  writes:
My earlier assumption was that we have other scenarios when we are in 
realmode (specifically with MSR_RI unset) where we won't be able to 
recover from a trap, during function tracing (*). I did a set of 
experiments yesterday to verify that, but I was not able to uncover 
any such scenarios with my brief testing. So, we seem to be 
functioning just fine while tracing realmode C code, except for KVM.


Hmm. If MSR_RI is clear then that should indicate that you can't recover
from an interrupt, typically because you'd lose state in SRR0/1. So I
would expect things to go badly in that case.


Yes, so it looks like we aren't calling into any C code (that isn't 
already annotated with 'notrace') with MSR_RI unset. At least, with the 
usual interrupt handling on powernv. I tested this by putting a 'trap' 
in the function tracer callback after the recursion test. This forces a 
trap for each function that we trace non-recursively.


There may be other paths where we do so, but it isn't as pervasive as I 
previously thought. So, we should be able to exclude those paths using 
the paca field, as and when we find them.


- Naveen




Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-09 Thread Michael Ellerman
"Naveen N. Rao"  writes:

> Michael Ellerman wrote:
>> "Naveen N. Rao"  writes:
>> 
>>> We can't take a trap in most parts of real mode code. Instead of adding
>>> the 'notrace' annotation to all C functions that can be invoked from
>>> real mode, detect that we are in real mode on ftrace entry and return
>>> back.
>>>
>>> Signed-off-by: Naveen N. Rao 
>>> ---
>>> This RFC only handles -mprofile-kernel to demonstrate the approach being 
>>> considered. We will need to handle other ftrace entry if we decide to 
>>> continue down this path.
>> 
>> Paul and I were talking about having a paca flag for this, ie.
>> paca->safe_to_ftrace (or whatever). I'm not sure if you've talked to
>> him and decided this is a better approach.
>> 
>> I guess I'm 50/50 on which is better, they both have pluses and minuses.
>
> Thanks, I hadn't spoken to Paul, but I now think that this is probably 
> the better approach to take.

OK.

> My earlier assumption was that we have other scenarios when we are in 
> realmode (specifically with MSR_RI unset) where we won't be able to 
> recover from a trap, during function tracing (*). I did a set of 
> experiments yesterday to verify that, but I was not able to uncover any 
> such scenarios with my brief testing. So, we seem to be functioning just 
> fine while tracing realmode C code, except for KVM.

Hmm. If MSR_RI is clear then that should indicate that you can't recover
from an interrupt, typically because you'd lose state in SRR0/1. So I
would expect things to go badly in that case.

But that's sort of orthogonal to real mode. Real mode is different and
we do need to be careful, but a blanket ban on tracing in real mode
might be too broad.

The problem with KVM is that you're running in real mode (MMU off), in
the host kernel, but with the MMU context of the guest loaded. So if you
do anything that turns the MMU on, like a WARN_ON(), then all of a
sudden you're running guest kernel code.

> As such, rather than blacklisting all realmode code, I think it is 
> better to be selective and just disable the tracer for KVM since we know 
> we can't take a trap there. We will be able to use the same approach if 
> we uncover additional scenarios where we can't use function tracing. I 
> will look at implementing a paca field for this purpose.

Thanks. I think it could work well.

We could also use it during early boot, ie. the flag starts out false,
and in the kexec/kdump path as well.

cheers


Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-09 Thread Naveen N. Rao

Steven Rostedt wrote:

On Thu, 08 Mar 2018 00:07:07 +0530
"Naveen N. Rao"  wrote:


Yes, that's negligible.
Though, to be honest, I will have to introduce a 'mfmsr' for the older 
-pg variant. I still think that the improved reliability far outweighs 
the minor slowdown there.


In that case, can you introduce a read_mostly variable that can be
tested before calling the mfmsr. Why punish normal ftrace tracing if
kvm is not enabled or running?

Both should probably have an #ifdef CONFIG_KVM encapsulating the code.


Agreed. My previous intent was to exclude all realmode code from ftrace, 
but I now think it is better to only exclude the KVM path. So, I will 
make it specific to KVM.


- Naveen




Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-09 Thread Naveen N. Rao

Michael Ellerman wrote:

"Naveen N. Rao"  writes:


We can't take a trap in most parts of real mode code. Instead of adding
the 'notrace' annotation to all C functions that can be invoked from
real mode, detect that we are in real mode on ftrace entry and return
back.

Signed-off-by: Naveen N. Rao 
---
This RFC only handles -mprofile-kernel to demonstrate the approach being 
considered. We will need to handle other ftrace entry if we decide to 
continue down this path.


Paul and I were talking about having a paca flag for this, ie.
paca->safe_to_ftrace (or whatever). I'm not sure if you've talked to
him and decided this is a better approach.

I guess I'm 50/50 on which is better, they both have pluses and minuses.


Thanks, I hadn't spoken to Paul, but I now think that this is probably 
the better approach to take.


My earlier assumption was that we have other scenarios when we are in 
realmode (specifically with MSR_RI unset) where we won't be able to 
recover from a trap, during function tracing (*). I did a set of 
experiments yesterday to verify that, but I was not able to uncover any 
such scenarios with my brief testing. So, we seem to be functioning just 
fine while tracing realmode C code, except for KVM.


As such, rather than blacklisting all realmode code, I think it is 
better to be selective and just disable the tracer for KVM since we know 
we can't take a trap there. We will be able to use the same approach if 
we uncover additional scenarios where we can't use function tracing. I 
will look at implementing a paca field for this purpose.


I also noticed that even with an unexpected timebase, we still seem to 
recover just fine with a simple change:


--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2629,8 +2629,8 @@ static noinline void
rb_handle_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
   struct rb_event_info *info)
{
-   WARN_ONCE(info->delta > (1ULL << 59),
- KERN_WARNING "Delta way too big! %llu ts=%llu write stamp = 
%llu\n%s",
+   if (info->delta > (1ULL << 59))
+   pr_warn_once("Delta way too big! %llu ts=%llu write stamp = 
%llu\n%s",
 (unsigned long long)info->delta,
 (unsigned long long)info->ts,
 (unsigned long long)cpu_buffer->write_stamp,


This allowed the virtual machine to boot and we were able to trace the 
rest of KVM C code. I only just did a boot test, so I'm not sure if 
there are other scenarios where things can go wrong.


Steve,
Would you be willing to accept a patch like the above? Since we seem to 
handle the larger delta just fine, I think the above change should be 
fine?


I will still work on excluding KVM C code from being traced, but the 
advantage with the above patch is that we will be able to trace KVM C 
code with a small change if necessary.



- Naveen

---
(*) putting on my kprobe hat




Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-07 Thread Michael Ellerman
"Naveen N. Rao"  writes:

> We can't take a trap in most parts of real mode code. Instead of adding
> the 'notrace' annotation to all C functions that can be invoked from
> real mode, detect that we are in real mode on ftrace entry and return
> back.
>
> Signed-off-by: Naveen N. Rao 
> ---
> This RFC only handles -mprofile-kernel to demonstrate the approach being 
> considered. We will need to handle other ftrace entry if we decide to 
> continue down this path.

Paul and I were talking about having a paca flag for this, ie.
paca->safe_to_ftrace (or whatever). I'm not sure if you've talked to
him and decided this is a better approach.

I guess I'm 50/50 on which is better, they both have pluses and minuses.

cheers

> diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S 
> b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> index 3f3e81852422..ecc0e8e38ead 100644
> --- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> +++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> @@ -56,6 +56,21 @@ _GLOBAL(ftrace_caller)
>  
>   /* Load special regs for save below */
>   mfmsr   r8
> +
> + /* Only proceed if we are not in real mode and can take interrupts */
> + andi.   r9, r8, MSR_IR|MSR_DR|MSR_RI
> + cmpdi   r9, MSR_IR|MSR_DR|MSR_RI
> + beq 1f
> + mflrr8
> + mtctr   r8
> + REST_GPR(9, r1)
> + REST_GPR(8, r1)
> + addir1, r1, SWITCH_FRAME_SIZE
> + ld  r0, LRSAVE(r1)
> + mtlrr0
> + bctr
> +
> +1:
>   mfctr   r9
>   mfxer   r10
>   mfcrr11
> -- 
> 2.16.1


Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-07 Thread Steven Rostedt
On Thu, 08 Mar 2018 00:07:07 +0530
"Naveen N. Rao"  wrote:

> Yes, that's negligible.
> Though, to be honest, I will have to introduce a 'mfmsr' for the older 
> -pg variant. I still think that the improved reliability far outweighs 
> the minor slowdown there.

In that case, can you introduce a read_mostly variable that can be
tested before calling the mfmsr. Why punish normal ftrace tracing if
kvm is not enabled or running?

Both should probably have an #ifdef CONFIG_KVM encapsulating the code.

-- Steve



Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-07 Thread Naveen N. Rao

Hi Steve,

Steven Rostedt wrote:

On Wed,  7 Mar 2018 22:16:19 +0530
"Naveen N. Rao"  wrote:


We can't take a trap in most parts of real mode code. Instead of adding
the 'notrace' annotation to all C functions that can be invoked from
real mode, detect that we are in real mode on ftrace entry and return
back.

Signed-off-by: Naveen N. Rao 
---
This RFC only handles -mprofile-kernel to demonstrate the approach being 
considered. We will need to handle other ftrace entry if we decide to 
continue down this path.


I do prefer this trade off.


Great, thanks!





diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S 
b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index 3f3e81852422..ecc0e8e38ead 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -56,6 +56,21 @@ _GLOBAL(ftrace_caller)
 
 	/* Load special regs for save below */

mfmsr   r8
+
+   /* Only proceed if we are not in real mode and can take interrupts */
+   andi.   r9, r8, MSR_IR|MSR_DR|MSR_RI
+   cmpdi   r9, MSR_IR|MSR_DR|MSR_RI
+   beq 1f


OK, I assume this check and branch is negligible compared to the mfmsr
call?


Yes, that's negligible.
Though, to be honest, I will have to introduce a 'mfmsr' for the older 
-pg variant. I still think that the improved reliability far outweighs 
the minor slowdown there.


- Naveen




Re: [RFC PATCH 1/1] powerpc/ftrace: Exclude real mode code from

2018-03-07 Thread Steven Rostedt
On Wed,  7 Mar 2018 22:16:19 +0530
"Naveen N. Rao"  wrote:

> We can't take a trap in most parts of real mode code. Instead of adding
> the 'notrace' annotation to all C functions that can be invoked from
> real mode, detect that we are in real mode on ftrace entry and return
> back.
> 
> Signed-off-by: Naveen N. Rao 
> ---
> This RFC only handles -mprofile-kernel to demonstrate the approach being 
> considered. We will need to handle other ftrace entry if we decide to 
> continue down this path.

I do prefer this trade off.


> diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S 
> b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> index 3f3e81852422..ecc0e8e38ead 100644
> --- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> +++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
> @@ -56,6 +56,21 @@ _GLOBAL(ftrace_caller)
>  
>   /* Load special regs for save below */
>   mfmsr   r8
> +
> + /* Only proceed if we are not in real mode and can take interrupts */
> + andi.   r9, r8, MSR_IR|MSR_DR|MSR_RI
> + cmpdi   r9, MSR_IR|MSR_DR|MSR_RI
> + beq 1f

OK, I assume this check and branch is negligible compared to the mfmsr
call?

-- Steve


> + mflrr8
> + mtctr   r8
> + REST_GPR(9, r1)
> + REST_GPR(8, r1)
> + addir1, r1, SWITCH_FRAME_SIZE
> + ld  r0, LRSAVE(r1)
> + mtlrr0
> + bctr
> +
> +1:
>   mfctr   r9
>   mfxer   r10
>   mfcrr11